Questions: RAID Degraded Rebuild Latency
- What is the current state of the RAID array, and how far along is the rebuild?
- Which specific drive failed and which is the replacement? Are there additional drives showing early failure signs?
- What RAID level is in use and how many drives can the array tolerate losing during rebuild?
- What is the current rebuild speed, and can it be tuned (stripe_cache_size, speed_limit_min/max)?
- Is the I/O scheduler appropriate for this workload (e.g., deadline vs. cfq vs. none)?
- Can the database workload be shifted or throttled during the rebuild window?
- Are there dmesg errors indicating additional disk issues beyond the replaced drive?
- Is there a hot spare configured, and was the rebuild triggered automatically or manually?
- What is the estimated time to rebuild completion, and what is the risk if another drive fails?
- Has the rebuild speed been capped by kernel defaults that are too aggressive or too conservative?
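
Several of the questions above (rebuild progress, current speed, estimated completion, and the speed limits in effect) can be answered from `/proc/mdstat` and the files under `/proc/sys/dev/raid`. The following is a minimal sketch, assuming a Linux software RAID (md) array. The names `RebuildStatus`, `parse_rebuild_status`, `read_speed_limits`, and the `SAMPLE_MDSTAT` text are hypothetical, made up for illustration and not taken from any real incident.

```python
import os
import re
from typing import Dict, NamedTuple, Optional


class RebuildStatus(NamedTuple):
    operation: str        # e.g. "recovery" or "resync"
    percent: float        # progress as reported by the kernel
    done_blocks: int
    total_blocks: int
    finish_minutes: float  # kernel's own ETA estimate
    speed_k_per_sec: int   # current rebuild speed, K/sec


# Matches the progress line md prints while a sync operation is running.
_PROGRESS_RE = re.compile(
    r"(?P<op>resync|recovery|reshape|check|repair)\s*=\s*"
    r"(?P<pct>\d+(?:\.\d+)?)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<finish>\d+(?:\.\d+)?)min\s*"
    r"speed=(?P<speed>\d+)K/sec"
)

# Illustrative sample only; the device names and numbers are invented.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[4] sdc1[2] sdb1[1] sda1[0]
      2929890816 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [==>..................]  recovery = 12.6% (123456789/976630272) finish=127.5min speed=111234K/sec

unused devices: <none>
"""


def parse_rebuild_status(mdstat_text: str) -> Optional[RebuildStatus]:
    """Return the first in-progress sync operation, or None if none is found."""
    m = _PROGRESS_RE.search(mdstat_text)
    if m is None:
        return None
    return RebuildStatus(
        operation=m.group("op"),
        percent=float(m.group("pct")),
        done_blocks=int(m.group("done")),
        total_blocks=int(m.group("total")),
        finish_minutes=float(m.group("finish")),
        speed_k_per_sec=int(m.group("speed")),
    )


def read_speed_limits(base: str = "/proc/sys/dev/raid") -> Dict[str, int]:
    """Read speed_limit_min and speed_limit_max (K/sec) from the given directory."""
    limits = {}
    for name in ("speed_limit_min", "speed_limit_max"):
        with open(os.path.join(base, name)) as fh:
            limits[name] = int(fh.read().strip())
    return limits
```

On a live host, `parse_rebuild_status(open("/proc/mdstat").read())` would report the kernel's own progress and ETA. Comparing `speed_k_per_sec` with the values from `read_speed_limits()` gives a rough indication of whether the rebuild is pinned at the configured maximum or is being pushed toward the minimum by competing I/O. The sketch only reads state; it does not change any tunables.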