Tags: datacenter, l2, deep-dive, raid, storage

Portal | Level: L2: Operations | Topics: RAID, Storage (SAN/NAS/DAS) | Domain: Datacenter & Hardware
RAID and Storage Internals¶
Scope¶
This document explains RAID and adjacent Linux storage concepts for operational understanding:
- RAID levels and tradeoffs
- md RAID basics
- parity concepts
- rebuild behavior
- degraded arrays
- write hole intuition
- cache/log concepts
- dm-raid relationship
- practical troubleshooting questions
Reference anchors:

- https://docs.kernel.org/admin-guide/md.html
- https://docs.kernel.org/driver-api/md/raid5-cache.html
- https://docs.kernel.org/driver-api/md/raid5-ppl.html
- https://docs.kernel.org/admin-guide/device-mapper/dm-raid.html
Big Picture¶
RAID is about combining multiple block devices to improve one or more of:
- redundancy
- capacity aggregation
- read performance
- write performance
- failure tolerance
It is not magic. It is just tradeoffs plus math plus recovery pain.
The First Truth About RAID¶
RAID is not backup.
Redundancy protects against some device failures. It does not protect against:

- deletion
- corruption replicated across members
- ransomware
- operator error
- fire/theft/site loss
- many software-layer mistakes
Anyone who answers "we have RAID so we are backed up" should be launched into the sun.
RAID Levels - Mental Models¶
RAID 0¶
Striping only.

- performance/capacity
- no redundancy
- one disk dies, array dies
RAID 1¶
Mirroring.

- redundancy
- simple mental model
- read benefits possible
- writes roughly duplicated
RAID 5¶
Striping + distributed parity.

- one-disk fault tolerance
- parity write overhead
- degraded/rebuild stress matters
RAID 6¶
Like RAID 5 but with dual parity.

- two-disk fault tolerance
- more parity overhead
- safer for large arrays than RAID 5 in many cases
RAID 10¶
Striped mirrors.

- performance + redundancy
- good operational reputation
- capacity cost higher than parity arrays
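The level tradeoffs above reduce to simple arithmetic. A sketch under assumed example numbers (6 disks of 4 TB each; your disk count and sizes will differ):

```sh
# Usable capacity and fault tolerance for n equal disks of size c (TB).
# Hypothetical example: n=6 disks, c=4 TB each.
n=6; c=4
echo "RAID 0:  $(( n * c )) TB, survives 0 failures"
echo "RAID 1:  $(( c )) TB (n-way mirror), survives $(( n - 1 )) failures"
echo "RAID 5:  $(( (n - 1) * c )) TB, survives 1 failure"
echo "RAID 6:  $(( (n - 2) * c )) TB, survives 2 failures"
echo "RAID 10: $(( n * c / 2 )) TB, survives 1 failure per mirror pair"
```

With these numbers: 24, 4, 20, 16, and 12 TB respectively, which is the "capacity cost higher than parity arrays" point for RAID 10 in one glance.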
Why Parity RAID Is Trickier¶
Parity RAID must keep data and parity consistent.
For writes, the array often needs to:

- read old data/parity as needed
- compute new parity
- write updated data/parity
This is where write penalties and consistency hazards come from.
Mirrors are conceptually simpler. Parity arrays are math plus timing plus pain.
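The math in question is just XOR. A toy sketch, with small integers standing in for disk blocks (real arrays do this per stripe chunk):

```sh
# Toy RAID 5 stripe: three data "blocks" plus one parity block,
# where parity = XOR of all data blocks.
d0=170; d1=204; d2=90
p=$(( d0 ^ d1 ^ d2 ))

# Small-write path (read-modify-write): update d1 without touching
# d0/d2 by XOR-ing the old data out of the parity and the new data in.
new_d1=99
p=$(( p ^ d1 ^ new_d1 ))   # needs old parity + old data: the write penalty
d1=$new_d1                 # write the updated data block

# Reconstruction after losing d1: XOR the survivors with parity.
echo $(( d0 ^ d2 ^ p ))    # prints 99, the lost block's contents
```

The read-modify-write line is why a single logical write can cost multiple device I/Os on parity RAID, and why the data/parity pair must land consistently.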
Rebuilds and Degraded Mode¶
When an array is degraded:

- redundancy margin is reduced
- performance often worsens
- remaining disks are stressed harder
- another failure can escalate into disaster depending on RAID level
Rebuilds are not free. They generate load, time, risk, and anxiety.
This is why "RAID 5 is fine, one disk failed, no big deal" can age badly on large slow arrays.
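A back-of-envelope rebuild-time estimate makes the point concrete. The figures below are assumptions for illustration (a 12 TB member, 100 MB/s sustained rebuild rate); real rates are often lower under production load:

```sh
# Rough rebuild floor: member capacity / sustained rebuild rate.
# Hypothetical figures: 12 TB member, 100 MB/s effective rate.
capacity_bytes=$(( 12 * 1000 * 1000 * 1000 * 1000 ))
rate_bytes_per_s=$(( 100 * 1000 * 1000 ))
seconds=$(( capacity_bytes / rate_bytes_per_s ))
echo "$(( seconds / 3600 )) hours minimum"   # prints: 33 hours minimum
```

That is more than a day of elevated load and reduced redundancy on a healthy system, before production traffic slows the rebuild further.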
The RAID5 Write Hole¶
The write hole is a classic parity-array problem: if a power loss or crash occurs after some but not all of a stripe's data/parity updates have landed, the parity can be left inconsistent with the data in that stripe.
That can lead to silent corruption scenarios during degraded operation or rebuild.
This is not theoretical nerd trivia. It is one of the reasons parity-write integrity mechanisms matter.
The kernel docs' discussion of partial parity log (PPL) is worth knowing at least conceptually.
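The hazard can be shown with the same toy XOR stripe as above (integers standing in for blocks; the scenario is illustrative, not an md trace):

```sh
# Write-hole sketch: new data lands, new parity does not.
d0=170; d1=204; p=$(( d0 ^ d1 ))   # consistent two-data stripe

d0=7                                # new data for d0 hits disk...
# ...crash here, before the matching parity update is written.
# p still reflects the OLD d0, and nothing records that fact.

# Later the d1 disk dies, and we "reconstruct" d1 from d0 and stale p:
echo $(( d0 ^ p ))                  # prints 97 -- not the 204 actually stored
```

No error is reported anywhere in that sequence, which is exactly why it qualifies as silent corruption and why journals/PPL exist.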
Journals / Logs / Caches for RAID Writes¶
Linux MD provides mechanisms such as the RAID5 journal/cache and the partial parity log (PPL) that address the write hole and adjust resilience/performance tradeoffs.
Conceptual point: storage stacks add logging/journaling/caching layers because crash consistency and parity consistency are hard.
Whenever you see:

- battery-backed cache
- journal device
- write-intent bitmap
- parity log
the system is trying to avoid expensive ambiguity after failure.
md vs dm-raid¶
md¶
The Linux software RAID subsystem, which most admins know through the mdadm tool.
dm-raid¶
A device-mapper target that exposes the MD RAID drivers through the device-mapper interface (used by LVM raid volumes, for example).
Operationally you should know both exist.
In many day-to-day Linux admin situations, mdadm/MD is the thing people mean.
Performance Intuition¶
Reads¶
Can benefit from striping or multiple mirrors depending on controller/software behavior.
Writes¶
Depend heavily on RAID level. Parity arrays pay extra work. Mirrors duplicate writes. Cache and stripe size matter.
Small random writes¶
Can be nasty on parity RAID: each small write can trigger a read-modify-write of both data and parity.
Sequential workloads¶
May look much better.
This is why workload shape matters more than brochure numbers.
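One way to see the workload-shape point is the classic small-write penalty estimate. The numbers below are assumptions (4 disks, 150 random IOPS each) and the model ignores caching, which in practice softens the penalty:

```sh
# RAID 5 small random writes cost ~4 device I/Os each
# (read old data, read old parity, write data, write parity).
# Hypothetical array: 4 disks at 150 random IOPS apiece.
disks=4; iops_per_disk=150
echo "read IOPS:  $(( disks * iops_per_disk ))"       # all members serve reads
echo "write IOPS: $(( disks * iops_per_disk / 4 ))"   # divided by the write penalty
```

With these figures the array reads at roughly 600 IOPS but sustains only about 150 small random write IOPS, while the same spindles in a sequential workload would look far better.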
Monitoring / Operational Questions¶
You should ask:

- is the array clean or degraded?
- any member failed, missing, rebuilding, resyncing?
- bitmap/log/journal features enabled?
- underlying disks showing errors?
- controller cache policy sane?
- filesystem above the RAID healthy?
- backup recent and restorable?
Useful Commands¶
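A starting point, assuming MD software RAID managed with mdadm (device names like `/dev/md0` and `/dev/sda1` are placeholders for your own):

```sh
# Array state at a glance: levels, members, [UU_] health, resync progress.
cat /proc/mdstat

# Detailed view of one array: state, failed/spare members, bitmap.
mdadm --detail /dev/md0

# Per-member superblock metadata (run against the member, not the array).
mdadm --examine /dev/sda1

# Watch a rebuild/resync as it progresses.
watch -n 5 cat /proc/mdstat

# Drive health underneath the array.
smartctl -a /dev/sda
```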
Also check:

- SMART / device health
- controller logs if a hardware layer is involved
- filesystem health above the block layer
Common Failure Patterns¶
Replacing wrong disk¶
Classic own-goal.
Array degraded unnoticed for too long¶
Then second failure arrives.
Rebuild under heavy load¶
Performance and risk both get worse.
Assuming parity protects against corruption¶
Not in the simplistic way people imagine: parity is used to reconstruct missing data, not verified on every read, so corruption can pass through unnoticed.
Ignoring underlying drive errors¶
RAID is not a shield against reality.
No backup despite "redundancy"¶
Ancient mistake, still undefeated.
Interview-Level Things to Explain¶
You should be able to explain:
- why RAID is not backup
- basic tradeoffs of RAID 0/1/5/6/10
- what degraded mode means
- why rebuilds are risky periods
- what the RAID5 write hole is at a high level
- why parity writes are more complex than mirror writes
Fast Mental Model¶
RAID is a block-level availability and performance tradeoff mechanism that spreads data and sometimes parity or mirrors across devices, improving some failure/performance properties while introducing rebuild risk, consistency complexity, and level-specific write behavior.
Wiki Navigation¶
Prerequisites¶
- Datacenter & Server Hardware (Topic Pack, L1)
Related Content¶
- Storage Operations (Topic Pack, L2) — RAID, Storage (SAN/NAS/DAS)
- Case Study: Backup Job Failing — iSCSI Target Unreachable, VLAN Misconfigured (Case Study, L2) — Storage (SAN/NAS/DAS)
- Case Study: Database Replication Lag — Root Cause Is RAID Degradation (Case Study, L2) — RAID
- Case Study: HBA Firmware Mismatch (Case Study, L2) — Storage (SAN/NAS/DAS)
- Case Study: NVMe Drive Disappeared (Case Study, L2) — Storage (SAN/NAS/DAS)
- Case Study: OS Install Fails RAID Controller (Case Study, L2) — RAID
- Case Study: RAID Degraded Rebuild Latency (Case Study, L2) — RAID
- Datacenter & Server Hardware (Topic Pack, L1) — RAID
- Dell PowerEdge Servers (Topic Pack, L1) — RAID
- Disk & Storage Ops (Topic Pack, L1) — RAID
Pages that link here¶
- Datacenter Skillcheck
- Dell PowerEdge Servers
- HBA Firmware Mismatch Causing I/O Errors
- Incident Replay: RAID Degraded — Rebuild Latency
- NVMe Drive Disappeared After Reboot
- OS Installation Cannot See Disks
- Primer
- RAID Degraded Rebuild Latency
- Scenario: RAID 5 Array Degraded with Predictive Failure
- Storage Operations
- Storage Operations - Primer
- Symptoms: Backup Job Failing, iSCSI Target Unreachable, Fix Is VLAN Config
- Symptoms: Database Replication Lag, Root Cause Is RAID Degradation