RAID vs Backup vs Snapshot

Mental model

Three forms of data protection, each defending against a different failure. RAID handles disk death. Backup handles nearly everything, provided the copy is independent of the original. Snapshot handles "I just broke something."

What it looks like

"We have RAID, so we're protected" — a dangerous conflation.

What it really is

RAID (Redundant Array of Independent Disks): real-time redundancy. Data is mirrored or striped with parity across multiple disks. If a disk fails, the array keeps operating in a degraded state. The rebuild starts once a replacement is available: automatically if a hot spare is configured, otherwise after the new disk is added to the array.

RAID-1: mirror (2 disks, capacity of 1 disk usable)
RAID-5: striped with single parity (N disks, N >= 3, N-1 usable)
RAID-6: striped with double parity (N disks, N >= 4, N-2 usable)
RAID-10: striped mirrors (4+ disks, 50% usable)
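
The capacity figures above can be turned into a quick calculation. The following is a minimal sketch, not a sizing tool: `usable_tb` is a hypothetical helper name, and it assumes equal-size disks, whole-terabyte sizes, and no metadata overhead.

```shell
#!/bin/sh
# usable_tb LEVEL DISKS SIZE_TB
# Prints the usable capacity in TB for a RAID level.
# Hypothetical helper: assumes equal-size disks and ignores metadata overhead.
usable_tb() {
    level=$1
    disks=$2
    size=$3
    case "$level" in
        1)  echo "$size" ;;                      # every disk holds the same data
        5)  echo $(( (disks - 1) * size )) ;;    # one disk's worth of parity
        6)  echo $(( (disks - 2) * size )) ;;    # two disks' worth of parity
        10) echo $(( disks / 2 * size )) ;;      # half the disks hold mirror copies
        *)  echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

usable_tb 1 2 4     # 2 x 4 TB mirror
usable_tb 5 4 4     # 4 x 4 TB, single parity
usable_tb 6 6 4     # 6 x 4 TB, double parity
usable_tb 10 4 4    # 4 x 4 TB, striped mirrors
```

Usable capacity says nothing about what the array protects against; it only shows how much raw space each level spends on redundancy.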

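Backup: a separate copy of data at a specific point in time, stored on independent media or in a separate location. A backup survives disk failure, corruption, accidental deletion, and operator error; an offsite or offline copy also survives fire and ransomware that reaches the primary system. Of the three, it is the only protection that can cover all of these failure modes.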

Snapshot: a frozen view of filesystem state at a moment. Created near-instantly using copy-on-write (COW). Shares storage with the original data — only changed blocks consume additional space. Fast to create and revert.

Why it seems confusing

All three feel like "data protection." But they protect against different failures and have fundamentally different properties.

What actually matters

RAID is NOT a backup:
- Delete a file on RAID → deleted on all mirrors instantly.
- Filesystem corruption on RAID → corrupted on all mirrors instantly.
- Ransomware encrypts on RAID → encrypted on all mirrors instantly.
- RAID only protects against physical disk failure.

Snapshot is NOT a backup:
- Snapshots usually live on the same physical device.
- If the device fails, both original and snapshots are lost.
- Snapshots consume space as the original diverges (COW overhead).
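
Snapshot space growth is worth monitoring. Classic fixed-size LVM snapshots are one concrete case: the snapshot becomes invalid once its reserved space fills up. The sketch below flags volumes above a fill threshold. `warn_if_full` is a hypothetical helper and the sample input is made up; on a host with LVM, the input could come from `lvs` as shown in the trailing comment.

```shell
#!/bin/sh
# warn_if_full THRESHOLD
# Reads "name percent" lines on stdin and prints every volume at or above
# THRESHOLD percent. Lines without a percent column are skipped.
warn_if_full() {
    awk -v limit="$1" \
        'NF >= 2 && $2 + 0 >= limit + 0 { printf "WARNING: %s is %s%% full\n", $1, $2 }'
}

# Made-up sample input standing in for real lvs output:
printf 'data\nsnap 83.10\nsnap2 12.00\n' | warn_if_full 80

# On a host with LVM (assuming a volume group named vg0):
#   lvs --noheadings -o lv_name,data_percent vg0 | warn_if_full 80
```

The right threshold depends on how fast the origin volume changes, so treat 80 as a placeholder rather than a recommendation.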

The 3-2-1 rule: keep 3 copies of the data, on 2 different media types, with 1 copy offsite.
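
The rule is simple enough to check mechanically. Here is a rough sketch that assumes a made-up inventory format, one copy per line as `name media location`, where the location is `onsite` or `offsite` and the live data counts as one of the copies. `check_321` is a hypothetical helper, not part of any backup tool.

```shell
#!/bin/sh
# check_321
# Reads "name media location" lines on stdin (hypothetical inventory format)
# and reports whether the set of copies satisfies the 3-2-1 rule.
check_321() {
    awk '
        NF >= 3 { copies++; media[$2] = 1; if ($3 == "offsite") offsite++ }
        END {
            for (m in media) types++
            if (copies >= 3 && types >= 2 && offsite >= 1) {
                print "3-2-1: ok"
                exit 0
            }
            printf "3-2-1: FAIL (copies=%d media=%d offsite=%d)\n", copies, types, offsite
            exit 1
        }'
}

# Example inventory: live data on SSD, a NAS copy, and an offsite object store.
check_321 <<'EOF'
primary ssd    onsite
nas     hdd    onsite
cloud   object offsite
EOF
```

A RAID array or a snapshot on the primary host would not add a line to this inventory, since neither is an independent copy.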

Common mistakes

Relying on RAID as the only data protection strategy.
Treating snapshots as backups (same device, same failure domain).
Not testing backup restores (untested backups are not backups).
Forgetting that RAID rebuild time is a vulnerability window (on RAID-5 or a two-disk RAID-1, a second disk failure during rebuild kills the array; RAID-6 tolerates one more failure).
Thinking snapshots are free (COW overhead grows with divergence).
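
Testing restores can be as simple as restoring to a scratch location and comparing the result with the source. The following is a minimal sketch of that idea: `verify_restore` is a hypothetical helper, and `cp -Rp` stands in for whatever backup and restore tooling is really in use.

```shell
#!/bin/sh
# verify_restore SOURCE_DIR RESTORED_DIR
# Succeeds only if both trees contain the same files with the same contents.
verify_restore() {
    diff -r "$1" "$2" >/dev/null
}

# Demo with throwaway directories; cp -Rp stands in for real backup tooling.
work=$(mktemp -d)
mkdir "$work/data"
echo "important" > "$work/data/important.db"
cp -Rp "$work/data" "$work/backup"        # "take a backup"
cp -Rp "$work/backup" "$work/restored"    # "restore" to a scratch location

if verify_restore "$work/data" "$work/restored"; then
    echo "restore verified"
else
    echo "restore MISMATCH" >&2
fi
rm -rf "$work"
```

`diff -r` compares file contents only, not ownership or permissions, and live data that changed after the backup was taken will show up as a mismatch. A real check would compare against a known-good reference taken at backup time.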

Small examples

RAID — survives disk failure, not deletion:

# RAID-1 mirror: /dev/md0 = sda1 + sdb1
mdadm --detail /dev/md0     # State: active, 2/2 disks
rm /data/important.db        # deleted from BOTH mirrors
# RAID did not help.

Backup — survives everything:

# Daily rsync to separate server
rsync -a /data/ backup-server:/backups/data/$(date +%F)/
# File deleted locally? Restore from backup.
# Disk failed? Restore from backup.
# Building burned down? Restore from offsite backup.

Snapshot — fast undo for logical errors:

# LVM snapshot before risky operation
lvcreate -s -L 5G -n snap /dev/vg0/data
# Do risky thing...
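# Oops, revert (the snapshot is consumed by the merge):
lvconvert --merge /dev/vg0/snap
# If the origin is still mounted, the merge is deferred until its next activation.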

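# ZFS snapshot before risky operation
zfs snapshot pool/data@before-migration
# Do risky thing...
# Oops, revert (needs -r if newer snapshots exist; -r destroys them):
zfs rollback pool/data@before-migration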

One-line summary

RAID survives disk failure; backup survives everything; snapshot is fast undo on the same device.