# Linux Ops Storage — Trivia & Interesting Facts
Surprising, historical, and little-known facts about Linux storage management.
## LVM was inspired by HP-UX and AIX, not invented for Linux
Linux's Logical Volume Manager was written by Heinz Mauelshagen in 1998, with a command-line interface modeled closely on HP-UX's LVM. The LVM2 rewrite (2003) moved onto device-mapper, a kernel framework that also powers dm-crypt, dm-raid, and Docker's devicemapper storage driver. The broader concept of abstracting physical disks into logical volumes dates back to IBM mainframes in the 1970s.
## ext4 was meant to be a stopgap filesystem
ext4 was created in 2006 as a "bridge" filesystem while btrfs was being developed as ext's true successor. Theodore Ts'o, ext4's lead developer, explicitly said ext4 was buying time for btrfs. Nearly 20 years later, ext4 remains the default filesystem on most Linux distributions, while btrfs adoption stays limited, with Fedora and openSUSE the main distributions shipping it by default.
## XFS was created by SGI in 1993 for handling large media files
XFS was developed by Silicon Graphics for IRIX to handle the massive files generated by Hollywood visual effects studios. It was open-sourced and ported to Linux in 2001. XFS excels at parallel I/O and large files — it supports files up to 8 exabytes. RHEL 7 (2014) switched its default filesystem from ext4 to XFS, a decision that surprised many.
## The dd command name means "data definition" — from IBM JCL
dd gets its syntax (if=, of=, bs=) from IBM's Job Control Language (JCL), where DD stands for "Data Definition." This is why dd's syntax is completely unlike any other Unix command. The joke backronym "disk destroyer" reflects the reality that dd if=/dev/zero of=/dev/sda will destroy a disk without confirmation.
## RAID 5 is considered dangerous for large drives
RAID 5 protects against a single drive failure, but during the rebuild of a large drive (8 TB+), the chance of a second Unrecoverable Read Error (URE) is statistically significant. A typical desktop drive has a quoted URE rate of 1 in 10^14 bits (about 12.5 TB of reads). Rebuilding an array of 8 TB drives means reading every surviving drive end to end; reading even a single 8 TB drive is 6.4 × 10^13 bits, putting the odds of hitting a URE near a coin flip. This is why RAID 6 or RAID 10 is recommended for large drives.
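The arithmetic behind that claim fits in a few lines. This is a sketch using the quoted consumer-drive URE rate and a single 8 TB (decimal) drive read in full, not a vendor specification:

```python
import math

URE_PER_BIT = 1e-14   # quoted consumer rate: 1 URE per 10^14 bits read
DRIVE_BYTES = 8e12    # one 8 TB drive, read end to end during rebuild
bits_read = DRIVE_BYTES * 8

# P(at least one URE) = 1 - (1 - p)^n; log1p/expm1 keep the tiny
# per-bit probability numerically stable
p_ure = -math.expm1(bits_read * math.log1p(-URE_PER_BIT))
print(f"P(at least one URE during an 8 TB read) = {p_ure:.0%}")  # 47%
```

Enterprise drives rated at 1 in 10^15 bits bring the same calculation down to about 6%, which is part of why they cost more.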
## /dev/null has been a device since 1971
The null device (/dev/null) appeared in Version 1 Unix at Bell Labs. It discards all data written to it and returns EOF on read. It is character device major 1, minor 3 on Linux. The concept of a "bit bucket" that swallows data predates Unix — IBM mainframes had similar constructs — but Unix's implementation as a file in the filesystem was elegant and influential.
## Thin provisioning lets you lie about disk space
LVM thin provisioning and storage systems like ZFS allow allocating more logical storage than physically exists, banking on the assumption that most space will not be used immediately. A 100 TB thin pool might only have 10 TB of physical disks. This works until all tenants actually use their allocations simultaneously — an event called "thin pool exhaustion" that causes I/O errors.
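A toy model of that failure mode, using hypothetical numbers matching the example above (100 TB of allocations over 10 TB of physical disks):

```python
physical_tb = 10    # real capacity backing the pool (hypothetical)
allocated_tb = 100  # sum of thin volumes promised to tenants

overcommit = allocated_tb / physical_tb
print(f"overcommit ratio: {overcommit:.0f}x")

# The pool is healthy until tenants' *actual* usage crosses physical capacity
for used_pct in (5, 10, 15):
    used_tb = allocated_tb * used_pct / 100
    ok = used_tb <= physical_tb
    print(f"tenants using {used_pct}% -> {used_tb:.0f} TB: "
          + ("OK" if ok else "pool exhausted, writes fail with I/O errors"))
```

In real deployments the equivalent check is monitoring the pool's data and metadata usage (e.g. with `lvs`) and alerting well before 100%.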
## ZFS was designed to last until the heat death of the universe
ZFS's 128-bit storage addressing can theoretically address 256 quadrillion zettabytes. Jeff Bonwick, ZFS's creator at Sun Microsystems, calculated that fully populating a 128-bit storage pool would require more energy than boiling the oceans. ZFS's name originally stood for "Zettabyte File System," though Oracle later said it does not stand for anything.
## The "sync" command exists because Unix caches writes
Unix has always used write caching: data written to files sits in the kernel buffer cache before being flushed to disk. The sync command forces all cached writes to disk. In early Unix, forgetting to run sync before shutdown could corrupt the filesystem. The mantra "sync; sync; sync" before pulling the power was partly superstition and partly a crude timer: early sync() merely scheduled the flush and returned before it completed, so typing it again bought the disk time to finish. One sync, followed by a short wait, is sufficient.
## SMART data predicts drive failure — but most people never check it
SMART (Self-Monitoring, Analysis, and Reporting Technology) has been built into virtually every hard drive since the late 1990s, and into SSDs from the start. smartctl -a /dev/sda shows temperature, error counts, reallocated sectors, and power-on hours. Google's 2007 study of over 100,000 drives found that a drive was 14x more likely to fail within 60 days after its first sector reallocation.
## NVMe bypasses the entire traditional storage stack
NVMe (Non-Volatile Memory Express) was designed from scratch for SSDs, replacing the AHCI interface originally designed for spinning disks. NVMe supports 65,535 I/O queues with 65,536 commands each, compared to SATA's single queue of 32 commands. This is why NVMe SSDs can achieve 7 GB/s reads while SATA SSDs max out at 550 MB/s — the interface, not the flash, was the bottleneck.
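Plugging in the maxima quoted above makes the gap concrete. These are protocol limits, not what any particular drive or driver actually configures:

```python
# AHCI/SATA: one queue, 32 outstanding commands (NCQ limit)
sata_cmds = 1 * 32
# NVMe: up to 65,535 I/O queues, each up to 65,536 entries deep
nvme_cmds = 65_535 * 65_536

print(f"SATA max outstanding commands: {sata_cmds}")
print(f"NVMe max outstanding commands: {nvme_cmds:,}")  # ~4.3 billion
print(f"ratio: {nvme_cmds // sata_cmds:,}x")
```

In practice the Linux driver creates roughly one NVMe queue pair per CPU core, which is already enough to keep every core submitting I/O without lock contention.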