
Linux Storage Operations - Primer

Why This Matters

Every service runs on storage. When a disk fills up at 3 AM, when a database runs out of space, when a mount disappears after reboot -- you need to diagnose and fix it fast. Storage problems cause more unplanned outages than most engineers expect. Understanding the Linux storage stack from hardware to filesystem is a core ops skill.

Under the hood: Linux treats everything as a file, including block devices. The /dev/sda naming convention comes from SCSI Disk A (first disk). NVMe drives use /dev/nvme0n1 — NVMe controller 0, namespace 1. The n1 namespace is because NVMe supports multiple namespaces per controller (think of them as virtual drives). In cloud environments, you will see /dev/xvd* (Xen) or /dev/vd* (virtio).

Gotcha: The most common storage-related outage: disk full on root filesystem. When / fills up, logging stops, databases crash, and SSH may break (it needs to write temporary files). Always monitor disk usage and alert at 80%. A separate /var partition prevents log growth from killing the root filesystem.
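The 80% alert can be a one-liner around df. A minimal sketch, assuming POSIX sh, df, and awk (the function name check_usage is made up for illustration):

```shell
# check_usage THRESHOLD: print every filesystem above THRESHOLD percent full.
check_usage() {
    df -P | awk -v t="$1" '
        NR > 1 {
            use = $5; sub(/%/, "", use)        # "42%" -> 42
            if (use + 0 > t) print $6 " at " use "%"
        }'
}

check_usage 80    # wire this into cron or your monitoring agent
```

df -P forces the portable single-line output format so awk field numbers stay stable.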

Core Concepts

1. The Storage Stack

Application I/O
  |
VFS (Virtual File System)
  |
Filesystem (ext4, xfs, btrfs)
  |
Device Mapper / LVM / mdadm (optional)
  |
Block Device (/dev/sda, /dev/nvme0n1)
  |
Hardware (SATA, SAS, NVMe, virtio)

Every layer adds capability and complexity. Know which layers exist on your system before touching anything.

2. Block Devices and Partitions

Block devices represent raw storage in /dev/. Use lsblk to see the hierarchy:

$ lsblk
NAME         MAJ:MIN RM  SIZE TYPE MOUNTPOINT
sda            8:0    0   50G disk
├─sda1         8:1    0    1G part /boot
└─sda2         8:2    0   49G part
  └─vg0-root 253:0    0   49G lvm  /

Partitioning tools:

  • fdisk: interactive, handles both MBR and GPT
  • gdisk: GPT-only, safer for modern disks
  • parted: scriptable, handles both MBR and GPT

GPT supports disks larger than 2TB and 128 partitions by default. MBR is legacy: 2TB maximum, 4 primary partitions.

Always use GPT on new systems unless hardware requires MBR.

3. Filesystems

Timeline: ext4 was released in 2008, evolving from ext3 (2001) → ext2 (1993) → ext (1992, by Rémy Card for Linux). XFS was created by SGI in 1993 for IRIX (their Unix workstation OS) and ported to Linux in 2001. Red Hat chose XFS as the default for RHEL 7 (2014) because of its superior performance with large files and parallel I/O — particularly important for enterprise databases and media workloads.

ext4: default on most Linux distros. Journaled, stable, well-understood. Supports up to 1EB volume, 16TB file. Good general purpose choice.

XFS: excels at large files and parallel I/O. Default on RHEL/CentOS. Can grow online but cannot shrink. Better than ext4 for databases and media.

Btrfs: copy-on-write filesystem with built-in snapshots, checksums, compression, and subvolumes. More features but less battle-tested in enterprise than ext4/XFS.

Creating a filesystem:

mkfs.ext4 /dev/sda1
mkfs.xfs /dev/sdb1
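mkfs also accepts a regular file, which makes a safe sandbox for experimenting. A sketch assuming e2fsprogs (mkfs.ext4, dumpe2fs) is installed; the image path is arbitrary:

```shell
# Make an ext4 filesystem on a plain file instead of a real disk.
# -F is required because the target is not a block device; -q quiets output.
truncate -s 64M /tmp/fs.img
mkfs.ext4 -F -q /tmp/fs.img

# Inspect the superblock without mounting anything.
dumpe2fs -h /tmp/fs.img | head -5
```

The image can then be mounted with `mount -o loop` (root required) to poke around.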

4. Mounting and fstab

Gotcha: A bad /etc/fstab entry can prevent your system from booting. If the UUID is wrong or the device does not exist, systemd will either drop to emergency mode or hang waiting for the mount. Always use mount -a to test fstab changes before rebooting. Adding nofail to the mount options for non-critical filesystems allows the system to boot even if that mount fails.

Mount attaches a filesystem to a directory:

mount /dev/sda1 /mnt/data

Persistent mounts go in /etc/fstab. Always use UUIDs (from blkid), never /dev/sdX device names which can change between reboots.

# /etc/fstab
UUID=abc-123  /data  ext4  defaults,noatime  0  2

Fields: device, mountpoint, type, options, dump, fsck order. Use mount -a to test fstab changes without rebooting. findmnt shows the current mount tree.
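The six fields are plain whitespace-separated columns, so they are easy to pick apart with awk. A minimal sketch; the function name parse_fstab is made up, and it reads a sample file here rather than /etc/fstab so it is safe to run anywhere:

```shell
# parse_fstab FILE: print the six fields of each real entry,
# skipping comments and blank lines.
parse_fstab() {
    awk '!/^[[:space:]]*(#|$)/ {
        printf "%s on %s type %s (%s) dump=%s fsck=%s\n",
               $1, $2, $3, $4, $5, $6
    }' "$1"
}

cat > /tmp/fstab.sample <<'EOF'
# comment line
UUID=abc-123  /data  ext4  defaults,noatime  0  2
EOF

parse_fstab /tmp/fstab.sample
# prints: UUID=abc-123 on /data type ext4 (defaults,noatime) dump=0 fsck=2
```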

5. LVM (Logical Volume Manager)

LVM adds a flexible abstraction layer:

Physical Volume (PV)  ->  Volume Group (VG)
  ->  Logical Volume (LV)  ->  Filesystem

Key operations:

pvcreate /dev/sdb              # init a PV
vgcreate vg_data /dev/sdb      # create VG
lvcreate -L 20G -n lv_app vg_data  # create LV
mkfs.ext4 /dev/vg_data/lv_app  # make filesystem
lvextend -L +10G /dev/vg_data/lv_app  # grow LV
resize2fs /dev/vg_data/lv_app   # grow FS (ext4)
xfs_growfs /mount/point         # grow FS (XFS)

LVM enables online resizing, snapshots, and spanning multiple physical disks into one logical volume. Check status: pvs, vgs, lvs.

Remember: The LVM layer mnemonic: "PVG" — Physical Volumes (raw disks/partitions), Volume Groups (pools of PVs), Logical Volumes (the slices you format and mount). Data flows PV → VG → LV. The pvs, vgs, lvs commands mirror this hierarchy. To extend a filesystem: lvextend first (grow the LV), then resize2fs or xfs_growfs (grow the filesystem to fill the LV). lvextend -r (--resizefs) combines both steps by resizing the filesystem automatically.

6. Disk Monitoring

df: filesystem usage

df -h          # human-readable sizes
df -i          # inode usage (can fill before space)

du: directory-level usage

du -sh /var/log/*  # size of each item in /var/log
du -h --max-depth=1 /  # top-level directory sizes

iostat: I/O throughput and latency

iostat -xz 1    # extended stats, every 1 second
Watch for high await (I/O latency) and %util values.
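Column positions in iostat -x output differ across sysstat versions, so a robust filter should locate the await columns from the header rather than hard-coding field numbers. A sketch; the function name flag_await and the sample numbers are made up for illustration:

```shell
# flag_await THRESHOLD: read `iostat -x` style output on stdin and
# print any device whose *await column exceeds THRESHOLD (ms).
flag_await() {
    awk -v t="$1" '
        /^Device/ {
            n = 0
            for (i = 1; i <= NF; i++)
                if ($i ~ /await/) { col[i] = $i; n++ }
            next
        }
        n && NF {
            for (i in col)
                if ($i + 0 > t) printf "%s: %s=%s\n", $1, col[i], $i
        }'
}

flag_await 20 <<'EOF'
Device  r/s  w/s  r_await  w_await  %util
sda     1.0  2.0     4.20    35.10   9.8
nvme0n1 9.0  3.0     0.30     0.45   2.1
EOF
# prints: sda: w_await=35.10
```

In practice you would pipe live data in: `iostat -xz 1 | flag_await 20`.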

SMART monitoring: predict disk failures

smartctl -a /dev/sda      # full SMART data
smartctl -H /dev/sda      # quick health check
Install smartmontools. Watch for reallocated sectors, pending sectors, and uncorrectable errors.

7. NFS Basics

Name origin: NFS was developed by Sun Microsystems in 1984 and quickly became the de facto standard for Unix file sharing. The original NFS (v2) was stateless by design — the server kept no per-client state, making it simple but limited. NFSv4 (2000, RFC 3010) added statefulness, strong security (Kerberos), and simplified firewall configuration by consolidating everything to TCP port 2049.

NFS (Network File System) shares directories over the network. Server exports a directory, clients mount it.

Server (/etc/exports):

/shared  192.168.1.0/24(rw,sync,no_subtree_check)

Client:

mount -t nfs server:/shared /mnt/nfs

Persistent: add to fstab with nfs type. After editing /etc/exports, apply the changes on the server with exportfs -ra. Use showmount -e server to list exports. NFSv4 uses TCP port 2049 only (simpler firewalling than v3).
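A persistent client mount in fstab might look like the following sketch (the mount options are illustrative, not mandatory; _netdev delays the mount until networking is up, and nofail keeps a dead server from blocking boot):

```
# /etc/fstab -- example NFS client entry
server:/shared  /mnt/nfs  nfs  defaults,_netdev,nofail  0  0
```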

8. Troubleshooting Full Disks

When a disk is full:

  1. Find what is consuming space:

    du -h --max-depth=1 / | sort -rh | head -20
    

  2. Check for deleted-but-open files holding space:

    lsof +L1  # files with zero link count still open
    
    Restarting the process releases the space.

  3. Check inode exhaustion (many tiny files):

    df -i
    find / -xdev -type d -size +1M  # large directories
    

  4. Common culprits: /var/log (runaway logs), /tmp (orphaned temp files), container storage, package manager cache.

  5. Emergency: truncate -s 0 /var/log/bigfile.log (zeroes the file without removing it, so processes with open handles keep working).
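The steps above can be rolled into one quick report. A sketch, assuming GNU df/du/sort; the function name disk_triage is made up, and lsof is guarded in case it is not installed:

```shell
# disk_triage [MOUNTPOINT]: one-shot disk-full report (default /).
disk_triage() {
    target="${1:-/}"
    echo "== space =="
    df -hP "$target"
    echo "== inodes =="
    df -iP "$target"
    echo "== largest top-level dirs =="
    du -xh --max-depth=1 "$target" 2>/dev/null | sort -rh | head -10
    echo "== deleted-but-open files =="
    if command -v lsof >/dev/null 2>&1; then
        lsof +L1 2>/dev/null | head -10
    fi
}

disk_triage /var
```

Permission errors from du are suppressed because a non-root run will always hit some unreadable directories.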

What Experienced People Know

  • Device names (/dev/sda) are not stable across reboots. Always use UUIDs or labels.
  • xfs_growfs works on the mount point, not the device. resize2fs works on the device.
  • XFS cannot be shrunk. Plan capacity accordingly.
  • Check inodes (df -i) not just space. A filesystem with free space but no free inodes is effectively full.
  • LVM snapshots consume space as the original volume changes. An old snapshot can fill the VG.
  • lsof +L1 finds deleted files still holding disk space. This is the most common "disk full but du shows plenty of space" mystery.
  • NFS mounts can hang processes. Use soft,timeo=10 (timeo is in tenths of a second) or autofs for non-critical mounts.
  • iostat await > 10ms on SSDs or > 20ms on HDDs usually indicates a performance problem.
  • SMART warnings are predictive, not definitive. Replace disks showing reallocated sector growth before they fail completely.
  • noatime mount option reduces unnecessary writes and improves performance on most workloads.
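The deleted-but-open behavior behind lsof +L1 is easy to reproduce safely. A sketch, Linux-specific because it relies on /proc:

```shell
# Reproduce "disk full but du finds nothing" in miniature.
f=$(mktemp)
exec 3>"$f"        # open file descriptor 3 on the file
rm "$f"            # unlink: the name is gone, but fd 3 keeps the data alive
echo "still writable" >&3

# The kernel marks the open-but-unlinked file as deleted:
readlink /proc/$$/fd/3    # shows something like "/tmp/tmp.abc123 (deleted)"

exec 3>&-          # closing the last fd is what actually frees the space
```

This is exactly why restarting (or reloading) the process that holds a deleted log file is what releases the disk space, not deleting the file again.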