Linux Storage Street Ops

Emergency Disk Space Recovery

When a filesystem is at 100%, you need space NOW. Here is the priority order:

Step 1: Find what's eating space

df -h                           # which filesystem is full
du -xsh /* 2>/dev/null | sort -rh | head -20      # biggest top-level dirs (-x: stay on this filesystem)
du -xsh /var/* 2>/dev/null | sort -rh | head -20  # usually /var is the culprit

Step 2: Quick wins

# Clear package manager cache
apt-get clean                   # Debian/Ubuntu
yum clean all                   # RHEL/CentOS
dnf clean all                   # Fedora

# Clear journal logs
journalctl --vacuum-size=100M

# Find large files
find / -xdev -type f -size +100M -exec ls -lh {} \; 2>/dev/null

# Find old log files
find /var/log -type f -name "*.gz" -mtime +30 -delete
find /var/log -type f -name "*.old" -delete
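Before running either -delete line on a box you care about, preview the hit list with -print using the same predicates. A minimal sketch (the preview_old_logs helper is illustrative, not a standard tool):

```shell
#!/bin/sh
# Preview which compressed logs a cleanup would remove, without deleting.
# Swap -print for -delete only after the list looks right.
preview_old_logs() {  # args: directory, age in days
    find "$1" -type f -name '*.gz' -mtime "+$2" -print 2>/dev/null
}

preview_old_logs /var/log 30
```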

# Truncate active log files (don't delete -- process has open file handle)
> /var/log/large-active.log

Gotcha: du and df can disagree significantly. df reports filesystem-level usage (includes deleted-but-open files). du walks the directory tree (misses deleted files held open by processes). If df shows 95% full but du -sh / only accounts for 60%, the gap is deleted files still held open. Check with lsof +L1.
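The gap itself is easy to compute once you have the two numbers. A sketch (hidden_space_kib is an illustrative name, not a real tool):

```shell
#!/bin/sh
# df-used minus du-accounted, both in KiB. A large positive result suggests
# deleted-but-open files; confirm with `lsof +L1`.
hidden_space_kib() {  # args: df_used_kib du_total_kib
    echo $(( $1 - $2 ))
}

# Usage on the suspect filesystem (du of / can take a while):
#   df_used=$(df -k --output=used / | tail -1)
#   du_total=$(du -skx / 2>/dev/null | cut -f1)
#   hidden_space_kib "$df_used" "$du_total"
```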

Step 3: The deleted-but-open-file trick

# Files deleted but held open by processes still consume space
lsof +L1   # shows deleted files still open
# Either restart the process or:
> /proc/<pid>/fd/<fd_number>   # truncate the open fd

Step 4: Reserved blocks (ext4 only)

# ext4 reserves 5% for root. On a 1TB disk, that's 50GB.
tune2fs -m 1 /dev/sdX1   # reduce to 1% (safe for data volumes, not root)
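To see what the reserve currently costs, read the two numbers from tune2fs -l and multiply (reserved_gib below is just illustrative arithmetic, not a standard tool):

```shell
#!/bin/sh
# Space consumed by the ext4 reserve: reserved block count x block size.
reserved_gib() {  # args: reserved_block_count block_size_bytes
    echo $(( $1 * $2 / 1024 / 1024 / 1024 ))
}

# Get the inputs (requires root):
#   tune2fs -l /dev/sdX1 | grep -E 'Reserved block count|Block size'
```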

LVM Operations: The Safety Playbook

Before any LVM change: snapshot first

lvcreate -L 5G -s -n backup_snap /dev/vg0/data
# Now you have a rollback point. If anything goes wrong:
lvconvert --merge /dev/vg0/backup_snap
# (the merge is deferred until the LV is next activated -- e.g. a reboot -- if it is in use)

Extending a logical volume (the safe way)

# 1. Check available space
vgs                     # look at VFree column
# 2. Extend the LV
lvextend -L +10G /dev/vg0/data
# 3. Resize the filesystem
resize2fs /dev/vg0/data     # ext4
xfs_growfs /mountpoint       # XFS (note: takes mountpoint, not device)
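If you'd rather not remember step 3, lvextend can resize the filesystem in the same operation: the -r (--resizefs) flag hands off to fsadm, which handles both ext4 and XFS:

```shell
# Extend the LV and grow the filesystem in one step:
lvextend -r -L +10G /dev/vg0/data
```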

Critical: XFS can only grow, never shrink. ext4 can do both (but shrinking requires unmounting).

Adding a new disk to LVM

pvcreate /dev/sdb
vgextend vg0 /dev/sdb
# Now VFree increases. Extend LVs as needed.

Reducing an LV (ext4 only, dangerous)

# MUST unmount first
umount /mountpoint
e2fsck -f /dev/vg0/data           # mandatory fsck first
resize2fs /dev/vg0/data 50G       # shrink filesystem first
lvreduce -L 50G /dev/vg0/data     # then shrink LV
mount /mountpoint

Never shrink the LV before shrinking the filesystem. You will lose data.

War story: An admin ran lvreduce before resize2fs on a 2 TB production volume. The LV shrank, cutting off the last 500 GB of filesystem data. The filesystem was instantly corrupt. Recovery required a full restore from backup — 14 hours of downtime. The correct order (shrink filesystem first, then LV) exists because the filesystem needs to relocate data out of the space being removed.
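A cheap guard against repeating that war story: before running lvreduce, verify the target LV size still covers the already-shrunk filesystem. A sketch with sizes in bytes (safe_to_reduce is an illustrative helper):

```shell
#!/bin/sh
# Refuse to shrink the LV below the filesystem's current size.
safe_to_reduce() {  # args: fs_size_bytes new_lv_size_bytes
    if [ "$2" -ge "$1" ]; then echo yes; else echo no; fi
}

# ext4 filesystem size = Block count x Block size (requires root):
#   dumpe2fs -h /dev/vg0/data 2>/dev/null | grep -E '^Block (count|size)'
```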

fstab: The File That Can Brick Your Server

Safe fstab practices

# Always use UUID
UUID=abc123-def456 /data ext4 defaults 0 2

# For non-critical mounts, use nofail
UUID=abc123-def456 /data ext4 defaults,nofail 0 2

# For NFS and network mounts
server:/share /mnt/nfs nfs defaults,_netdev,nofail 0 0

Testing fstab changes before reboot

# After editing fstab:
mount -a         # try to mount everything
echo $?          # 0 = success
# Also test with:
findmnt --verify # checks fstab syntax
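mount -a and findmnt --verify catch syntax problems, but not a missing nofail. A quick audit sketch (missing_nofail is illustrative, not a standard tool):

```shell
#!/bin/sh
# Print mount points that are not / and lack nofail -- each one can stall
# or fail the boot if its device is absent. Skips comments and short lines.
missing_nofail() {  # arg: path to an fstab-format file
    awk '$0 !~ /^#/ && NF >= 4 && $2 != "/" && $2 != "none" && $4 !~ /nofail/ { print $2 }' "$1"
}

[ -r /etc/fstab ] && missing_nofail /etc/fstab
```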

Recovering from a bad fstab entry

If the system won't boot:

  1. Boot to rescue/single-user mode.
  2. The root filesystem may be read-only. Remount it: mount -o remount,rw /
  3. Fix /etc/fstab.
  4. Reboot.

Alternative: Add nofail to every non-root entry. System will boot even if mounts fail.

Mount Options That Matter

Option             Effect                                 When to use
noatime            Don't update access time on reads      Almost always -- reduces write I/O significantly
nodiratime         Don't update directory access times    Implied by noatime
relatime           Update atime only if older than mtime  Default on modern kernels, good middle ground
discard            Enable continuous TRIM for SSDs        SSDs only. Or use fstrim.timer instead (preferred)
nobarrier          Disable write barriers                 ONLY if you have battery-backed cache. Data loss risk.
data=journal       Journal data too (ext4)                Maximum safety, significant performance cost
data=writeback     Don't journal data (ext4)              Maximum performance, risk of stale data on crash
noquota            Disable quota tracking                 When quotas not needed and you want less overhead
errors=remount-ro  Remount read-only on error             Default for ext4, good safety net

Performance-relevant defaults

For a data volume on SSD:

UUID=xxx /data ext4 defaults,noatime 0 2
Plus enable the fstrim timer:
systemctl enable --now fstrim.timer

For a data volume on spinning disk with heavy writes:

UUID=xxx /data xfs defaults,noatime,logbufs=8 0 2

Filesystem Checks and Recovery

ext4

# Never fsck a mounted filesystem
umount /dev/sdX1
e2fsck -f /dev/sdX1        # force check even if clean
e2fsck -p /dev/sdX1        # automatic repair (safe fixes only)
e2fsck -y /dev/sdX1        # answer yes to everything (dangerous but sometimes necessary)

XFS

umount /dev/sdX1
xfs_repair /dev/sdX1       # standard repair
xfs_repair -L /dev/sdX1    # zero the log (last resort, possible data loss)

When you can't unmount

# Check what's using the mount
fuser -mv /mountpoint
lsof +D /mountpoint
# Then kill the processes or use lazy unmount:
umount -l /mountpoint      # detaches the mount immediately; cleanup happens once no process still uses it

NVMe Specifics

NVMe naming is different:

/dev/nvme0n1       # first NVMe controller, first namespace
/dev/nvme0n1p1     # first partition
/dev/nvme1n1       # second controller

Key NVMe commands:

nvme list                    # show NVMe devices
nvme smart-log /dev/nvme0n1  # SMART health data
nvme id-ctrl /dev/nvme0n1    # controller identification

Watch for:

  • Percentage Used in SMART data -- SSDs wear out.
  • Temperature warnings.
  • Media Errors -- any non-zero value is concerning.

Debug clue: NVMe drives report "Percentage Used" as a lifetime wear indicator. At 100%, the drive has used its rated write endurance — but it may keep working. At 200%, you are on borrowed time. Monitor this value monthly and plan replacements before 80%. Check with nvme smart-log /dev/nvme0n1 | grep percentage_used.
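The monthly check is easy to script. A triage sketch using the rule of thumb above (wear_status and its thresholds are illustrative; tune them to your fleet):

```shell
#!/bin/sh
# Triage a drive from its percentage_used value (integer).
wear_status() {  # arg: percentage_used
    if   [ "$1" -ge 100 ]; then echo REPLACE
    elif [ "$1" -ge 80  ]; then echo PLAN-REPLACEMENT
    else                        echo OK
    fi
}

# Example (field name as printed by nvme-cli):
#   pct=$(nvme smart-log /dev/nvme0n1 | awk -F: '/percentage_used/ {gsub(/[ %]/, "", $2); print $2}')
#   wear_status "$pct"
```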

mdadm Software RAID

Check RAID status

cat /proc/mdstat           # quick status
mdadm --detail /dev/md0    # detailed status

Common RAID states

  • clean/active: Normal operation.
  • degraded: A disk has failed. REPLACE IMMEDIATELY.
  • rebuilding/recovering: A replacement disk is syncing. Performance will be reduced.

Replace a failed disk

mdadm --manage /dev/md0 --remove /dev/sdb1    # remove failed disk
# Physically replace the disk
mdadm --manage /dev/md0 --add /dev/sdb1        # add replacement
# Rebuild starts automatically. Monitor with:
watch cat /proc/mdstat
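For monitoring, a degraded array can be spotted straight from /proc/mdstat: the member map ([UU], [U_], ...) shows an underscore for each missing disk. A parsing sketch (degraded_arrays is an illustrative helper):

```shell
#!/bin/sh
# Print the name of every md array whose member map contains an underscore.
degraded_arrays() {  # arg: path to an mdstat-format file
    awk '/^md/ { name = $1 } /\[[U_]*_[U_]*\]/ { print name }' "$1"
}

[ -r /proc/mdstat ] && degraded_arrays /proc/mdstat
```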

Decision Tree: Which Filesystem?

Do you need snapshots/checksums?
  Yes -> btrfs (or ZFS if you can handle licensing)
  No  -> Might you need to shrink the filesystem later?
           Yes -> ext4
           No  -> Will you have very large files (>1TB)?
                    Yes -> XFS
                    No  -> either ext4 or XFS (both fine)

Default answer: ext4 for root and small volumes, XFS for large data volumes.

Failure Modes You Must Recognize

Symptom                                              Likely Cause
"No space left on device" but df shows free space    Inode exhaustion: df -i to check
Filesystem suddenly read-only                        Disk errors detected, errors=remount-ro triggered. Check dmesg for I/O errors
mount: wrong fs type, bad option                     Filesystem corrupted or wrong filesystem type specified
LV extend worked but df shows old size               Forgot to resize the filesystem after extending the LV
System won't boot after adding fstab entry           Missing nofail option on a mount that doesn't come up
device is busy on umount                             Process has open files on the mount. Use fuser -mv
RAID degraded after reboot                           Check which disk failed: mdadm --detail. dmesg | grep error

Heuristics

  1. Always snapshot before LVM operations. The 30 seconds to create a snapshot saves hours of data recovery.
  2. Never resize a production filesystem without testing the exact steps on a non-production system first.
  3. Use lsblk before every disk operation to confirm you're touching the right device. There is no undo for mkfs on the wrong device.
  4. XFS cannot shrink. If you might need to shrink, use ext4.
  5. nofail is your friend. Add it to every non-root fstab entry on servers.
  6. TRIM: use fstrim.timer, not discard mount option. Continuous discard adds latency to every delete. Weekly fstrim batches it.
  7. Check inode usage alongside block usage. Millions of tiny files can exhaust inodes with plenty of block space remaining.
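Heuristic 7 is scriptable. A sketch that flags filesystems above 90% inode usage (inode_pressure is an illustrative name, and 90% is an arbitrary alert threshold):

```shell
#!/bin/sh
# Read `df -i` output on stdin and print mount points above 90% inode use.
inode_pressure() {
    awk 'NR > 1 { sub(/%/, "", $5); if ($5 + 0 > 90) print $6, $5 "%" }'
}

df -i 2>/dev/null | inode_pressure
```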

Power One-Liners

Disk usage — biggest directories first

du -xh --max-depth=2 / 2>/dev/null | sort -rh | head -30

Breakdown: -x stays on one filesystem. --max-depth=2 limits how deep du descends. sort -rh sorts human-readable sizes descending. Stderr is redirected to drop permission-denied noise.

[!TIP] When to use: Root filesystem filling up — need to find where the space went, fast.

Tar with encryption over SSH (compressed, bandwidth-limited)

tar czf - /critical/data | openssl enc -aes-256-cbc -pbkdf2 | ssh backup-host 'cat > /backups/data.tar.gz.enc'

[!TIP] When to use: Encrypted backup to remote host without intermediate plaintext on disk.

Tar transfer between two hosts through your machine

ssh host1 'tar czf - /data' | ssh host2 'tar xzf - -C /data'

Breakdown: Host1 tars and streams to your machine's stdout, your machine's stdin pipes to host2 which extracts. Your machine is the pipe — data never touches your disk.

[!TIP] When to use: Migrating data between hosts that can't talk directly to each other.

See Also