
Mounts & Filesystems Footguns

Mistakes that render servers unbootable, cause data loss, or create hung processes.


1. Adding an fstab entry without nofail

You add a new mount to /etc/fstab. You forget nofail. The server reboots. The disk is missing or the NFS server is down. systemd blocks boot waiting for the mount. The server never comes up. You need console access to fix a one-word omission.

Fix: Always add nofail for non-root mounts. Always run mount -a to test fstab changes before rebooting.

Gotcha: On systemd-based systems, a missing nofail mount causes the .mount unit to enter failed state, which blocks local-fs.target, which blocks multi-user.target. The entire boot chain stalls. With nofail, the mount unit can fail without blocking dependent targets.
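
A sketch of the pattern, with a hypothetical data disk (the UUID, mount point, and 10-second timeout are placeholder choices):

```shell
# /etc/fstab -- hypothetical entry; UUID and mount point are made up
# nofail: boot continues even if the device is absent
# x-systemd.device-timeout=10s: wait 10s for the device instead of systemd's 90s default
UUID=0c5a6b1e-aaaa-bbbb-cccc-1234567890ab  /mnt/data  ext4  defaults,nofail,x-systemd.device-timeout=10s  0  2
```

After editing, sudo findmnt --verify checks fstab for syntax and target errors (util-linux 2.29 or later), and sudo mount -a attempts every fstab mount immediately, before a reboot can surprise you.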


2. Using /dev/sdX instead of UUID in fstab

You add /dev/sdb1 to fstab. A new disk is added to the server. On the next boot, disk ordering changes. /dev/sdb1 is now a different disk. The mount either fails or mounts the wrong filesystem. Data corruption or boot failure follows.

Fix: Always use UUID= in fstab. Find the UUID with blkid. Device names are not stable across reboots.
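
Finding the stable identifier takes one command (device names and paths below are illustrative):

```shell
# Print each block device with its UUID -- the value to copy into fstab
lsblk -o NAME,UUID,FSTYPE,MOUNTPOINT
# Or query one device directly:
blkid /dev/sdb1
# Then reference it in fstab as:
#   UUID=<uuid-from-above>  /mnt/data  ext4  defaults,nofail  0  2
```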


3. NFS mount without _netdev in fstab

You add an NFS mount to fstab without _netdev. On boot, systemd tries to mount the NFS share before the network is up. The mount fails or hangs. If nofail is also missing, the server never finishes booting.

Fix: NFS entries in fstab always need both _netdev (wait for network) and nofail (do not block boot on failure).
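
A sketch of a complete entry, with a hypothetical server and export path:

```shell
# /etc/fstab -- hypothetical NFS mount
# _netdev: systemd orders the mount after network-online.target
# nofail:  a dead server degrades one service, not the whole boot
nfs1.example.com:/export/data  /mnt/nfs  nfs  defaults,_netdev,nofail  0  0
```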


4. Hard NFS mount without timeout awareness

The default NFS mount is hard — retries indefinitely until the server responds. Your NFS server goes down. Every process that touches the mount enters D state (uninterruptible sleep). kill -9 does not work on D state processes. The host becomes progressively unusable.

Fix: Use hard,timeo=300,retrans=3,bg for fstab mounts (note that timeo is in tenths of a second, so timeo=300 is a 30-second retry interval). Monitor NFS server availability. Consider soft only for non-critical, read-mostly mounts where data loss from incomplete writes is acceptable.

War story: Red Hat documented cases where clients with hard NFS mounts accumulated hundreds of D-state processes after an NFS server became unreachable. kill -9 cannot help: the kernel will not interrupt uninterruptible sleep. Recovery meant either restoring the NFS server or rebooting every client. One creative workaround: bring up a stand-in NFS server on the same IP that rejects requests; once the clients finally get responses, the stuck processes are freed.
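
One quick way to spot the symptom is to list processes in uninterruptible sleep; a sudden pile-up of entries whose command names match whatever touches the mount is the classic signature:

```shell
# Processes in state D (uninterruptible sleep). Normally this list is empty
# or nearly so -- dozens of entries points at stuck I/O, often a dead NFS server.
ps -eo state,pid,comm | awk '$1 == "D"'
```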


5. Lazy unmount hiding the real problem

You cannot unmount a filesystem. Instead of finding and stopping the processes using it, you run umount -l. The mount disappears from the namespace, but the filesystem is still in use. The underlying device cannot be safely removed. Later operations on the device cause errors or data corruption.

Fix: Use fuser -vm /mnt/data to identify processes. Stop them properly. Only use umount -l when you understand the consequences and have no better option.
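
The investigation usually takes under a minute (paths are placeholders; lsof is shown as an alternative if fuser is unavailable):

```shell
# Who is holding the mount busy?
fuser -vm /mnt/data        # PIDs, owners, and access type (cwd, open file, mmap)
lsof +f -- /mnt/data       # same question, one line per open file
# Stop the offending services or processes, then unmount normally:
umount /mnt/data
```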


6. Running fsck on a mounted filesystem

You run fsck /dev/sdb1 while the filesystem is mounted read-write. fsck modifies filesystem structures based on its own snapshot of the metadata, while the kernel continues to modify the same structures through its cache. The result is filesystem corruption.

Fix: Always unmount before running fsck, or at minimum remount read-only: mount -o remount,ro /mnt/data && fsck /dev/sdb1. For the root filesystem, boot into recovery mode or a live environment and run fsck from there.

Under the hood: fsck reads and writes filesystem metadata (inodes, block bitmaps, directory entries) directly on the block device. The mounted kernel also reads and writes the same metadata through its VFS cache. Neither knows about the other's changes. The result is not "maybe corruption" — it is guaranteed corruption.
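
A defensive wrapper that refuses to run while the device is mounted writable; a minimal sketch, with the device name as a placeholder:

```shell
dev=/dev/sdb1   # placeholder device
# findmnt prints the mount options if (and only if) the device is mounted
if findmnt --source "$dev" -n -o OPTIONS | grep -qw rw; then
    echo "refusing: $dev is mounted read-write" >&2
    exit 1
fi
fsck -f "$dev"
```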


7. Mounting over a non-empty directory without realizing it

You run mount /dev/sdb1 /var/log. The original /var/log contents are hidden, not deleted. The system continues running but all new logs go to the new filesystem, while the hidden old files still consume space on the parent filesystem, invisible to du. When you unmount, the old logs reappear. During the mount, any log rotation or monitoring that checks file sizes gives wrong data.

Fix: Check the directory before mounting: ls -A /mnt/target (-A catches dotfiles a bare ls misses). If it has contents, verify you intend to hide them. Use a dedicated empty directory for mount points.
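
If a mount is already covering files and you need to see what is underneath, a bind mount of the parent filesystem exposes them without unmounting anything (paths are placeholders; requires root):

```shell
mkdir -p /mnt/shadow
mount --bind / /mnt/shadow     # a second view of the root fs; submounts do not follow
ls /mnt/shadow/var/log         # the original files hidden under the /var/log mount
umount /mnt/shadow
```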


8. Forgetting to persist mount options after live changes

You remount with new options: mount -o remount,noatime /mnt/data. The change takes effect immediately but is not in fstab. On reboot, the old options return. Performance tuning or security hardening silently reverts.

Fix: After testing mount options live, update fstab to match. Verify with findmnt -o TARGET,OPTIONS /mnt/data and compare to the fstab entry.
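
A quick drift check; a sketch assuming /mnt/data is the mount in question:

```shell
# Options in effect right now:
findmnt -n -o OPTIONS /mnt/data
# Options fstab will apply on the next boot
# (fstab field 2 = mount point, field 4 = options):
awk '$2 == "/mnt/data" { print $4 }' /etc/fstab
```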


9. Setting noexec on a filesystem that needs to run scripts

You mount /tmp with noexec for security hardening. Package managers (apt, yum) and build tools that execute scripts from /tmp start failing with "Permission denied." The error message does not mention noexec — it looks like a permissions issue.

Fix: Before adding noexec, check what uses the filesystem. apt and dpkg extract and run scripts in /tmp. Either move their temp directory or accept the tradeoff. Same applies to /var if build tools or CI agents run there.
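
The symptom is easy to reproduce on a throwaway tmpfs (requires root; paths are placeholders):

```shell
mkdir -p /mnt/noexec-demo
mount -t tmpfs -o noexec tmpfs /mnt/noexec-demo
printf '#!/bin/sh\necho hello\n' > /mnt/noexec-demo/run.sh
chmod +x /mnt/noexec-demo/run.sh
/mnt/noexec-demo/run.sh    # "Permission denied" -- the error never mentions noexec
sh /mnt/noexec-demo/run.sh # works: the shell binary executes; the script is only read
umount /mnt/noexec-demo
```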


10. Not monitoring filesystem health and mount state

A filesystem remounts read-only due to disk errors. The kernel logs the issue, but nobody watches kernel logs. The application continues running but all writes fail silently. Data loss accumulates for hours before someone notices.

Fix: Monitor for read-only remounts: check /proc/mounts for ro where you expect rw. Alert on kernel messages containing EXT4-fs error or XFS error. Include mount state in your health checks.

Debug clue: dmesg | grep -i "remount" shows when a filesystem went read-only. The errors=remount-ro mount option (ext4 default) triggers this behavior. XFS does not remount read-only — it shuts down the filesystem entirely, which is even more disruptive.
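
A minimal check that could run from cron or a health endpoint; a sketch that takes the mounts table as an argument so it can be tested against a saved copy:

```shell
# Print every mount point whose options include "ro".
# /proc/mounts fields: device, mountpoint, fstype, options, dump, pass.
check_ro() {
    awk '$4 ~ /(^|,)ro(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}
# Usage: check_ro    -- on a healthy server, prints only mounts you expect to be ro
```

The regex anchors on commas so that options like errors=remount-ro do not trigger a false positive.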