Mounts & Filesystems — Trivia & Interesting Facts¶
Surprising, historical, and little-known facts about Linux mounts and filesystems.
The mount namespace is what makes containers possible¶
Mount namespaces, added in Linux 2.4.19 (2002), were the first namespace type implemented in the kernel. They give each container its own view of the filesystem tree. Docker, LXC, and the container runtimes that Kubernetes drives all rely on mount namespaces to give containers isolated root filesystems. Together with the later PID, network, and user namespaces and with cgroups, this feature is what makes Linux containers as we know them possible.
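The kernel exposes each process's mount namespace as a symbolic link under /proc/<pid>/ns/mnt, so two processes can be compared by the identifier in that link. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names are illustrative and not part of any library.

```python
import os
import re
from typing import Tuple


def parse_ns_link(link_target: str) -> Tuple[str, int]:
    """Split a link target such as 'mnt:[4026531840]' into (type, inode)."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", link_target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link_target!r}")
    return match.group(1), int(match.group(2))


def mount_ns_id(pid: str = "self") -> int:
    """Return the mount-namespace identifier of a process (Linux only)."""
    _ns_type, inode = parse_ns_link(os.readlink(f"/proc/{pid}/ns/mnt"))
    return inode


def same_mount_namespace(pid_a: str, pid_b: str) -> bool:
    """True when both processes live in the same mount namespace."""
    return mount_ns_id(pid_a) == mount_ns_id(pid_b)
```

Reading another process's namespace link usually requires sufficient privileges over that process, so comparing a shell on the host against a containerized process is typically done as root.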
/proc, /sys, and /dev are not on disk¶
These are virtual filesystems synthesized by the kernel at runtime. /proc (procfs) exposes process and kernel information. /sys (sysfs) exposes device and driver data. /dev (devtmpfs) provides device nodes. None of them consume disk space, and their contents change dynamically. A fresh Linux boot creates thousands of virtual files before any real filesystem is mounted.
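A quick way to confirm which mounts are kernel-synthesized is to filter the mount table by filesystem type. This sketch parses text in the /proc/mounts format; the sample table is invented for illustration, and the set of types is limited to the three named above.

```python
VIRTUAL_FS_TYPES = {"proc", "sysfs", "devtmpfs"}


def virtual_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs for kernel-synthesized filesystems."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        _source, mount_point, fs_type = fields[:3]
        if fs_type in VIRTUAL_FS_TYPES:
            found.append((mount_point, fs_type))
    return found


# Hypothetical mount table used only for this example.
SAMPLE = """\
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=4096k,mode=755 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

print(virtual_mounts(SAMPLE))
```

On a live system the same function can be fed the contents of /proc/mounts.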
FUSE lets unprivileged users create filesystems¶
FUSE (Filesystem in Userspace), merged in Linux 2.6.14 (2005), allows non-root users to mount filesystems implemented as regular programs. sshfs (mount remote directories over SSH), rclone mount (mount cloud storage), and NTFS-3G (for many years the standard read-write NTFS driver on Linux, before the in-kernel ntfs3 driver arrived in Linux 5.15) all use FUSE. This was revolutionary because filesystem drivers traditionally required kernel code.
The lost+found directory is created by mkfs and filled by fsck¶
Every ext2/3/4 filesystem has a lost+found directory at its root, created by mke2fs when the filesystem is made. When fsck (filesystem check) finds orphaned files, meaning inodes that are still allocated but have no directory entry pointing to them, it links them here under names derived from their inode numbers (for example #12345). The directory is pre-allocated with extra blocks so that fsck does not need to allocate new space during recovery, which matters when the filesystem is full or damaged.
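Because recovered files are named after their inode numbers, a script can map entries in lost+found back to inodes. The sketch below assumes the "#<inode>" naming that e2fsck uses; the function name is hypothetical.

```python
import re


def inode_from_recovered_name(name):
    """Return the inode number encoded in a name like '#12345', else None."""
    match = re.fullmatch(r"#(\d+)", name)
    return int(match.group(1)) if match else None


print(inode_from_recovered_name("#12345"))
print(inode_from_recovered_name("notes.txt"))
```

Entries that do not follow the pattern return None, so the function can be applied to every name in the directory without special-casing.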
Bind mounts are the simplest and most powerful mount trick¶
mount --bind /source /target makes a directory appear at two places in the filesystem tree simultaneously. No copying occurs: both paths reference the same data. Docker uses bind mounts extensively for volume mapping. Bind mounts can even make a single file appear at a different path, provided the target file already exists: mount --bind /etc/resolv.conf /container/etc/resolv.conf.
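Bind mounts of a subdirectory or single file can often be spotted in /proc/self/mountinfo, where the fourth field records which part of the source filesystem is attached at the mount point. The sketch below treats any entry whose root field is not "/" as a likely bind mount. This is a heuristic under stated assumptions: a bind mount of a filesystem's own root is missed, and Btrfs subvolumes also show a non-"/" root. The sample lines are invented.

```python
def subtree_mounts(mountinfo_text):
    """Return (root, mount_point) pairs whose root field is not '/'."""
    result = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # A valid mountinfo line has a '-' separator before the fs type.
        if len(fields) < 5 or "-" not in fields:
            continue
        root, mount_point = fields[3], fields[4]
        if root != "/":
            result.append((root, mount_point))
    return result


# Hypothetical mountinfo excerpt used only for this example.
SAMPLE_MOUNTINFO = """\
36 25 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
97 36 8:1 /srv/data /mnt/data rw,relatime shared:1 - ext4 /dev/sda1 rw
"""

print(subtree_mounts(SAMPLE_MOUNTINFO))
```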
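The noexec mount option can be bypassed but is still valuable¶
Mounting /tmp with noexec prevents direct execution of files there (./exploit fails with "Permission denied"). Scripts can still be run by passing them to an interpreter, as in bash ./exploit or python3 ./exploit, because the kernel then executes the interpreter rather than the file on the noexec mount. The older trick of invoking the dynamic loader directly (ld-linux.so ./exploit) worked on early kernels but has been blocked since Linux 2.4.25 and 2.6.0. Despite the remaining gap, noexec is still recommended because it stops automated attacks that download a binary and execute it in place. It raises the bar from "script kiddie" to "knows what they're doing."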
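A hardening check can verify that world-writable mount points actually carry the option. This sketch reads text in the /proc/mounts format and reports which of the requested mount points lack noexec; the default list of paths and the sample table are assumptions made for the example.

```python
def missing_noexec(mounts_text, wanted=("/tmp", "/dev/shm")):
    """Return the wanted mount points that are mounted without noexec.

    Paths that are not separate mount points are ignored, because their
    options are inherited from whichever filesystem contains them.
    """
    options_by_mount = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            # Later lines win, since a newer mount shadows an older one.
            options_by_mount[fields[1]] = fields[3].split(",")
    return [
        mount_point
        for mount_point in wanted
        if mount_point in options_by_mount
        and "noexec" not in options_by_mount[mount_point]
    ]


# Hypothetical mount table used only for this example.
SAMPLE_MOUNTS = """\
tmpfs /tmp tmpfs rw,nosuid,nodev,noexec 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

print(missing_noexec(SAMPLE_MOUNTS))
```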
tmpfs can use swap space, which surprises people¶
tmpfs (temporary filesystem in RAM) is not strictly RAM-only. When the system is under memory pressure, tmpfs pages can be swapped to disk just like any other anonymous memory. This means a tmpfs /tmp is not guaranteed to be fast if the system is swapping heavily. The size= option limits tmpfs size (default is half of RAM), but that memory is only consumed as files are written.
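The default limit is simple to estimate from the MemTotal line of /proc/meminfo. The sketch below applies the half-of-RAM rule described above to text in that format; the function name and the sample value are illustrative.

```python
def default_tmpfs_limit_kib(meminfo_text):
    """Estimate the default tmpfs size limit (in KiB) as half of MemTotal."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            total_kib = int(line.split()[1])  # value is reported in kB
            return total_kib // 2
    raise ValueError("MemTotal not found")


# Hypothetical /proc/meminfo excerpt used only for this example.
SAMPLE_MEMINFO = "MemTotal:       16384000 kB\nMemFree:         1024000 kB\n"

print(default_tmpfs_limit_kib(SAMPLE_MEMINFO))
```

The result is an upper bound on what the mount may hold, not memory that is reserved up front.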
Overlay filesystems power most Docker containers¶
OverlayFS (merged in Linux 3.18, 2014) layers a writable "upper" directory on top of one or more read-only "lower" directories. Docker's default overlay2 storage driver uses this to stack container image layers: each image layer is read-only, and the container's writes go to a thin upper layer. This makes container startup nearly instant because the base image is never copied. A lower-layer file is copied up in full the first time it is modified, so only changed files consume additional space.
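The lookup rule behind this layering can be modeled in a few lines. The following toy class is not how the kernel implements OverlayFS; it is a sketch that uses dictionaries as layers to show that reads fall through to the lower layers, writes land only in the upper layer, and deletions are recorded as "whiteout" markers rather than touching the image.

```python
WHITEOUT = object()  # marks a path as deleted in the upper layer


class ToyOverlay:
    def __init__(self, *lower_layers):
        # Lower layers are never modified; the first one has highest priority.
        self.lowers = lower_layers
        self.upper = {}

    def read(self, path):
        if path in self.upper:
            if self.upper[path] is WHITEOUT:
                raise FileNotFoundError(path)
            return self.upper[path]
        for layer in self.lowers:
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # All modifications go to the upper layer only.
        self.upper[path] = data

    def delete(self, path):
        self.read(path)  # raises FileNotFoundError if the path is absent
        self.upper[path] = WHITEOUT


image_layer = {"/etc/hostname": "image"}
container = ToyOverlay(image_layer)
container.write("/etc/hostname", "container")
print(container.read("/etc/hostname"), image_layer["/etc/hostname"])
```

Because the image layer is untouched, any number of ToyOverlay instances could share it, which mirrors how many containers share one set of image layers.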
The automounter mounts filesystems on demand¶
autofs, available since Linux 2.0 (1996), mounts filesystems when a user accesses a directory and unmounts them after a timeout. This is critical in environments with thousands of NFS shares — mounting them all at boot would take minutes and consume resources. ls /nfs/server1 triggers the mount transparently. autofs is controlled by maps that define what to mount where.
Btrfs brought built-in snapshots and checksumming to a mainline Linux filesystem¶
Btrfs (B-tree filesystem), started by Chris Mason at Oracle in 2007, brought ZFS-like features to Linux: copy-on-write, snapshots, built-in RAID, compression, and data checksumming. Snapshots are nearly instant and space-efficient (only storing differences). Despite 15+ years of development, btrfs RAID 5/6 is still not considered production-safe, limiting enterprise adoption.
The "everything is a file" principle extends to block devices¶
In Linux, hard drives (/dev/sda), partitions (/dev/sda1), and even physical memory (/dev/mem, a character device whose access most distributions restrict) are accessible as files. With root privileges you can read a disk's raw bytes with cat /dev/sda | xxd | head, clone an entire disk with dd if=/dev/sda of=/dev/sdb (which overwrites everything on /dev/sdb), or save a partition image with dd if=/dev/sda1 of=backup.img. This file abstraction for hardware was a radical Unix innovation in the 1970s.
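Since a block device behaves like a file, ordinary file I/O is enough to inspect it. As a sketch, the function below reads the first 512-byte sector and checks for the 0x55 0xAA boot signature that an MBR-style sector carries in its last two bytes. The function name is hypothetical; it works the same on a device node such as /dev/sda (root required) or on an image file such as backup.img.

```python
def has_mbr_signature(path):
    """Check bytes 510-511 of the first sector for the 0x55 0xAA signature."""
    with open(path, "rb") as device:
        sector = device.read(512)
    return len(sector) == 512 and sector[510:512] == b"\x55\xaa"
```

The function only reads, so it is safe to point at a real disk, unlike the dd commands that write to a device.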