mergerfs Footguns
Mistakes that cause data loss, broken setups, or confusing behavior with mergerfs.
1. Including the Parity Drive in the mergerfs Pool
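Adding your SnapRAID parity drive to the mergerfs pool is the single most common and destructive mergerfs mistake. mergerfs will write user data to the parity drive, where it sits next to the parity file: those files are not covered by parity, they take space the parity file needs to grow, and the parity file itself becomes visible (and deletable) through the pool. Your SnapRAID protection is silently undermined.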
How it happens: using /mnt/* as the branch glob when the parity drive is also mounted under /mnt/, or using /mnt/disk* when the parity mount point also starts with disk (for example /mnt/disk-parity).
# WRONG -- parity1 is caught by the glob
/mnt/* /storage fuse.mergerfs defaults 0 0
# RIGHT -- only data drives
/mnt/disk* /storage fuse.mergerfs defaults 0 0
Fix: use a naming convention that separates data and parity mount points. Data drives at /mnt/disk1, /mnt/disk2, etc. Parity drives at /mnt/parity1, /mnt/parity2. Then glob only /mnt/disk*.
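As a guard against this mistake, the branch list can be cross-checked against snapraid.conf before mounting. The following is a minimal sketch, not something shipped with mergerfs or SnapRAID: parity_in_pool is a hypothetical helper, and it assumes the branch list is already expanded to literal colon-separated paths, that parity paths contain no whitespace, and that each parity line names a single file sitting directly in the parity drive's mount point.

```shell
#!/bin/sh
# Hypothetical helper (not part of mergerfs or SnapRAID).
# Usage: parity_in_pool BRANCHES SNAPRAID_CONF
#   BRANCHES       colon-separated, already-expanded branch paths
#   SNAPRAID_CONF  path to snapraid.conf
# Returns 0 and prints the offending directory if the directory of a
# parity file is also a pool branch; returns 1 otherwise.
parity_in_pool() {
    branches=$1
    conf=$2
    status=1
    # Parity lines look like "parity /mnt/parity1/snapraid.parity"
    # or "2-parity /mnt/parity2/snapraid.2-parity".
    for parity_file in $(awk '$1 ~ /^([2-6]-)?parity$/ { print $2 }' "$conf"); do
        parity_dir=$(dirname "$parity_file")
        case ":$branches:" in
            *":$parity_dir:"* | *":$parity_dir/:"*)
                echo "parity directory is a pool branch: $parity_dir"
                status=0
                ;;
        esac
    done
    return "$status"
}
```

For example, parity_in_pool /mnt/disk1:/mnt/disk2:/mnt/parity1 /etc/snapraid.conf would flag /mnt/parity1 if the configuration stores a parity file there. The check only catches exact directory matches, so treat a clean result as a hint rather than proof.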
2. Recursive Mount (Mounting mergerfs into Its Own Source)
If the mergerfs mount point is matched by the branch glob (or otherwise overlaps a branch path), the pool becomes one of its own branches and mergerfs ends up reading from itself, which can recurse without end.
How it happens: mounting mergerfs at /mnt/storage while using /mnt/* as the branch glob.
Symptoms: the system hangs, ls on the pool mount point (/mnt/storage in the example above) never returns, or you get I/O errors.
Fix: always mount mergerfs at a path that is NOT inside or overlapping with any branch path. Common safe patterns:
- Branches at /mnt/disk*, pool at /storage
- Branches at /mnt/data/disk*, pool at /media
If you accidentally created a recursive mount, unmount with sudo umount /storage (may require umount -l for a lazy unmount if it is hung). Verify with mount | grep mergerfs.
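The overlap rule can be checked mechanically before mounting. This is a minimal sketch under stated assumptions: pool_overlaps_branch is a hypothetical helper, it compares path strings only (it does not resolve symlinks or bind mounts), and it expects the branch list to be expanded to literal colon-separated paths.

```shell
#!/bin/sh
# Hypothetical helper (string comparison only, no symlink resolution).
# Usage: pool_overlaps_branch POOL BRANCHES
# Returns 0 if POOL equals a branch, sits inside a branch, or contains
# a branch; returns 1 if the paths are disjoint.
pool_overlaps_branch() {
    pool=${1%/}
    rest=$2
    while [ -n "$rest" ]; do
        # Peel the first branch off the colon-separated list.
        branch=${rest%%:*}
        branch=${branch%/}
        case "$rest" in
            *:*) rest=${rest#*:} ;;
            *)   rest= ;;
        esac
        # Pool is the branch itself or lives underneath it.
        case "$pool/" in "$branch"/*) return 0 ;; esac
        # Branch lives underneath the pool.
        case "$branch/" in "$pool"/*) return 0 ;; esac
    done
    return 1
}
```

With this sketch, pool_overlaps_branch /mnt/storage /mnt/disk1:/mnt/storage succeeds (an overlap), while pool_overlaps_branch /storage /mnt/disk1:/mnt/disk2 fails (a safe layout).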
3. Shell Glob Expansion Eating Branch Paths
On the command line, the shell expands * before mergerfs sees it. This means:
# WRONG -- the unquoted * is subject to shell expansion before mergerfs sees it
mergerfs /mnt/disk*:/mnt/ssd /storage
# RIGHT -- escape the glob
mergerfs /mnt/disk\*:/mnt/ssd /storage
# or quote it
mergerfs '/mnt/disk*:/mnt/ssd' /storage
In /etc/fstab there is no shell expansion, so a glob in the branch field, such as /mnt/disk*:/mnt/ssd, works as-is.
Additional gotcha: globbing only occurs at mount time. If you add /mnt/disk5 after mounting with /mnt/disk*, it will NOT automatically appear in the pool. You must either remount or add it via xattr at runtime.
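The difference between a quoted and an unquoted glob can be observed without mergerfs at all. The sketch below uses a scratch directory standing in for /mnt (a hypothetical layout) and the simplest case of a standalone glob argument; count_args is a helper introduced only for this illustration.

```shell
#!/bin/sh
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# Scratch directory standing in for /mnt (hypothetical layout).
demo=$(mktemp -d)
mkdir "$demo/disk1" "$demo/disk2"

# Unquoted: the shell expands the glob into one argument per match.
unquoted=$(count_args "$demo"/disk*)
# Quoted: the pattern is passed through literally as a single argument.
quoted=$(count_args "$demo/disk*")

echo "unquoted: $unquoted, quoted: $quoted"
rm -r "$demo"
```

Only the quoted form hands the program the pattern itself, which is what mergerfs needs in order to do its own globbing.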
4. moveonenospc Only Affects Writes, NOT Creates
moveonenospc=true catches ENOSPC errors during write operations on already-open files. It does NOT affect the create policy. If the create policy selects a full branch to create a new file, the create fails with ENOSPC and moveonenospc does not intervene.
How it bites you: you set moveonenospc=true and assume you are protected from full drives. Then a new file creation fails because minfreespace was set too low and the create policy selected a nearly-full branch.
Fix: set minfreespace high enough to cover your largest expected file plus overhead. For media servers, minfreespace=250G is common. moveonenospc is a safety net for writes that happen to fill a drive mid-operation, not a replacement for proper space management via minfreespace.
5. rename-exdev Symlink Mode Breaking Applications
Setting rename-exdev=rel-symlink or rename-exdev=abs-symlink makes cross-branch renames succeed by creating symlinks instead of returning EXDEV errors. However, some applications check the file type after renaming and reject symlinks.
Applications known to have issues: some media managers, backup tools, and archival software that call lstat() after rename and expect a regular file.
Symptoms: the rename appears to succeed (no error), but the application reports a corrupt or missing file because it encounters a symlink where it expected a regular file.
Fix: either keep rename-exdev=passthrough (the default) and structure your directory layout so related files land on the same branch (use epmfs or eppfrd policies), or test your specific applications with the symlink mode before deploying to production.
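When testing an application against symlink mode, it helps to check what a rename actually left behind. The following is a minimal sketch; both function names are hypothetical helpers, not mergerfs tooling.

```shell
#!/bin/sh
# is_regular_no_follow PATH: true only if PATH itself is a regular file.
# [ -f ] follows symlinks (like stat), so the extra [ ! -L ] test is what
# makes this behave like an lstat()-based check.
is_regular_no_follow() {
    [ -f "$1" ] && [ ! -L "$1" ]
}

# list_symlinks DIR: print every symlink below DIR, for example those
# created by cross-branch renames in symlink mode.
list_symlinks() {
    find "$1" -type l
}
```

Running list_symlinks on the directory an application just reorganized shows which renames were turned into symlinks, and is_regular_no_follow mirrors the check that trips up lstat()-based applications.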
6. Forgetting dropcacheonclose with Media Servers
Without dropcacheonclose=true, streaming a 50 GB movie file through Plex fills the Linux page cache with data that will never be read again. This evicts actually-useful cached data (filesystem metadata, application data) and can cause the system to feel sluggish.
Symptoms: after extended streaming, the system becomes slow for other tasks. free -h shows nearly all memory under "buff/cache", filled with streamed data rather than the metadata and application data you actually reuse.
Fix: always set dropcacheonclose=true for media server workloads. When a file is closed, mergerfs calls posix_fadvise() with POSIX_FADV_DONTNEED, telling the kernel it can reclaim that file's page cache.
Additionally, set cache.files=off to prevent FUSE-level caching of file data.
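To confirm that a running pool actually has these settings, the options can be read back through the runtime xattr interface on the pool's .mergerfs control file. This is a minimal sketch: mergerfs_option is a hypothetical wrapper, and it assumes getfattr from the attr package is installed.

```shell
#!/bin/sh
# Hypothetical wrapper: read one runtime option from a mounted pool.
# Usage: mergerfs_option POOL KEY
# Assumes getfattr (attr package) and the pool's .mergerfs control file.
mergerfs_option() {
    pool=$1
    key=$2
    # --only-values prints the raw value without the attribute name.
    getfattr --only-values -n "user.mergerfs.$key" "$pool/.mergerfs"
}
```

For a pool at /storage, mergerfs_option /storage dropcacheonclose and mergerfs_option /storage cache.files report the values in effect for the current mount.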
7. minfreespace Set Too Low
The default minfreespace=4G is dangerously low for most setups. With large media files (20-50 GB for 4K remuxes), a branch can fill completely between the threshold check and the write completion.
How it bites you: with minfreespace=4G, a 40 GB file starts writing to a branch with 5 GB free. The create policy accepted the branch (5G > 4G threshold), but the write fills the drive. If moveonenospc is not enabled, the write fails.
Fix: set minfreespace to at least 2x your largest expected file. For media servers: minfreespace=100G to minfreespace=250G. For general NAS with smaller files: minfreespace=20G to minfreespace=50G.
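The gap between "passes the threshold" and "actually fits" can be checked per branch before a large copy. This is a minimal sketch: avail_kib and branch_has_room are hypothetical helpers, sizes are in KiB, and free space is read from POSIX-format df output rather than from mergerfs itself.

```shell
#!/bin/sh
# avail_kib DIR: free space available on DIR's filesystem, in KiB,
# taken from the fourth column of POSIX-format df output.
avail_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# branch_has_room DIR FILE_KIB RESERVE_KIB
# True if DIR can hold a file of FILE_KIB and still keep RESERVE_KIB free.
branch_has_room() {
    [ "$(avail_kib "$1")" -ge $(( $2 + $3 )) ]
}
```

For the scenario above, branch_has_room /mnt/disk1 41943040 4194304 asks whether a 40 GiB file fits while leaving 4 GiB free. A branch with 5 GiB free fails this check even though it passes a 4G minfreespace threshold.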
8. Not Running snapraid sync After Adding Files
When you add new files to the mergerfs pool, they are NOT protected by SnapRAID parity until the next snapraid sync runs. The window between file creation and sync is a vulnerability period.
How it bites you: you copy 2 TB of data to the pool, a drive fails before the nightly sync, and those files are unrecoverable.
Fix: run snapraid sync after large data ingestion operations. Automate with a cron job (daily minimum). For critical data, run sync immediately after the copy completes.
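One way to close the window for large imports is to tie the sync to the copy itself. The following is a minimal sketch, assuming snapraid is on the PATH and already configured; ingest_and_sync is a hypothetical wrapper and the paths in the usage note are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper: copy into the pool, then sync parity right away.
# Usage: ingest_and_sync SOURCE DEST
# snapraid sync runs only if the copy succeeded.
ingest_and_sync() {
    src=$1
    dest=$2
    cp -a -- "$src" "$dest" && snapraid sync
}
```

A call such as ingest_and_sync /tmp/incoming/photos /storage/photos copies the directory and then starts the sync. For the daily baseline, a crontab entry along the lines of 0 3 * * * /usr/bin/snapraid sync would do, where both the schedule and the binary path are assumptions to adapt.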
9. NFS Export Issues with fsid and crossmnt
Exporting a mergerfs mount over NFS requires explicit fsid settings because NFS identifies filesystems by device number, and mergerfs (being FUSE) has a synthetic device number that NFS may not handle correctly.
Symptoms: NFS clients see stale file handles, cannot access newly created files, or get "Permission denied" errors on files that clearly have correct permissions.
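Fix: in /etc/exports, set an explicit fsid on each exported mergerfs path, for example /storage 192.168.1.0/24(rw,sync,no_subtree_check,fsid=1), where the client network is a placeholder and the fsid value must be unique among your exports.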
Also consider:
- nfsopenhack=all in mergerfs options for file creation issues
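- symlinkify=true for better performance on read-heavy NFS exports (test from a client first: the reported symlinks point at server-side branch paths, which an NFS client may not be able to resolve)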
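- use_ino in mergerfs options for consistent inode numbers across NFS remounts (in recent mergerfs releases use_ino is deprecated and inode calculation is controlled by inodecalc, where a path-based mode such as inodecalc=path-hash keeps inode numbers stable)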
10. Confusing Pool View with Branch View for SnapRAID Operations
SnapRAID operates on individual branches (/mnt/disk1, /mnt/disk2), NOT on the mergerfs pool (/storage). Running snapraid fix or snapraid check with paths through the pool mount can produce incorrect results or fail.
How it bites you: you try to verify a specific drive's data through the pool path, but mergerfs routes the reads to a different branch than expected.
Fix: always interact with SnapRAID using branch paths. Reference /mnt/diskN directly, never /storage.
# RIGHT
snapraid fix -d d1 # fixes data on disk1's branch
snapraid check -d d2 # checks disk2's branch
# WRONG
snapraid fix /storage/... # confusing, may not target the right branch
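To go from a pool path to the branch that SnapRAID knows about, you can look for the file on each branch directly. This is a minimal sketch: branches_holding is a hypothetical helper, and the file name in the usage note is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print every branch that holds a pool-relative path.
# Usage: branches_holding REL_PATH BRANCH...
branches_holding() {
    rel=$1
    shift
    for branch in "$@"; do
        # -e follows symlinks; -L also catches dangling ones.
        if [ -e "$branch/$rel" ] || [ -L "$branch/$rel" ]; then
            echo "$branch"
        fi
    done
}
```

For example, branches_holding movies/film.mkv /mnt/disk1 /mnt/disk2 prints the branch or branches containing that file. The matching disk name from snapraid.conf is then what you pass to -d.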
11. Stale Branches After Drive Failure
If a drive fails or is unmounted while mergerfs is running, the branch remains in the pool configuration but returns I/O errors for any file on that branch. Other branches still work fine, but directory listings that should include files from the failed branch will show errors.
Symptoms: intermittent I/O errors when listing certain directories. Some files accessible, others not.
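Fix: immediately remove the failed branch from the pool through the runtime xattr interface (the user.mergerfs.branches key on the pool's .mergerfs control file).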
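A minimal sketch of that removal, assuming setfattr from the attr package is installed and a mergerfs release that exposes user.mergerfs.branches (older releases used a different key); remove_branch is a hypothetical wrapper.

```shell
#!/bin/sh
# Hypothetical wrapper around mergerfs's runtime xattr interface.
# Usage: remove_branch POOL BRANCH
# A leading "-" in the value asks mergerfs to drop that branch from the list.
remove_branch() {
    pool=$1
    branch=$2
    setfattr -n user.mergerfs.branches -v "-$branch" "$pool/.mergerfs"
}
```

With a pool at /storage and a failed branch at /mnt/disk3 (placeholder names), the call is remove_branch /storage /mnt/disk3. Writing the same key with a leading "+" instead of "-" appends the branch again.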
This is safe -- it only removes the branch from the routing table. No data is deleted. Re-add after the drive is replaced and repaired.
12. Running snapraid sync After a Drive Loss
If a drive is lost and you run snapraid sync before snapraid fix, SnapRAID recalculates parity without the data on the lost drive. The parity information needed to recover the lost data is permanently overwritten.
Fix: when a drive fails, the order is:
1. snapraid fix -d dN (recover data to replacement drive)
2. Verify recovered data
3. snapraid sync (recalculate parity with all drives present)
Never sync before fix. This is the SnapRAID equivalent of rm -rf /.
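The order above can be enforced in a small wrapper so that sync is unreachable unless fix and verification both succeed. This is a minimal sketch, assuming snapraid is on the PATH; recover_disk is a hypothetical function, and using check with -a (audit-only hash verification) as the verification step is one reasonable choice rather than the only one.

```shell
#!/bin/sh
# Hypothetical wrapper that enforces fix -> verify -> sync.
# Usage: recover_disk DISK_NAME   (the name from snapraid.conf, e.g. d1)
recover_disk() {
    disk=$1
    # Step 1: rebuild the lost data onto the replacement drive.
    snapraid fix -d "$disk" || {
        echo "fix failed; not syncing" >&2
        return 1
    }
    # Step 2: verify recovered files by hash (-a skips the parity check).
    snapraid check -a -d "$disk" || {
        echo "verification failed; not syncing" >&2
        return 1
    }
    # Step 3: only now recalculate parity.
    snapraid sync
}
```

If either of the first two steps fails, the function returns before sync can touch the parity you still need.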