Mergerfs

35 cards: 🟢 12 easy | 🟡 16 medium | 🔴 7 hard

🟢 Easy (12)

1. What is mergerfs?

Show answer A FUSE-based union filesystem that logically combines multiple filesystem paths into a single mount point. Created by Antonio SJ Musumeci (trapexit), licensed under ISC.

Key distinction: mergerfs is NOT RAID -- it provides no redundancy, parity, or striping. It is a thin routing layer that proxies filesystem operations to underlying branches.

Name origin: "merger filesystem" -- it merges directory trees from multiple drives into one unified view.

2. What is the difference between a branch and a pool in mergerfs?

Show answer A **branch** is a single filesystem path that participates in the mergerfs union (e.g., /mnt/disk1). The **pool** is the mergerfs mount point -- the unified view of all branches (e.g., /storage).

Branches are specified as a colon-delimited list: /mnt/disk1:/mnt/disk2:/mnt/disk3

Globs are supported in fstab: /mnt/disk* merges all matching paths.

3. What are the three branch modes in mergerfs (RW, RO, NC)?

Show answer **RW** (Read-Write): default, eligible for all policy categories.
**RO** (Read-Only): excluded from create and action policies, only participates in search.
**NC** (No-Create): excluded from create policies only, still allows modifications and deletions.

Syntax: /mnt/disk1=RW:/mnt/disk2=NC:/mnt/disk3=RO

Use NC to gracefully drain a drive -- it stops receiving new files but existing files can still be modified or deleted.
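
The mode rules above can be modelled in a few lines. This is only an illustrative sketch, not mergerfs code: the `ELIGIBLE` table and both helper functions are hypothetical names that restate the three modes as data.

```python
# Illustrative model of branch modes; names here are made up for the example.
ELIGIBLE = {
    "RW": {"create", "action", "search"},
    "NC": {"action", "search"},   # no new files, but edits and deletes allowed
    "RO": {"search"},             # lookups only
}

def parse_branches(spec):
    """Split '/a=RW:/b=NC' into (path, mode) pairs; a missing mode means RW."""
    branches = []
    for item in spec.split(":"):
        path, _, mode = item.partition("=")
        branches.append((path, mode or "RW"))
    return branches

def eligible(branches, category):
    """Return the branch paths that may take part in a policy category."""
    return [path for path, mode in branches if category in ELIGIBLE[mode]]

branches = parse_branches("/mnt/disk1=RW:/mnt/disk2=NC:/mnt/disk3=RO")
for category in ("create", "action", "search"):
    print(category, eligible(branches, category))
```

Switching a branch from RW to NC in this model removes it from the create list only, which is the draining behavior described above.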

4. What are the three policy categories in mergerfs?

Show answer **Create**: controls where new files/directories are placed (create, mkdir, mknod, symlink)
**Action**: controls which branches are affected by modifications (chmod, chown, rename, unlink, etc.)
**Search**: controls how files are located (open, getattr, access, readlink, etc.)

Mnemonic: **CAS** -- Create, Action, Search -- like CAS latency in RAM, every access has a lookup cost.

5. What is the default action policy and why?

Show answer **epall** (Existing Path, All) is the default action policy. It applies the operation to ALL branches where the target path exists.

Why: when you chmod or rename a file, all instances across branches should be affected. If a file exists on multiple branches (e.g., a directory), epall ensures the operation is consistent everywhere.

6. What is the default search policy (ff) and how does it work?

Show answer **ff** = First Found. Returns the first matching file in branch mount order.

Fast and simple -- stops searching as soon as a match is found. Branch order matters: branches listed first are checked first.

This is why listing an SSD branch first with ff as the search policy gives you fast lookups for frequently accessed files.
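
To contrast ff (search) with epall (action), here is a small simulation that uses temporary directories as stand-in branches. The function names are hypothetical and the logic is a simplification of what the policies are described as doing, not the real implementation.

```python
import os
import tempfile

def search_ff(branches, relpath):
    """First found: return the first branch, in listed order, that has relpath."""
    for branch in branches:
        if os.path.lexists(os.path.join(branch, relpath)):
            return branch
    return None

def action_epall(branches, relpath):
    """Existing path, all: return every branch that has relpath."""
    return [b for b in branches if os.path.lexists(os.path.join(b, relpath))]

with tempfile.TemporaryDirectory() as root:
    # Branch order matters for ff, so the "ssd" stand-in is listed first.
    branches = [os.path.join(root, name) for name in ("ssd", "disk1", "disk2")]
    for branch in branches:
        os.makedirs(os.path.join(branch, "tv"))
    # The same file exists on disk1 and disk2 but not on the ssd branch.
    for branch in branches[1:]:
        open(os.path.join(branch, "tv", "ep1.mkv"), "w").close()

    found = os.path.basename(search_ff(branches, "tv/ep1.mkv"))
    touched = [os.path.basename(b) for b in action_epall(branches, "tv")]
    print(found, touched)
```

A search for the file stops at the first branch that has it, while an action on the shared `tv` directory targets all three branches.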

7. Write a basic mergerfs fstab line for a media server with four drives.

Show answer /mnt/disk* /storage fuse.mergerfs defaults,allow_other,use_ino,cache.files=off,moveonenospc=true,dropcacheonclose=true,minfreespace=250G,category.create=epmfs,fsname=mergerfs 0 0

Key options explained:
- allow_other: non-root access (Docker containers)
- use_ino: consistent inodes for NFS/Samba
- cache.files=off: disables page caching of file data (avoids stale or double-cached data)
- dropcacheonclose=true: prevents page cache pollution
- epmfs: path-preserving create policy

8. What does cache.files control and what are its modes?

Show answer Controls page caching for file data in FUSE:

- **off** (default): no caching, safest for multi-writer
- **partial**: cached while file handle is open
- **full**: cached across opens
- **auto-full**: cached if mtime/size unchanged between opens
- **per-process**: selective caching by process name

For media servers, use off. For database-like workloads, auto-full + cache.writeback=true can help.

9. How do you mount mergerfs storage into Docker containers?

Show answer Bind-mount the mergerfs pool path into containers:

volumes:
- /storage/media:/data/media
- /storage/downloads:/data/downloads

Key practices:
- Use PUID=1000 and PGID=1000 on all containers for consistent ownership
- Put application configs on SSD, not mergerfs (e.g., Plex DB on NVMe)
- Point containers at the pool mount (/storage), not individual branches

10. What does mergerfs NOT provide?

Show answer mergerfs does NOT provide:
- RAID (no striping, mirroring, or parity)
- Redundancy (losing a drive loses its files)
- Checksums or data integrity verification
- Snapshots
- Automatic rebalancing of existing files
- File splitting across branches
- Copy-on-write semantics like OverlayFS

For redundancy, pair with SnapRAID. For checksums, use btrfs/ZFS on the branches or SnapRAID scrub.

11. How do you check which physical branch a file resides on?

Show answer getfattr -n user.mergerfs.fullpath /storage/path/to/file

This returns the actual path on the underlying branch, e.g., /mnt/disk2/path/to/file.

Useful for debugging policy behavior, verifying file placement after migration, and planning drive removal (find all files on a specific branch).
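
The same lookup can be done from a script. The sketch below assumes Linux, where Python exposes `os.getxattr`; the helper name `branch_path` and the injectable `getxattr` parameter are illustrative choices so the logic can be exercised without a mergerfs mount.

```python
import os

def branch_path(pool_path, getxattr=None):
    """Return the underlying branch path that mergerfs reports for pool_path."""
    getxattr = getxattr or os.getxattr   # os.getxattr is Linux-only
    raw = getxattr(pool_path, "user.mergerfs.fullpath")
    # Strip a trailing NUL defensively; harmless if none is present.
    return raw.decode().rstrip("\x00")

def is_on_branch(pool_path, branch, getxattr=None):
    """True if the file behind pool_path lives under the given branch."""
    full = branch_path(pool_path, getxattr)
    return full == branch or full.startswith(branch.rstrip("/") + "/")
```

On a real pool, calling `branch_path("/storage/path/to/file")` would be expected to return something like `/mnt/disk2/path/to/file`.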

12. What license is mergerfs released under?

Show answer The ISC license -- one of the most permissive open-source licenses. It is functionally equivalent to the simplified (2-clause) BSD and MIT licenses and permits personal and commercial use with minimal restrictions.

ISC = Internet Systems Consortium, the same organization behind BIND (DNS server).

The creator, Antonio SJ Musumeci (trapexit), chose ISC for maximum adoption with minimum legal overhead.

🟡 Medium (16)

1. What is the default create policy (pfrd) and how does it work?

Show answer **pfrd** = Percentage Free Random Distribution. Selects a branch at random, with each branch's probability proportional to its available space relative to the total. A branch with 2 TB free is twice as likely to be selected as one with 1 TB free.

This naturally balances data across drives without strict round-robin. Drives with more free space receive more files, approaching even fill over time.

Default for: category.create
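
A minimal sketch of the weighted random idea, assuming free space is the only input. The function `pick_pfrd` and the sample numbers are hypothetical; the real policy also honors branch modes and minfreespace.

```python
import random

def pick_pfrd(free_space, rng=random):
    """Pick a branch at random, weighted by its available space."""
    names = list(free_space)
    weights = [free_space[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Made-up free space in GB: disk1 has twice as much room as disk2.
free = {"/mnt/disk1": 2000, "/mnt/disk2": 1000}
rng = random.Random(0)           # seeded so repeated runs behave the same
counts = {name: 0 for name in free}
for _ in range(30_000):
    counts[pick_pfrd(free, rng)] += 1

ratio = counts["/mnt/disk1"] / counts["/mnt/disk2"]
print(counts, round(ratio, 2))
```

Over many creates the ratio of selections should settle near 2, matching the 2 TB versus 1 TB example.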

2. What does the epmfs policy do and when should you use it?

Show answer **epmfs** = Existing Path, Most Free Space. Among branches where the parent directory already exists, picks the one with the most free space.

Use case: media servers. When Sonarr adds a new episode to /storage/tv/ShowName/Season02/, epmfs ensures it lands on the same drive that already has the Season02 directory. This keeps series data colocated for better read performance.

The 'ep' prefix means "only consider branches with an existing path."

3. What do the ep and msp policy prefixes mean?

Show answer **ep** (Existing Path): only considers branches where the target directory already exists. There is no fallback: if no eligible branch has the path, the operation fails rather than landing on another branch.

**msp** (Most Specific Path): like ep* but if the exact path does not exist, retries with the parent directory, then grandparent, etc.

Examples:
- epmfs: among existing paths, most free space
- mspmfs: like epmfs, but walks up directory tree if needed
- eppfrd: among existing paths, weighted random by free space
- msppfrd: like eppfrd, walks up if needed
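
The difference between the two prefixes can be sketched with a toy model in which each branch is a dict listing its free space and the directories it holds. Everything here (the dict layout, `pick_epmfs`, `pick_mspmfs`) is a hypothetical simplification.

```python
import posixpath

def pick_epmfs(branches, directory):
    """Among branches that already have `directory`, pick the most free space."""
    candidates = [b for b in branches if directory in b["dirs"]]
    if not candidates:
        return None                      # ep: no fallback
    return max(candidates, key=lambda b: b["free"])["path"]

def pick_mspmfs(branches, directory):
    """Like epmfs, but retry with the parent directory until something matches."""
    while True:
        choice = pick_epmfs(branches, directory)
        if choice is not None or directory == "/":
            return choice
        directory = posixpath.dirname(directory)

branches = [
    {"path": "/mnt/disk1", "free": 500, "dirs": {"/", "/tv", "/tv/Show"}},
    {"path": "/mnt/disk2", "free": 900, "dirs": {"/", "/tv"}},
]
print(pick_epmfs(branches, "/tv/Show/Season02"))    # no branch has Season02 yet
print(pick_mspmfs(branches, "/tv/Show/Season02"))   # walks up to /tv/Show
print(pick_mspmfs(branches, "/movies/New"))         # walks up to /
```

In this model the new season stays with the existing show under msp, while a completely new top-level tree falls through to plain most-free-space.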

4. What does minfreespace do and what is a good value?

Show answer Branches with less than minfreespace available are excluded from **create** policies. Default is 4G, which is dangerously low for media setups.

Does NOT affect action or search policies. Does NOT prevent writes to already-open files on full branches (that is moveonenospc's job).

Recommended values:
- Media server: 100G-250G
- General NAS: 20G-50G
- Set to at least 2x your largest expected file size.
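
A rough sketch of the create-time filter and the "2x largest file" rule of thumb. The helper names are hypothetical, and treating G as GiB (1024-based) is an assumption of this example.

```python
GIB = 1024 ** 3   # assumption: "G" is treated as 1024-based here

def create_candidates(free_bytes, minfreespace):
    """Branches that stay eligible for create policies at this threshold."""
    return [name for name, free in free_bytes.items() if free >= minfreespace]

def suggested_minfreespace(largest_file_bytes):
    """Rule of thumb from the card: at least twice the largest expected file."""
    return 2 * largest_file_bytes

free = {
    "/mnt/disk1": 3 * GIB,
    "/mnt/disk2": 120 * GIB,
    "/mnt/disk3": 300 * GIB,
}
print(create_candidates(free, 4 * GIB))      # default threshold
print(create_candidates(free, 250 * GIB))    # media-server threshold
print(suggested_minfreespace(60 * GIB) // GIB, "GiB")
```

With the default threshold a nearly full drive is still excluded, but a drive with 120 GiB free remains a candidate even though a single large remux could fill it.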

5. What does moveonenospc do and what is its critical limitation?

Show answer When a write fails with ENOSPC (drive full), mergerfs moves the file to another branch and retries the write.

Critical limitation: moveonenospc only affects **writes**, NOT creates. If the create policy selects a full branch for a new file, the create fails with ENOSPC and moveonenospc does NOT intervene.

This is why minfreespace must be set high enough -- it is the create-time protection. moveonenospc is only the write-time safety net.

6. What does dropcacheonclose do and why is it important for media servers?

Show answer When enabled, mergerfs calls posix_fadvise() with POSIX_FADV_DONTNEED when a file is closed, telling the kernel it may drop that file's page cache.

Without it, streaming a 50 GB movie through Plex fills the page cache with data that will never be read again, evicting useful cached data and causing system sluggishness.

Always enable for media server workloads: dropcacheonclose=true
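
The underlying system call can be tried directly from Python on Linux. This sketch is not how mergerfs is implemented; it only shows the same advice being given by an ordinary program after a streaming read. The function name and the temporary file are illustrative.

```python
import os
import tempfile

def read_then_drop_cache(path, chunk=1 << 20):
    """Stream a file, then advise the kernel that its cached pages are not needed."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            total += len(data)
        if hasattr(os, "posix_fadvise"):   # not available on every platform
            # Offset 0 with length 0 means "the whole file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (3 * 1024 * 1024 + 5))
size_read = read_then_drop_cache(tmp.name)
os.unlink(tmp.name)
print(size_read)
```

The advice is only a hint, so the kernel is free to keep some pages; the point is that a one-off sequential read no longer competes with data that is worth caching.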

7. How do you query and change mergerfs settings at runtime?

Show answer Use the .mergerfs pseudo-file at the mount point with xattr operations:

Query all: getfattr -d /storage/.mergerfs
Read one: getfattr -n user.mergerfs.category.create /storage/.mergerfs
Change: setfattr -n user.mergerfs.category.create -v mfs /storage/.mergerfs

Add branch: setfattr -n user.mergerfs.srcmounts -v '+>/mnt/disk5' /storage/.mergerfs
Remove branch: setfattr -n user.mergerfs.srcmounts -v '-/mnt/disk3' /storage/.mergerfs

Changes are NOT persisted -- update fstab to persist.
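
For scripting, the same xattr interface is reachable from Python on Linux through `os.getxattr` and `os.setxattr`. The wrapper names below are hypothetical, and the injectable callables exist only so the sketch can be exercised without a live mount.

```python
import os

PREFIX = "user.mergerfs."

def get_option(control_file, key, getxattr=None):
    """Read one runtime option, e.g. key='category.create'."""
    getxattr = getxattr or os.getxattr      # Linux-only
    return getxattr(control_file, PREFIX + key).decode()

def set_option(control_file, key, value, setxattr=None):
    """Change one runtime option; like setfattr, this does not touch fstab."""
    setxattr = setxattr or os.setxattr      # Linux-only
    setxattr(control_file, PREFIX + key, value.encode())

# Dry run against an in-memory stand-in for /storage/.mergerfs.
store = {PREFIX + "category.create": b"pfrd"}
fake_get = lambda path, name: store[name]
def fake_set(path, name, value):
    store[name] = value

set_option("/storage/.mergerfs", "category.create", "mfs", setxattr=fake_set)
print(get_option("/storage/.mergerfs", "category.create", getxattr=fake_get))
```

Against a real pool you would drop the fake callables and run with sufficient privileges.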

8. What is a recursive mergerfs mount and how do you avoid it?

Show answer Mounting mergerfs at a path that is inside one of its own branch paths. Example: branches at /mnt/* with pool at /mnt/storage -- the pool is caught by its own glob.

Symptoms: system hangs, ls never returns, I/O errors.

Fix: always mount the pool outside the branch namespace. Branches at /mnt/disk*, pool at /storage. If stuck, use umount -l for lazy unmount.
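
A quick pre-flight check can catch this before mounting. The sketch below is a hypothetical helper that tests whether the pool path, or any directory above it, matches one of the branch patterns; it uses `fnmatch`, whose `*` also matches `/`, which is deliberately conservative here.

```python
import fnmatch
import posixpath

def ancestors(path):
    """Yield path and each parent up to '/'."""
    path = posixpath.normpath(path)
    while True:
        yield path
        parent = posixpath.dirname(path)
        if parent == path:
            return
        path = parent

def pool_inside_branches(pool, branch_spec):
    """True if the pool mount point would fall inside one of its own branches."""
    patterns = [posixpath.normpath(p) for p in branch_spec.split(":")]
    return any(
        fnmatch.fnmatchcase(candidate, pattern)
        for candidate in ancestors(pool)
        for pattern in patterns
    )

print(pool_inside_branches("/mnt/storage", "/mnt/*"))      # caught by its own glob
print(pool_inside_branches("/storage", "/mnt/disk*"))      # safe layout
```

Running a check like this in a provisioning script is cheaper than recovering from a hung mount.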

9. When would you use lup (Least Used Percentage) as a create policy?

Show answer Use lup when you have drives of different sizes and want to keep them balanced by percentage rather than absolute free space.

Example: a 4 TB drive at 50% and a 14 TB drive at 50% -- lup treats them as equally full. mfs would always pick the 14 TB drive (7 TB free vs 2 TB free).

Best for: general NAS with mixed drive sizes where visual balance matters.
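
The arithmetic behind the two policies is easy to compare side by side. The drive figures below are made up, and the two functions are simplified stand-ins for mfs and lup.

```python
def pick_mfs(drives):
    """Most free space: largest absolute free capacity wins."""
    return max(drives, key=lambda d: d["size"] - d["used"])["name"]

def pick_lup(drives):
    """Least used percentage: lowest used/size ratio wins."""
    return min(drives, key=lambda d: d["used"] / d["size"])["name"]

# Sizes in TB: the small drive is 40% full, the large one 50% full.
drives = [
    {"name": "4TB", "size": 4.0, "used": 1.6},
    {"name": "14TB", "size": 14.0, "used": 7.0},
]
print(pick_mfs(drives))   # 14TB: 7.0 TB free beats 2.4 TB free
print(pick_lup(drives))   # 4TB: 40% used beats 50% used
```

mfs keeps choosing the large drive until its free space drops to that of the small one, whereas lup alternates so both fill at the same percentage.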

10. What NFS export settings are needed for mergerfs?

Show answer 1. Set explicit fsid per export: fsid=1 (NFS needs stable device IDs, FUSE provides synthetic ones)
2. Enable nfsopenhack=all in mergerfs options (fixes file creation issues)
3. Use crossmnt for subtree exports
4. Set use_ino in mergerfs for consistent inodes

Example: /storage 192.168.1.0/24(rw,fsid=1,no_subtree_check,crossmnt,all_squash,anonuid=1000,anongid=1000)

11. How do func. and category. overrides interact?

Show answer category.* sets the policy for all functions in that category:
category.create=mfs (affects create, mkdir, mknod, symlink)

func.* sets the policy for one specific function:
func.mkdir=all (only affects mkdir)

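Options are applied in the order listed, so a func.* override must come after the category.* setting it refines. With category.create=mfs,func.mkdir=all, directories are created on all branches while files go to the most-free-space branch. In the reverse order, category.create=mfs would reset mkdir back to mfs.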
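
A toy resolver makes the interaction concrete. It assumes options are applied left to right, as the mergerfs documentation notes for func.* and category.* settings; the function `resolve` and the `CREATE_FUNCS` tuple are hypothetical and only cover the create category.

```python
CREATE_FUNCS = ("create", "mkdir", "mknod", "symlink")

def resolve(options):
    """Apply 'key=value' options in order; return the policy per create function."""
    policies = {}
    for option in options.split(","):
        key, _, value = option.partition("=")
        if key == "category.create":
            for func in CREATE_FUNCS:       # category sets every member
                policies[func] = value
        elif key.startswith("func."):
            policies[key[len("func."):]] = value
    return policies

print(resolve("category.create=mfs,func.mkdir=all"))
print(resolve("func.mkdir=all,category.create=mfs"))
```

Under that assumption, listing the category first and the per-function override second is the ordering that gives the mixed result described in the card.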

12. What is the all policy and when is it useful?

Show answer The **all** policy applies the operation to every branch. For mkdir/mknod/symlink, it creates the directory on all branches. For create (file creation), it acts like ff.

Common use: func.mkdir=all ensures directory structures exist on all branches, so future epmfs/eplfs create operations always find an existing path.

The default action policy, epall, follows the same idea for modifications: they propagate to every branch that has the file.

13. Why does glob-based branch specification only work at mount time?

Show answer mergerfs evaluates glob patterns (/mnt/disk*) only when the filesystem is mounted. If you add /mnt/disk5 after mounting, it will NOT automatically appear in the pool.

To add new branches after mount, either:
1. Remount: umount /storage && mount /storage
2. Use xattr: setfattr -n user.mergerfs.srcmounts -v '+>/mnt/disk5' /storage/.mergerfs

The xattr method requires no downtime -- services keep running.

14. How does fuse_msg_size affect mergerfs performance?

Show answer fuse_msg_size controls the maximum FUSE message size in pages (4 KiB per page). Default is 256 (1 MiB max). Available on Linux 4.20+ via the max_pages feature.

Doubling the message size approximately halves the number of kernel-to-userspace round trips for large I/O. Since mergerfs defaults to the maximum (256), there is no reason to change it.

If you see it set lower than 256, increase it. The only cost is slightly higher memory usage for message buffers.
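
The page arithmetic can be checked quickly. This is a back-of-the-envelope sketch that ignores FUSE header overhead and assumes 4 KiB pages; the function names are illustrative.

```python
import math

PAGE_BYTES = 4096   # assumption: 4 KiB pages

def max_message_bytes(pages):
    """Largest payload for a given fuse_msg_size, in bytes."""
    return pages * PAGE_BYTES

def round_trips(io_bytes, pages):
    """Messages needed to move io_bytes at the given fuse_msg_size."""
    return math.ceil(io_bytes / max_message_bytes(pages))

one_gib = 1024 ** 3
for pages in (32, 128, 256):
    print(pages, max_message_bytes(pages), round_trips(one_gib, pages))
```

Reading 1 GiB takes 1024 messages at 256 pages and 2048 at 128 pages, which is the halving effect described above.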

15. What is the newest policy and when would you use it?

Show answer **newest** selects the branch with the largest mtime (modification time) on the target file or directory.

Use case: when files may exist on multiple branches (e.g., after a migration or manual copy) and you want operations to target the most recently modified version.

Not commonly used as a create policy. More useful as a search policy when you have overlapping content across branches and want the freshest version.

16. What is the difference between statfs=base and statfs=full?

Show answer **statfs=base** (default): uses all branches for df/statfs calculations. The pool always reports total capacity across all drives.

**statfs=full**: only includes branches where the queried path exists. If /storage/movies only exists on disk1 and disk2, df /storage/movies only reports those drives' capacity.

Use base for simple setups. Use full when you need accurate per-directory space reporting (rare).

🔴 Hard (7)

1. What is EXDEV and how does rename-exdev handle it?

Show answer EXDEV (error 18) = "Invalid cross-device link." Returned when rename() or link() crosses filesystem boundaries. Since each mergerfs branch is a separate filesystem, renaming between branches triggers EXDEV.

rename-exdev options:
- **passthrough** (default): returns EXDEV to the application
- **rel-symlink**: moves the file to a special directory on its branch and leaves a relative symlink at the new path
- **abs-symlink**: same as rel-symlink, but the symlink is absolute

Gotcha: some apps check file type after rename and reject symlinks, so test before enabling symlink modes.
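
With the default passthrough mode, the application is the one that has to cope with EXDEV. A common pattern is to try a rename and fall back to copy-and-delete; the sketch below shows that pattern with a hypothetical helper whose `rename` and `move` parameters are injectable for testing. Note that `shutil.move` already performs a similar fallback internally.

```python
import errno
import os
import shutil

def rename_or_move(src, dst, rename=os.rename, move=shutil.move):
    """Try an atomic rename; on EXDEV fall back to copy-and-delete."""
    try:
        rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise                      # some other failure: do not mask it
        move(src, dst)                 # not atomic, and costs a full copy
        return "moved"

# Simulate a cross-branch rename without needing two filesystems.
def fake_rename(src, dst):
    raise OSError(errno.EXDEV, "Invalid cross-device link")

moved = []
result = rename_or_move("a.mkv", "b.mkv", rename=fake_rename,
                        move=lambda s, d: moved.append((s, d)))
print(result, moved)
```

The fallback loses the atomicity of rename, which is one reason some applications prefer to fail instead.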

2. Why must parity drives NEVER be in the mergerfs pool?

Show answer mergerfs will write user data to any writable branch in the pool. If the parity drive is included, user files land on the parity disk, where SnapRAID does not protect them and where they compete with the parity file for space. The parity file also becomes visible in the pool, so it can be modified or deleted like any other file.

The fix is a naming convention: data at /mnt/disk1, /mnt/disk2, parity at /mnt/parity1. Glob only /mnt/disk* in mergerfs.

This is one of the most destructive mergerfs mistakes: parity protection can be lost without any warning.

3. Explain mergerfs threading: read-thread-count and process-thread-count.

Show answer read-thread-count (default 0): threads reading FUSE kernel messages. With process-thread-count=-1, creates one combined thread per CPU core (max 8).

process-thread-count (default -1): threads processing messages. -1 disables separate pool (processing on read threads). 0 creates one per CPU core.

process-thread-queue-depth (default 2): max queued requests per process thread.

For high concurrency (10+ streams): set both to 0 for separate read and process pools.

4. What is the correct order for drive replacement with SnapRAID + mergerfs?

Show answer 1. Remove failed branch from pool: setfattr -n user.mergerfs.srcmounts -v '-/mnt/diskN' /storage/.mergerfs
2. Physically replace drive
3. Format replacement: mkfs.ext4 -m 0 -T largefile4 /dev/sdX1
4. Mount replacement at same path
5. Restore data: snapraid fix -d dN
6. Re-add to pool: setfattr ...srcmounts -v '+>/mnt/diskN' ...
7. THEN sync: snapraid sync

CRITICAL: Never sync before fix. Syncing after loss recalculates parity without the missing data, destroying recovery ability.
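
To make the ordering hard to get wrong, the steps can be generated as a dry-run checklist rather than typed from memory. The helper below is hypothetical: it only builds strings from the commands listed above, runs nothing, and assumes the replacement is mounted with a plain `mount DEVICE PATH`.

```python
def replacement_plan(branch, snapraid_disk, device, pool="/storage"):
    """Return the replacement steps, in order, as printable commands."""
    control = f"{pool}/.mergerfs"
    return [
        f"setfattr -n user.mergerfs.srcmounts -v '-{branch}' {control}",
        "# physically replace the drive",
        f"mkfs.ext4 -m 0 -T largefile4 {device}",
        f"mount {device} {branch}",          # assumption: direct mount, same path
        f"snapraid fix -d {snapraid_disk}",
        f"setfattr -n user.mergerfs.srcmounts -v '+>{branch}' {control}",
        "snapraid sync",                      # only after the fix has completed
    ]

plan = replacement_plan("/mnt/disk2", "d2", "/dev/sdX1")
for number, command in enumerate(plan, start=1):
    print(number, command)
```

The device name stays a placeholder on purpose; substitute the real partition only after confirming it with a tool such as lsblk.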

5. What does cache.writeback do and when should you enable it?

Show answer When cache.files is enabled and cache.writeback=true, the kernel aggregates small writes into larger FUSE requests before sending them to mergerfs. This can dramatically improve throughput for apps that write many small chunks.

Without writeback, each write goes through FUSE individually (writethrough). With writeback, the kernel buffers writes and flushes them in larger batches.

Enable for: database workloads, build systems, apps with many small sequential writes.
Avoid for: media streaming (use cache.files=off instead).

6. What are the link-exdev modes and when does abs-base-symlink differ from abs-pool-symlink?

Show answer link-exdev handles EXDEV errors on hard link operations:
- **passthrough** (default): returns EXDEV to caller
- **rel-symlink**: relative symlink from new to old location
- **abs-base-symlink**: absolute symlink using the underlying branch path (/mnt/disk1/...)
- **abs-pool-symlink**: absolute symlink using the pool mount path (/storage/...)

abs-base-symlink bypasses mergerfs for the link target. abs-pool-symlink goes through mergerfs, which means search policies apply when following the link.

7. Why should you disable security_capability when using cache.files?

Show answer With cache.files enabled and security_capability=true (default), the kernel sends a getxattr for security.capability before EVERY SINGLE WRITE. This xattr check is not cached, so it adds a FUSE round trip per write.

Disabling with security_capability=false returns ENOATTR immediately without the round trip. This can significantly improve write performance when file caching is enabled.

Only relevant when cache.files is not off. If cache.files=off, this has no effect.