
mergerfs - Primer

Why This Matters

If you manage a Linux media server, NAS, or data hoarding setup with multiple drives, you face a fundamental problem: how do you present 4, 8, or 20 individual drives as a single usable storage pool without the rigidity and rebuild overhead of hardware RAID? mergerfs solves this by acting as a FUSE-based union filesystem that logically combines multiple filesystem paths into one mount point. Files physically live on individual drives with their native filesystems (ext4, XFS, whatever you want), but applications see a single unified directory tree.

mergerfs is the backbone of the "Perfect Media Server" pattern: mergerfs pools your drives, SnapRAID provides parity protection, and Docker containers serve media from the merged mount. Understanding mergerfs deeply -- especially its policies -- is essential for anyone building Linux storage infrastructure outside the enterprise RAID/ZFS paradigm.

Etymology and Background

  • Creator: Antonio SJ Musumeci, known by the handle trapexit on GitHub
  • License: ISC (extremely permissive -- two-clause, similar to MIT/BSD, allows personal and commercial use)
  • Name origin: mergerfs = "merger filesystem" -- it merges multiple filesystem branches into one
  • Repository: github.com/trapexit/mergerfs
  • Current release: v2.41.1 (November 2025)
  • Written in: C++ with a custom libfuse implementation
  • Not to be confused with: mhddfs (older, abandoned), unionfs-fuse (different approach), OverlayFS (kernel-level, read-write overlay on read-only base)

Core Concepts

1. FUSE Architecture

mergerfs runs in userspace via FUSE (Filesystem in Userspace). The kernel receives filesystem syscalls, routes them through the FUSE kernel module to the mergerfs daemon, which proxies them to the appropriate underlying filesystem.

Application
    |
    v
VFS (kernel)
    |
    v
FUSE kernel module
    |
    v
mergerfs daemon (userspace)
    |
    v
/mnt/disk1 (ext4)  /mnt/disk2 (xfs)  /mnt/disk3 (ext4)

Performance implication: every I/O operation involves a kernel-to-userspace context switch. For large sequential reads/writes the overhead is minimal (a few percent). For metadata-heavy workloads (millions of small files), the overhead is more noticeable. The fuse_msg_size option (default 256 pages = 1 MiB) controls how much data moves per FUSE message -- doubling the message size roughly halves the number of round trips.
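The round-trip claim can be checked with a little arithmetic. The sketch below is illustrative only: the helper name is made up, and it assumes 4 KiB pages, which is what makes 256 pages equal 1 MiB.

```shell
# Sketch: FUSE messages needed to move a payload, by ceiling division.
# Assumption: 4 KiB pages, matching "256 pages = 1 MiB" above.
fuse_round_trips() {
  # $1 = payload size in bytes, $2 = fuse_msg_size in pages
  msg_bytes=$(( $2 * 4096 ))
  echo $(( ($1 + msg_bytes - 1) / msg_bytes ))
}

one_gib=$(( 1024 * 1024 * 1024 ))
fuse_round_trips "$one_gib" 256   # 1024
fuse_round_trips "$one_gib" 128   # 2048
```

This counts data-carrying messages only; real traffic also includes metadata requests, so treat the numbers as a lower bound.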

2. Branches and the Pool

A branch is a filesystem path that participates in the mergerfs pool. The pool is the mergerfs mount point -- the unified view of all branches.

# Three branches forming a pool
/mnt/disk1:/mnt/disk2:/mnt/disk3  /storage  fuse.mergerfs  defaults  0  0

Branches are specified as a colon-delimited list. Globbing is supported:

/mnt/disk*  /storage  fuse.mergerfs  defaults  0  0

Critical gotcha: shell glob expansion will eat your glob patterns. In fstab this is safe (no shell expansion), but on the command line you must escape:

mergerfs /mnt/disk\*:/mnt/ssd /storage
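To see why the escape matters, it helps to count what the shell actually passes to a program. This is a small demonstration in a throwaway directory; the disk1..disk3 names and the count_args helper are illustrative only.

```shell
# Sketch: what the shell hands to a program for an unescaped vs. escaped glob.
demo=$(mktemp -d)
mkdir "$demo/disk1" "$demo/disk2" "$demo/disk3"

count_args() { echo "$#"; }

count_args "$demo"/disk*     # shell expands the glob: 3 separate arguments
count_args "$demo"/disk\*    # escaped: 1 argument, the literal pattern
count_args "$demo/disk*"     # quoting has the same effect: 1 argument

rm -r "$demo"
```

With the unescaped form, mergerfs would receive several positional arguments instead of one branch list, which is not what the command intends.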

3. Branch Modes: RW, RO, NC

Each branch operates in one of three modes:

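| Mode | Name | Create policies | Action policies | Search policies |
|------|------|-----------------|-----------------|-----------------|
| RW | Read-Write | Yes | Yes | Yes |
| RO | Read-Only | No | No | Yes |
| NC | No-Create | No | Yes | Yes |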

Syntax: append =MODE to the branch path.

/mnt/disk1=RW:/mnt/disk2=NC:/mnt/disk3=RO  /storage  fuse.mergerfs  defaults  0  0

When to use NC: you want an older/smaller drive to stop receiving new files but still allow modifications and deletions of existing files. This is the graceful way to drain a drive before removal.

When to use RO: the branch is a read-only mount (e.g., a snapshot or archive). mergerfs respects this and excludes it from both create and action operations.

Per-branch minfreespace: you can set a space threshold per branch that overrides the global minfreespace:

/mnt/disk1=RW,50G:/mnt/disk2=RW,100G  /storage  fuse.mergerfs  minfreespace=20G  0  0

Here disk1 reserves 50 GiB and disk2 reserves 100 GiB, regardless of the global 20 GiB setting.
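The branch syntax can be read as PATH[=MODE[,MINFREESPACE]]. The following is a minimal sketch of splitting one entry into those parts. The parse_branch helper is hypothetical, and the RW fallback for a bare path and the "global" placeholder are assumptions of the sketch, not output from mergerfs.

```shell
# Sketch: split one branch entry of the form PATH[=MODE[,MINFREESPACE]].
# Assumptions: a bare path is treated as RW; "global" means "no per-branch value".
parse_branch() {
  path=${1%%=*}
  case $1 in
    *=*) opts=${1#*=} ;;
    *)   opts=RW ;;
  esac
  mode=${opts%%,*}
  case $opts in
    *,*) minfree=${opts#*,} ;;
    *)   minfree=global ;;
  esac
  echo "$path $mode $minfree"
}

parse_branch /mnt/disk1=RW,50G   # /mnt/disk1 RW 50G
parse_branch /mnt/disk2=NC       # /mnt/disk2 NC global
parse_branch /mnt/disk3          # /mnt/disk3 RW global
```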

4. Functions, Categories, and Policies -- The Core of mergerfs

This is the most important section. mergerfs routes every filesystem operation through a policy that decides which branch(es) to act on. Understanding this is the difference between a working pool and a confusing mess.

Functions

Every filesystem syscall maps to a function. Functions fall into three policy categories (action, create, search), plus a fourth group to which policies do not apply:

Action (modify existing files/dirs): chmod, chown, link, removexattr, rename, rmdir, setxattr, truncate, unlink, utimens

Create (make new files/dirs): create, mkdir, mknod, symlink

Search (find/read existing files): access, getattr, getxattr, ioctl (directories), listxattr, open, readlink

N/A (operate on already-open file handles, cannot use branch-selection policies): fchmod, fchown, futimens, ftruncate, fallocate, fgetattr, fsync, ioctl (files), read, readdir, release, statfs, write, copy_file_range

Mnemonic for Categories -- "CAS"

Create, Action, Search -- the three policy categories. "CAS" like CAS latency in RAM -- every access has a lookup cost.

Policies

Every policy answers: "given these branches, which one(s) should this operation target?"

Core policies (used with any category):

| Policy | Mnemonic | Description |
|--------|----------|-------------|
| pfrd | Percentage Free Random Distribution | Random selection weighted by available space. More free space = higher probability. Default create policy. |
| mfs | Most Free Space | Pick the branch with the most available bytes |
| lfs | Least Free Space | Pick the branch with the least available bytes |
| lus | Least Used Space | Pick the branch with the least used bytes |
| lup | Least Used Percentage | Pick the branch with the lowest usage percentage |
| ff | First Found | Pick the first branch in mount order. Default search policy. |
| rand | Random | Equal-probability random selection |
| all | All | Apply to all branches (mkdir, mknod, and symlink create the entry on every branch; create acts like ff) |
| newest | Newest | Pick the branch with the largest mtime on the target file/dir |
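The selection rules in the table can be modelled in a few lines. This is a sketch over hypothetical path:free-bytes pairs that follows the descriptions above; it is not mergerfs's implementation, and the pick_* names are made up for illustration.

```shell
# Sketch of four core policies over hypothetical "path:free-bytes" pairs.
pick_ff()  { echo "${1%%:*}"; }    # first branch in list order
pick_mfs() { printf '%s\n' "$@" | sort -t: -k2,2nr | head -n 1 | cut -d: -f1; }
pick_lfs() { printf '%s\n' "$@" | sort -t: -k2,2n  | head -n 1 | cut -d: -f1; }

# pfrd: random pick, each branch weighted by its share of total free space.
pick_pfrd() {
  printf '%s\n' "$@" | awk -F: '
    BEGIN { srand() }
    { path[NR] = $1; free[NR] = $2; total += $2 }
    END {
      r = rand() * total
      for (i = 1; i <= NR; i++) { r -= free[i]; if (r < 0) { print path[i]; exit } }
      print path[NR]   # only reached when no branch has free space
    }'
}

branches="/mnt/disk1:500 /mnt/disk2:2000 /mnt/disk3:120"
pick_ff   $branches   # /mnt/disk1
pick_mfs  $branches   # /mnt/disk2
pick_lfs  $branches   # /mnt/disk3
pick_pfrd $branches   # varies; /mnt/disk2 is the most likely result
```

In this model pfrd picks disk2 about 76% of the time (2000 of 2620 free units), which is the "natural balancing" behaviour: fuller drives still receive some files, just fewer.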

Existing-path (ep*) variants -- only consider branches where the parent path already exists:

| Policy | Base | Description |
|--------|------|-------------|
| epmfs | mfs | Among branches with existing parent path, pick most free space |
| eplfs | lfs | Among existing paths, pick least free space |
| eplus | lus | Among existing paths, pick least used space |
| eppfrd | pfrd | Among existing paths, weighted random by free space |
| epff | ff | Among existing paths, pick first found |
| eprand | rand | Among existing paths, random selection |
| epall | all | Apply to all branches with existing path. Default action policy. |

Most-specific-path (msp*) variants -- like ep* but if the exact path does not exist, retries with the parent directory:

| Policy | Base | Description |
|--------|------|-------------|
| mspmfs | epmfs | Like epmfs but walks up to parent dir if needed |
| msplfs | eplfs | Like eplfs but walks up to parent dir if needed |
| msplus | eplus | Like eplus but walks up to parent dir if needed |
| msppfrd | eppfrd | Like eppfrd but walks up to parent dir if needed |

Mnemonic for Policy Prefixes

  • ep = "Existing Path" -- only branches that already have the directory
  • msp = "Most Specific Path" -- tries exact path first, falls back to parent
  • No prefix = considers all branches regardless of existing paths
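The difference between the prefixes is easiest to see against real directories. The sketch below builds three fake branches in a temp directory and lists which ones each prefix would consider. The helper names and the directory layout are illustrative, and the repeated walk-up in msp_candidates is this sketch's reading of "falls back to parent".

```shell
# Sketch: which branches qualify under ep* vs msp* for a given relative dir.
root=$(mktemp -d)
mkdir -p "$root/disk1/tv/show/s01" "$root/disk2/tv" "$root/disk3"

# ep*: keep only branches where the relative directory already exists.
ep_candidates() {   # $1 = relative dir, remaining args = branch roots
  rel=$1; shift
  for b in "$@"; do
    [ -d "$b/$rel" ] && echo "$b"
  done
  return 0
}

# msp*: try the directory, then walk up parents until some branch matches.
msp_candidates() {
  rel=$1; shift
  while :; do
    found=$(ep_candidates "$rel" "$@")
    if [ -n "$found" ]; then echo "$found"; return 0; fi
    case $rel in .|/) return 1 ;; esac
    rel=$(dirname "$rel")
  done
}

ep_candidates  tv/show/s01 "$root/disk1" "$root/disk2" "$root/disk3"  # disk1 only
ep_candidates  tv/newshow  "$root/disk1" "$root/disk2" "$root/disk3"  # prints nothing
msp_candidates tv/newshow  "$root/disk1" "$root/disk2" "$root/disk3"  # disk1 and disk2 (both have tv/)
rm -r "$root"
```

The base policy (mfs, lfs, pfrd, and so on) then chooses among the candidates that survive this filter.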

Default Policies

| Category | Default Policy | Why |
|----------|----------------|-----|
| Create | pfrd | Distributes new files probabilistically by free space -- natural balancing |
| Action | epall | Applies chmod/rename/delete to all branches where the file exists |
| Search | ff | Returns the first match in mount order -- fast for lookups |

Overriding Policies: func.* and category.*

You can override at two granularities:

Category-level (all functions in that category):

category.create=mfs
category.action=epall
category.search=ff

Function-level (one specific function):

func.mkdir=all
func.create=epmfs
func.rename=newest

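Options are evaluated in the order given, so list a func.* override after the category.* setting it refines. A category.* option that appears later resets every function in that category, including earlier func.* overrides.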

5. Decision Framework: Which Create Policy When?

| Use Case | Recommended Policy | Why |
|----------|--------------------|-----|
| Media server (Plex/Jellyfin) | epmfs or mfs | epmfs keeps a series or season on the branch that already holds its directory; mfs trades that locality for simpler space balancing |
| General NAS | pfrd (default) | Natural balancing across drives proportional to free space |
| Download staging (Sonarr/Radarr) | mfs | Spread writes to the drive with most room, avoid filling any single drive |
| Even fill | lup | Keeps percentage usage balanced across drives of different sizes |
| Path preservation (archives) | epmfs | New files in an existing directory go to the same branch |
| Performance (SSD tier) | ff with SSD branch listed first | Always writes to the fast drive first |

6. Key Configuration Options

Space Management

  • minfreespace=SIZE (default: 4G): branches with less than this free space are excluded from create policies. Does NOT affect action or search.
  • moveonenospc=BOOL|POLICY (default: false): when a write fails with ENOSPC on the current branch, mergerfs moves the file to another branch using the specified policy and retries the write. Only affects writes, NOT creates.
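The minfreespace rule is a filter that runs before the create policy chooses. It can be sketched as follows, using hypothetical path:free-bytes pairs. The helper name is made up, and the comparison follows the wording above (a branch is excluded only when it has less than the threshold).

```shell
# Sketch: drop branches below minfreespace before a create policy chooses.
eligible_for_create() {   # $1 = minfreespace in bytes, rest = path:free-bytes
  min=$1; shift
  for entry in "$@"; do
    free=${entry##*:}
    [ "$free" -ge "$min" ] && echo "${entry%%:*}"
  done
  return 0
}

gib=$(( 1024 * 1024 * 1024 ))
eligible_for_create $(( 4 * gib )) \
  "/mnt/disk1:$(( 3 * gib ))" \
  "/mnt/disk2:$(( 40 * gib ))" \
  "/mnt/disk3:$(( 4 * gib ))"
# prints /mnt/disk2 and /mnt/disk3
```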

Cache Behavior

  • dropcacheonclose=BOOL (default: false): calls posix_fadvise() with POSIX_FADV_DONTNEED when a file is closed, hinting that the kernel can drop its page cache for that file. Helpful for Plex/Jellyfin: without it, streaming large media files that are read once can crowd more useful data out of the page cache.
  • cache.files=off|partial|full|auto-full|per-process (default: off): controls page caching.
      • off: no caching (safest for multi-writer scenarios)
      • partial: cached while the file handle is open
      • full: cached across opens
      • auto-full: cached if mtime/size unchanged between opens
      • per-process: selective caching by process name
  • cache.entry=UINT (default: 1): seconds to cache file existence queries
  • cache.negative-entry=UINT (default: 1): seconds to cache "file does not exist" responses
  • cache.attr=UINT (default: 1): seconds to cache stat() results (permissions, size, timestamps)
  • cache.statfs=UINT (default: 0): seconds to cache df/statfs results. Used by policies that check free space. Set higher (e.g., 10-60) to reduce overhead if free space does not change rapidly.
  • cache.symlinks=BOOL (default: false): kernel caches readlink() results (Linux 4.20+)
  • cache.readdir=BOOL (default: false): kernel caches directory listings (Linux 4.20+)
  • cache.writeback=BOOL (default: false): aggregates small writes into larger FUSE requests. Can dramatically improve throughput for apps that write inefficiently. Requires cache.files to be enabled.

Cross-Device Operations

  • rename-exdev=passthrough|rel-symlink|abs-symlink (default: passthrough): when a rename crosses branches (EXDEV error), mergerfs can create a symlink instead. passthrough returns the error to the application. rel-symlink and abs-symlink create relative or absolute symlinks as workarounds.
  • link-exdev=passthrough|rel-symlink|abs-base-symlink|abs-pool-symlink (default: passthrough): same concept for hard link operations that cross branches.

What is EXDEV? An errno value (18 on Linux; POSIX defines the name, not the number) whose message is "Invalid cross-device link." It is returned when rename() or link() would have to cross a filesystem boundary. In mergerfs, each branch is a separate filesystem, so renaming a file from one branch to another triggers EXDEV.

Special Options

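  • symlinkify=BOOL (default: false): files that are not writable and whose mtime or ctime is older than symlinkify_timeout seconds are reported as symlinks to the underlying file instead of regular files. Readers that follow the symlink go straight to the branch and skip the FUSE layer. Useful for NFS exports and deduplication.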
  • symlinkify_timeout=UINT (default: 3600): seconds before symlinkify activates
  • nullrw=BOOL (default: false): turns reads/writes into no-ops. For benchmarking FUSE overhead only.
  • nfsopenhack=off|git|all (default: off): workaround for NFS export issues where creating files for write with read-only mode bits fails.
  • xattr=passthrough|noattr|nosys (default: passthrough): controls xattr forwarding. nosys tells the kernel to stop asking (cached forever) but disables runtime control via the .mergerfs pseudo-file.
  • statfs=base|full (default: base): base uses all branches for df calculations; full only includes branches where the queried path exists.
  • statfs_ignore=none|ro|nc (default: none): exclude read-only or no-create branches from statfs results.
  • security_capability=BOOL (default: true): when false, returns ENOATTR for security.capability xattr queries. Disabling prevents the kernel from sending a getxattr before every single write when caching is enabled.
  • posix_acl=BOOL (default: false): enable POSIX ACL support
  • async_read=BOOL (default: true): allow multiple concurrent read requests per file handle

7. Threading Model

mergerfs uses a configurable thread pool to handle FUSE messages:

  • read-thread-count (default: 0): threads that read messages from the FUSE kernel channel. 0 with process-thread-count=-1 creates one combined read/process thread per CPU core (max 8). Negative values scale as CPUCount / -N.
  • process-thread-count (default: -1): threads that process messages. -1 disables a separate pool (processing happens on read threads). 0 creates one thread per CPU core (max 8).
  • process-thread-queue-depth (default: 2): how many requests can queue per processing thread before new requests block. Max outstanding = queue-depth x process-thread-count.

Rule of thumb: the defaults work well for most setups. If you have many concurrent readers (10+ Plex streams), increase read-thread-count to match your core count.
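The sizing rules for read-thread-count can be written out as a small calculation. This is a sketch of the rules as stated above, with two assumptions of its own: negative settings are floored at one thread, and explicit positive counts are used as given. The function names are hypothetical.

```shell
# Sketch: threads implied by a read-thread-count setting on a given machine.
read_threads() {   # $1 = read-thread-count, $2 = CPU core count
  if [ "$1" -gt 0 ]; then
    n=$1                          # explicit count, used as given
  elif [ "$1" -eq 0 ]; then
    n=$2                          # one per core...
    [ "$n" -gt 8 ] && n=8         # ...capped at 8
  else
    n=$(( $2 / (0 - $1) ))        # CPUCount / -N
    [ "$n" -lt 1 ] && n=1         # assumption: never fewer than one thread
  fi
  echo "$n"
}

# Max outstanding requests = queue depth x processing threads.
max_outstanding() { echo $(( $1 * $2 )); }

read_threads 0 16      # 8
read_threads -2 16     # 8
read_threads 6 16      # 6
max_outstanding 2 8    # 16
```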

8. Runtime Configuration via .mergerfs Pseudo-File

mergerfs exposes a hidden pseudo-file at the mount point: /mountpoint/.mergerfs. It does not show up in directory listings but can be accessed directly. Through extended attributes on this file, you can query and modify nearly any mergerfs setting at runtime without unmounting.

Query all settings:

getfattr -d /storage/.mergerfs
# or, with the xattr utility:
xattr -l /storage/.mergerfs

Read a specific setting:

getfattr -n user.mergerfs.category.create /storage/.mergerfs
# Returns: user.mergerfs.category.create="pfrd"

Change a setting at runtime:

setfattr -n user.mergerfs.category.create -v mfs /storage/.mergerfs

Add branches at runtime (no unmount needed):

setfattr -n user.mergerfs.branches -v '+>/mnt/disk4:/mnt/disk5' /storage/.mergerfs

Remove branches at runtime:

setfattr -n user.mergerfs.branches -v '-/mnt/disk3' /storage/.mergerfs

Key point: runtime changes are NOT persisted. They revert on next mount. To persist, update /etc/fstab.

Even if xattr=noattr is set globally, xattr operations against .mergerfs still work. xattr=nosys is the exception: the kernel stops sending xattr requests altogether, so the pseudo-file can no longer be queried or changed.
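Because runtime changes are lost on remount, it can help to turn the current settings into an option string before editing /etc/fstab. The sketch below assumes each setting is printed as user.mergerfs.KEY="VALUE", as in the example output earlier in this section. The xattrs_to_opts helper is hypothetical and the sample input is made up.

```shell
# Sketch: convert `getfattr -d` style lines into a comma-separated option list.
# Assumption: settings appear as user.mergerfs.KEY="VALUE"; other lines are ignored.
xattrs_to_opts() {
  sed -n 's/^user\.mergerfs\.\([^=]*\)="\(.*\)"$/\1=\2/p' | paste -sd, -
}

# On a live pool (not run here):
#   getfattr -d /storage/.mergerfs | xattrs_to_opts
printf '%s\n' \
  '# file: storage/.mergerfs' \
  'user.mergerfs.category.create="mfs"' \
  'user.mergerfs.minfreespace="50G"' | xattrs_to_opts
# category.create=mfs,minfreespace=50G
```

Review the result before pasting it into fstab: some reported keys may be informational rather than mount options, and only the settings you changed need to be carried over.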

9. Interaction with NFS, Samba, and Docker

NFS exports: mergerfs works with NFS but requires care:

  • Set nfsopenhack=git or nfsopenhack=all if clients see errors creating files
  • Each export needs a unique fsid=N in /etc/exports
  • Use crossmnt for subtree exports
  • Consider symlinkify=true for better NFS performance on static content

Samba: works out of the box. Point Samba shares at the mergerfs mount point, not individual branches. Set use_ino in mergerfs options for consistent inode numbers.

Docker bind mounts: mount the mergerfs pool into containers:

volumes:
  - /storage/media:/data/media
  - /storage/downloads:/data/downloads

The container sees a unified view. Run containers with matching UID/GID (typically PUID=1000, PGID=1000).

10. What mergerfs is NOT

  • Not RAID: no striping, no mirroring, no parity. Zero fault tolerance on its own.
  • Not a redundancy layer: losing a drive means losing the files on that drive. Pair with SnapRAID or backups.
  • Not a volume manager: does not manage block devices. Works above the filesystem layer.
  • Not OverlayFS: no whiteouts, no copy-up semantics. All branches are peers.
  • Not a rebalancer: does not automatically move files between branches. Create policies only decide where new files land; existing files stay where they are.
  • Not ZFS/btrfs: no checksums, no snapshots, no scrubbing. It is a thin routing layer.

Quick Reference

# Mount with common options
# (one -o per line: indentation after a backslash-newline would split a single option string)
mergerfs /mnt/disk\*:/mnt/ssd /storage \
  -o defaults,allow_other,use_ino,cache.files=off \
  -o moveonenospc=true,dropcacheonclose=true \
  -o minfreespace=50G,category.create=mfs,fsname=mergerfs

# fstab equivalent
/mnt/disk*  /storage  fuse.mergerfs  defaults,allow_other,use_ino,cache.files=off,moveonenospc=true,dropcacheonclose=true,minfreespace=50G,category.create=mfs,fsname=mergerfs  0  0

# Check current settings
getfattr -d /storage/.mergerfs

# Change create policy at runtime
setfattr -n user.mergerfs.category.create -v epmfs /storage/.mergerfs

# Add a drive to the pool without unmounting
setfattr -n user.mergerfs.branches -v '+>/mnt/disk5' /storage/.mergerfs

# Remove a drive from the pool
setfattr -n user.mergerfs.branches -v '-/mnt/disk3' /storage/.mergerfs

# Check which branch a file lives on
getfattr -n user.mergerfs.fullpath /storage/path/to/file

# Check free space per branch
df -h /mnt/disk*

| Setting | Default |
|---------|---------|
| Create policy | pfrd |
| Action policy | epall |
| Search policy | ff |
| minfreespace | 4G |
| cache.entry | 1s |
| cache.attr | 1s |
| cache.negative-entry | 1s |
| cache.statfs | 0s |
| cache.files | off |
| moveonenospc | false |
| dropcacheonclose | false |
| fuse_msg_size | 256 (1 MiB) |
