find - Footguns & Pitfalls
Mistakes that cause data loss, security exposure, slow operations, or wrong results. Each one has bitten someone in production.
1. Forgetting to quote -name patterns (shell glob expansion)
You type something like find . -name *.log with the pattern unquoted. The shell expands *.log against files in your current directory before find ever sees it. If your cwd contains app.log and error.log, find actually receives -name app.log error.log. This either errors out ("paths must precede expression") or, when the glob matches a single file, silently searches for that one literal name instead of the pattern you intended.
Fix: Always quote glob patterns.
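A minimal sketch of the difference, using a throwaway directory (the filenames are illustrative):

```shell
# Scratch setup: one matching file in the cwd, two in the subtree we search
dir=$(mktemp -d)
mkdir "$dir/logs"
touch "$dir/app.log" "$dir/logs/app.log" "$dir/logs/debug.log"
cd "$dir"

# Unquoted: the shell expands *.log to app.log (the match in the cwd),
# so find searches for that literal name and misses debug.log
find logs -name *.log

# Quoted: find receives the pattern itself and matches both files
find logs -name "*.log"
```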
2. Using -exec without proper termination (\; or +)
-exec requires explicit termination. The {} placeholder is replaced with the filename, and the command boundary must be marked.
Fix: Terminate with \; (one command per file) or + (batched).
# Per-file execution
find /app -name "*.pyc" -exec rm {} \;
# Batched execution (faster)
find /app -name "*.pyc" -exec rm {} +
The semicolon must be escaped (\;) or quoted (';') because the shell interprets bare ; as a command separator.
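The behavioral difference is easy to see by counting invocations; a sketch in a throwaway directory, substituting echo for rm:

```shell
dir=$(mktemp -d)
touch "$dir/a.pyc" "$dir/b.pyc" "$dir/c.pyc"

# \; runs the command once per file: three separate echo invocations,
# hence three output lines
find "$dir" -name "*.pyc" -exec echo {} \; | wc -l    # 3

# + batches all filenames into one invocation: a single output line
find "$dir" -name "*.pyc" -exec echo {} + | wc -l     # 1
```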
3. Not using -print0 with filenames containing spaces or special characters
Pipe the output to xargs without -print0 (find /uploads -type f | xargs rm) and a file named my vacation photo.jpg gets split into three arguments: my, vacation, and photo.jpg. The rm deletes the wrong things (or fails).
A file named file$'\n'name.txt (contains a literal newline) causes even worse breakage — xargs interprets it as two separate filenames.
Fix: Always use -print0 with xargs -0.
find /uploads -type f -print0 | xargs -0 rm
# Or use -exec, which handles filenames correctly by design
find /uploads -type f -exec rm {} +
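A sketch of the breakage with a space in the filename (throwaway directory, echo standing in for rm):

```shell
dir=$(mktemp -d)
touch "$dir/my vacation photo.jpg"

# Newline-delimited output + default xargs whitespace splitting:
# the one filename becomes three bogus arguments
find "$dir" -type f | xargs -n1 echo | wc -l              # 3

# NUL-delimited output + xargs -0: the filename stays one argument
find "$dir" -type f -print0 | xargs -0 -n1 echo | wc -l   # 1
```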
4. Misunderstanding -prune (forgetting -print)
# Intent: find *.py files but skip .git directories
find /project -name ".git" -prune -o -name "*.py"
This prints .git in the output. When an expression contains no action, find wraps the whole thing as ( expression ) -print, so every path for which the expression evaluates true gets printed. Since -prune itself returns true, the pruned .git directory is printed along with the matches.
Fix: Add -print explicitly to the right side of -o.
The pattern is: <prune-condition> -prune -o <real-condition> -print.
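A sketch of both forms against a throwaway tree:

```shell
dir=$(mktemp -d)
mkdir "$dir/.git" "$dir/src"
touch "$dir/.git/hook.py" "$dir/src/app.py"

# Broken: no explicit action, so find wraps the whole expression in -print,
# and the pruned .git directory (where -prune returned true) is printed too
find "$dir" -name ".git" -prune -o -name "*.py"

# Fixed: -print attaches only to the right branch of -o
find "$dir" -name ".git" -prune -o -name "*.py" -print
```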
5. Wrong -delete ordering (deleting parent directories before children)
By default, find processes directories before their contents, and -delete on a non-empty directory fails. However, -delete implies -depth (process contents before the directory), so matching directories are deleted only after their matching contents are already gone. The real footgun is combining -delete with other predicates incorrectly:
# DANGER: this deletes ALL matching files AND directories, depth-first
find /data -name "*.tmp" -delete
# Fine for files, but if a directory happens to be named "foo.tmp", it gets deleted too
Fix: Be explicit about type, and always dry-run first.
# Dry run: see what would be deleted
find /data -name "*.tmp" -type f -print
# Then delete
find /data -name "*.tmp" -type f -delete
For directory cleanup, process in depth order:
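A sketch of depth-first directory cleanup in a throwaway tree (the cache naming is illustrative):

```shell
dir=$(mktemp -d)
mkdir -p "$dir/a/cache/inner"
touch "$dir/a/cache/inner/x.tmp" "$dir/a/keep.txt"

# -depth visits children before parents, so find never tries to descend
# into a directory it has already handed to rm; rm -rf handles the
# non-empty directories that -delete would reject
find "$dir" -depth -type d -name "cache" -exec rm -rf {} +

# keep.txt survives; the cache tree is gone
```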
6. -mtime vs -mmin confusion
-mtime -2 means "modified less than 2 days ago" (48 hours), not 2 hours.
The time argument is in 24-hour periods for -mtime and in minutes for -mmin.
Fix: Use -mmin for sub-day precision.
# Files changed in the last 2 hours
find /var/log -mmin -120
# Files changed in the last 2 days
find /var/log -mtime -2
Additional confusion: -mtime 2 does not mean "2 days ago." It means "between 48 and 72 hours ago" (the 2-day bracket). Use + and - for ranges:
- -mtime +7 — more than 7 days ago
- -mtime -7 — less than 7 days ago
- -mtime 7 — between 7 and 8 days ago (rarely what you want)
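The brackets are easy to verify with GNU touch -d, which accepts relative dates (a sketch; the filenames are illustrative):

```shell
dir=$(mktemp -d)
touch -d "1 hour ago"   "$dir/recent"
touch -d "60 hours ago" "$dir/twoish"   # 2.5 days old: inside the 48-72h bracket
touch -d "10 days ago"  "$dir/old"

find "$dir" -type f -mtime 2    # only twoish (fractional days are ignored)
find "$dir" -type f -mtime -2   # only recent (less than 48 hours)
find "$dir" -type f -mtime +7   # only old (more than 7 full days)
```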
7. Symlink following (-L) surprises
With -L, find follows symlinks as if they were real directories. A symlink pointing to / turns your /app search into a full-system scan. A symlink pointing to an NFS mount hangs your find on a network timeout. A circular symlink causes infinite recursion (though modern find detects this).
Fix: Use -L only when you specifically need to follow symlinks, and combine it with -xdev to prevent escaping the filesystem.
# Follow symlinks but stay on the same filesystem
find -L /app -xdev -name "*.conf"
# Default behavior (do not follow symlinks) is usually what you want
find /app -name "*.conf"
# To find the symlinks themselves
find /app -type l
8. Missing -maxdepth causing slowness on large trees
# You only need files in /var/log itself, but this recurses into every subdirectory
find /var/log -name "syslog"
On a server with years of rotated logs (thousands of files in nested date directories), this takes minutes instead of milliseconds.
Fix: Constrain depth when you know the target location.
# Search only the immediate directory
find /var/log -maxdepth 1 -name "syslog"
# Search two levels deep at most
find /opt -maxdepth 2 -name "config.yaml"
Performance note: Place -maxdepth before other predicates. GNU find warns if it comes after tests.
9. -perm mode syntax confusion
Three different syntaxes produce very different results:
# Exact match: permissions are exactly 0644
find /srv -type f -perm 0644
# At-least match: all these bits must be set (AND logic)
find /srv -type f -perm -0644
# Any-bit match: any of these bits are set (OR logic)
find /srv -type f -perm /0644
The common mistake:
# Intent: find files that are world-writable
find / -type f -perm 0002
# Reality: this only matches files with permissions exactly 0002 (------w-)
# Almost nothing matches this
Fix: Use -perm /mode for "any of these bits set" and -perm -mode for "all of these bits set."
# Files where the world-write bit is set (regardless of other bits)
find / -type f -perm /0002
# Files where owner, group, and other all have read permission
find / -type f -perm -0444
Also: the old +mode syntax (BSD) is not supported on GNU find. Use /mode instead.
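A quick sketch of exact vs. any-bit matching (GNU find, throwaway directory):

```shell
dir=$(mktemp -d)
touch "$dir/open" "$dir/normal"
chmod 0666 "$dir/open"     # world-writable, but not exactly 0002
chmod 0644 "$dir/normal"

find "$dir" -type f -perm 0002    # exact match: nothing
find "$dir" -type f -perm /0002   # any-bit match: finds open
```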
10. find . vs find / — running from the wrong starting point
Without a starting path, GNU find defaults to . (the current directory). If your cwd is /home/user, you search only /home/user. You conclude the file doesn't exist when it is sitting in /etc/nginx/.
On the other hand, running find / on a production server with millions of files, NFS mounts, and pseudo-filesystems (/proc, /sys) takes forever and produces thousands of permission-denied errors.
Fix: Always specify the starting path explicitly. Use -xdev on / to avoid crossing mount points.
# Search the whole root filesystem (not NFS, not /proc)
find / -xdev -name "nginx.conf" 2>/dev/null
# Search specific likely locations instead of /
find /etc /usr/local/etc /opt -name "nginx.conf" 2>/dev/null
11. -newer with wrong or missing reference file
# Intent: find files newer than yesterday's backup
find /data -newer /backup/last_backup_marker
# But the marker file doesn't exist
# find: '/backup/last_backup_marker': No such file or directory
# The entire find command fails, and your script silently does nothing
Worse: the marker file exists but has the wrong timestamp (your backup script never updated it). Now your "incremental" search matches nothing, and you think no files changed.
Fix: Validate the reference file before using it. Use touch -t to create precise reference timestamps.
# Create an explicit reference timestamp
touch -t 202603180000 /tmp/ref_yesterday
find /data -newer /tmp/ref_yesterday -type f
# Validate in a script
MARKER="/backup/last_backup_marker"
if [ ! -f "$MARKER" ]; then
echo "ERROR: marker file missing: $MARKER" >&2
exit 1
fi
find /data -newer "$MARKER" -type f
12. Using -delete on a find command with -o (OR) logic
Operator precedence makes -delete with -o a subtle trap:
# DANGER: without parentheses, -delete binds only to the second -name, not the whole OR
find /data -name "*.tmp" -o -name "*.bak" -delete
Without parentheses, operator precedence means -delete only applies to the -name "*.bak" branch. The *.tmp files are found but not deleted — they just get printed by the default action. You think the cleanup worked, but half the files remain.
Fix: Always use parentheses with -o, and always dry-run first.
# Correct: parentheses group the OR
find /data \( -name "*.tmp" -o -name "*.bak" \) -delete
# Always test first
find /data \( -name "*.tmp" -o -name "*.bak" \) -print
13. Forgetting -type f when using -size (matching directories too)
A size-only test such as find /data -size +100M also matches directories whose allocated size exceeds 100M (directories with many entries can have large directory files). You see directories in your output and try to gzip them, which fails.
Fix: Always include -type f when filtering by size.
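The effect is visible even at small sizes, since directories themselves satisfy size tests (a sketch; the 100k threshold is illustrative):

```shell
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/file"

# Without -type f, the directories match the size test too
find "$dir" -size -100k

# With -type f, only the regular file remains
find "$dir" -type f -size -100k
```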
14. Running find on /proc or /sys and wondering why it hangs or produces garbage
A whole-filesystem search like find / -name "*.conf" descends into /proc (pseudo-filesystem representing processes) and /sys (kernel tunables). Some files in /proc block on read, some files in /sys are enormous virtual files, and the traversal is slow and pointless.
Fix: Exclude virtual filesystems with -prune or use -xdev.
# Option 1: prune specific paths
find / -path /proc -prune -o -path /sys -prune -o -path /dev -prune -o \
-name "*.conf" -print
# Option 2: stay on the root filesystem
find / -xdev -name "*.conf"
15. Assuming -exec preserves your shell environment
# This does NOT work: the shell consumes the pipe before find ever sees it
find /data -name "*.csv" -exec head -1 {} | sort \;
-exec runs a single command, not a shell pipeline. The | is interpreted by your current shell, not passed to -exec.
Fix: Invoke a shell explicitly when you need pipelines or shell features.
# Use sh -c for shell features inside -exec
find /data -name "*.csv" -exec sh -c 'head -1 "$1"' _ {} \;
# Or pipe the entire find output (head -q, a GNU option, suppresses the
# "==> file <==" headers head prints when given multiple files)
find /data -name "*.csv" -exec head -q -n 1 {} + | sort
# For complex per-file logic, use a while loop
find /data -name "*.csv" -print0 | while IFS= read -r -d '' f; do
head -1 "$f" | tr ',' '\t'
done
The _ {} pattern in sh -c: the _ is $0 (script name placeholder), and {} becomes $1 (the actual filename).