xargs - Primer¶
Why This Matters¶
Every DevOps engineer eventually hits a wall where a simple for loop cannot scale. You have 10,000 files to process, 500 API endpoints to health-check, or 200 containers to stop. Shell loops are sequential and slow. Piping raw output into commands breaks on whitespace. This is where xargs becomes essential.
xargs reads items from standard input, splits them into arguments, and executes a command with those arguments. It handles batching, parallelism, and argument length limits that shell loops cannot. It is one of the oldest Unix tools (1977), runs everywhere, and remains the fastest path from "I have a list of things" to "I did something to all of them."
Name origin:
xargs stands for "eXtended ARGumentS." It was written by Herb Gellis and first appeared in PWB/UNIX 1.0 (Programmer's Workbench) at Bell Labs in July 1977. It solved a real problem: shell command lines had a maximum length, and xargs could split work across multiple invocations automatically.
If you work in production, you will use xargs to bulk-delete stale containers, mass-rename files during migrations, parallel-curl health endpoints, process log archives, and chain together find/grep/awk pipelines. Knowing xargs well saves hours of scripting.
Core Concepts¶
Basic Usage: Turning stdin into Arguments¶
At its simplest, xargs takes items from stdin and appends them as arguments to a command:
# Delete all .tmp files listed in a manifest
cat cleanup-list.txt | xargs rm
# Without xargs, equivalent to:
# rm file1.tmp file2.tmp file3.tmp ...
By default, xargs reads whitespace-delimited tokens from stdin and appends as many as will fit into a single command invocation. If the argument list exceeds ARG_MAX (the OS limit), xargs automatically splits across multiple invocations.
# Count lines across all Python files in a project
find /opt/app -name '*.py' | xargs wc -l
# This runs: wc -l file1.py file2.py file3.py ...
# If there are thousands of files, xargs splits into multiple wc calls
# and you see subtotals per batch
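The automatic splitting described above can be observed on a small scale. The sketch below uses -s (listed in the Quick Reference) to impose an artificially small command-line limit; the 1000-byte value is an assumption chosen only so that 2000 short arguments cannot fit into a single echo call.

```shell
# Feed 2000 numbers to echo while capping each command line at 1000 bytes.
# Every echo invocation prints one line, so the line count equals the
# number of invocations xargs needed.
batches=$(seq 1 2000 | xargs -s 1000 echo | wc -l)
echo "echo was invoked $batches times"

# All 2000 arguments are still delivered, just spread over several calls.
total=$(seq 1 2000 | xargs -s 1000 echo | wc -w)
echo "arguments delivered: $total"
```

The same thing happens with the real system limit; it is simply much larger, so the split only shows up on very long lists.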
-I {} : Replacement Strings for Precise Placement¶
The -I flag lets you control exactly where each argument is inserted. This is critical when the argument is not the last thing on the command line:
# Copy each config file to a backup location
find /etc/nginx/conf.d -name '*.conf' | xargs -I {} cp {} /backup/nginx/
# Rename .bak files back to their original names
ls *.bak | xargs -I {} bash -c 'mv "$1" "${1%.bak}"' _ {}
# Create a timestamped backup of each file
find /var/log -name '*.log' -mtime +30 | \
xargs -I {} bash -c 'gzip -c "$1" > "/backup/logs/$(basename "$1").$(date +%Y%m%d).gz"' _ {}
# Curl each endpoint from a list
cat endpoints.txt | xargs -I {} curl -sf -o /dev/null -w "%{http_code} {}\n" {}
When -I is used, xargs runs the command once per input line (it implies -L 1), and the whole line, including any spaces inside it, becomes the replacement text. This is slower than batching but gives you full placement control.
You can use any replacement string, not just {}. For example, xargs -I FILE cp FILE /backup/nginx/ behaves the same as the {} version above.
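A minimal sketch of a custom placeholder follows. The word ITEM and the sample input are arbitrary choices for illustration; any string that does not otherwise appear in the command would work.

```shell
# Use the word ITEM instead of {} as the placeholder.
# Each input line produces one echo invocation.
result=$(printf 'alpha\nbeta\n' | xargs -I ITEM echo "processing ITEM now")
printf '%s\n' "$result"
# processing alpha now
# processing beta now
```

A named placeholder can be easier to read when the command already contains braces, such as a curl -w format string.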
-0 : Null-Delimited Input for Safety¶
Filenames with spaces, quotes, or newlines break default xargs behavior. The -0 flag tells xargs to split on null bytes (\0) instead of whitespace:
# Safe deletion of files with spaces in names
find /data/uploads -name '*.tmp' -print0 | xargs -0 rm -f
# Safe processing of files from a null-delimited manifest
tr '\n' '\0' < file-list.txt | xargs -0 -I {} mv {} /archive/
# find -print0 + xargs -0 is the canonical safe pattern
find /var/log -name '*.gz' -mtime +90 -print0 | xargs -0 rm -f
Always pair find -print0 with xargs -0. This is non-negotiable in production scripts where filenames come from users or external systems.
Remember: "print-Zero, xargs-Zero": the two flags always travel together. If you use one without the other, the input is split in the wrong places, so commands receive mangled or merged filenames and may act on the wrong paths.
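The difference is easy to reproduce in a scratch directory. This sketch assumes mktemp -d is available and uses made-up filenames; it counts how many arguments each parsing mode produces before doing the safe deletion.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
touch "$workdir/report final.tmp" "$workdir/plain.tmp" "$workdir/keep.txt"

# Default parsing: the name with a space is split into two arguments.
args_default=$(find "$workdir" -name '*.tmp' | xargs -n 1 echo | wc -l)

# Null-delimited parsing: one argument per file.
args_null=$(find "$workdir" -name '*.tmp' -print0 | xargs -0 -n 1 echo | wc -l)
echo "default: $args_default arguments, null-delimited: $args_null arguments"

# Safe deletion: only keep.txt should remain afterwards.
find "$workdir" -name '*.tmp' -print0 | xargs -0 rm -f
remaining=$(ls "$workdir")
echo "remaining: $remaining"
```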
-P : Parallel Execution¶
The -P flag runs multiple processes concurrently. This is the single biggest performance win xargs offers:
# Compress log files using 8 parallel gzip processes
find /var/log/archive -name '*.log' -print0 | xargs -0 -P 8 -I {} gzip {}
# Parallel health checks against 200 endpoints
cat endpoints.txt | xargs -P 20 -I {} curl -sf -m 5 -o /dev/null -w "%{http_code} {}\n" {}
# Parallel image pulls (4 at a time)
cat images.txt | xargs -P 4 -I {} docker pull {}
# Parallel SSH commands across a fleet
cat servers.txt | xargs -P 10 -I {} ssh -o ConnectTimeout=5 {} 'df -h / | tail -1'
# Use nproc to match CPU count
find . -name '*.wav' -print0 | xargs -0 -P "$(nproc)" -I {} ffmpeg -i {} {}.mp3
-P 0 means "as many as possible" but use this cautiously — it can fork-bomb a system or overwhelm a remote API. Start with a reasonable number (4-20) and increase if you need more throughput.
Gotcha:
-P (parallel) combined with -I {} processes one item per invocation. This means -P 8 -I {} runs 8 concurrent processes, each handling one item. Without -I, -P 8 runs up to 8 concurrent processes, each handling a batch of items, which is much more efficient for commands that accept multiple arguments, like rm or gzip. Add -n to set the batch size; otherwise a short list may fit into a single batch and nothing runs in parallel.
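The cost difference can be made visible by counting invocations. In this sketch each sh call prints the number of arguments it received; the batch size of 25 and the 100-item input are assumptions picked so the numbers divide evenly.

```shell
# 100 items, 4 workers, batches of 25: four sh processes in total.
batched=$(seq 1 100 | xargs -P 4 -n 25 sh -c 'echo "$#"' _ | wc -l)

# Same items with -I {}: one sh process per item, so 100 in total.
per_item=$(seq 1 100 | xargs -P 4 -I {} sh -c 'echo "$#"' _ {} | wc -l)

echo "batched invocations:  $batched"
echo "per-item invocations: $per_item"
```

Both forms keep four workers busy, but the batched form forks 25 times fewer processes for the same work.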
-n : Batch Sizing¶
The -n flag limits how many arguments are passed per invocation. This is useful when a command only accepts a fixed number of arguments, or when you want to control batch size:
# Pass 2 arguments at a time (for commands that take pairs)
echo "src1 dst1 src2 dst2 src3 dst3" | xargs -n 2 mv
# Validate JSON files in batches of 100 (useful for tools with batch limits)
find /data -name '*.json' -print0 | xargs -0 -n 100 jq empty
# Delete in batches of 50 (keeps each DEL command small)
cat stale-keys.txt | xargs -n 50 redis-cli DEL
# Combine with -P for parallel batches
find . -name '*.test.js' | xargs -n 10 -P 4 npx jest
Without -n, xargs packs as many arguments as possible into each invocation (up to ARG_MAX). With -n, you control the tradeoff between fewer process forks (large batches) and more predictable behavior (small batches).
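A small sketch of the batching rule, using echo as a stand-in for a real command: when the number of tokens is not a multiple of -n, the final invocation simply receives whatever is left.

```shell
# Five tokens, two per invocation: the last call gets the leftover token.
pairs=$(echo "a b c d e" | xargs -n 2 echo)
printf '%s\n' "$pairs"
# a b
# c d
# e
```

For commands that require an exact argument count, such as the mv pairs above, this means an odd-length list will produce one malformed final call.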
-L : Line-Based Processing¶
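While -n counts arguments (whitespace-delimited tokens), -L counts input lines: each invocation receives every token found on N lines. Each line is still split on whitespace, so -L alone does not keep an item with embedded spaces together; for that, switch the delimiter with -d '\n' (GNU) or use -0:
# One du call per directory, even when names contain spaces (GNU -d)
cat directories-with-spaces.txt | xargs -d '\n' -n 1 du -sh
# Run a command for every 3 lines of input
cat batch-commands.txt | xargs -L 3 some-batch-tool
# -L 1 runs the command once per input line, as -I {} does, but appends the line's words as arguments
cat hosts.txt | xargs -L 1 ping -c 1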
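One detail worth checking by hand: -L groups input by line, but each line is still tokenized on whitespace. The sketch below compares -L 1 with a null-delimited hand-off on the same two-line input (the sample names are made up); each sh call reports how many arguments it received.

```shell
# Two input lines; the first one contains a space.
input=$(printf 'my documents\ndownloads\n')

# -L 1: one command per line, but the first line still splits into 2 arguments.
by_line=$(printf '%s\n' "$input" | xargs -L 1 sh -c 'echo "$# arg(s): $*"' _)

# Newlines converted to NULs plus -0: each whole line is exactly one argument.
by_item=$(printf '%s\n' "$input" | tr '\n' '\0' | xargs -0 -n 1 sh -c 'echo "$# arg(s): $*"' _)

printf '%s\n---\n%s\n' "$by_line" "$by_item"
```

The tr plus -0 form is used here instead of -d '\n' because -0 is also available on BSD and macOS xargs.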
Combining with find¶
The find | xargs pipeline is the most common xargs usage pattern:
# Find and delete core dumps older than 7 days
find /var/crash -name 'core.*' -mtime +7 -print0 | xargs -0 rm -f
# Find large files and show their sizes
find / -size +100M -print0 2>/dev/null | xargs -0 ls -lhS
# Find config files and search for a deprecated setting
find /etc -name '*.conf' -print0 | xargs -0 grep -l 'SSLv3'
# Find and chmod only files (not directories)
find /var/www -type f -print0 | xargs -0 chmod 644
# Find empty directories and remove them
find /tmp -type d -empty -print0 | xargs -0 rmdir
# Find recently modified files and create a tarball
# (tar reads the list itself: with xargs, a second batch would overwrite the archive)
find /opt/app -mmin -60 -type f -print0 | tar czf recent-changes.tar.gz --null -T -
Combining with grep¶
# Find files containing a pattern, then extract specific lines
grep -rl 'deprecated_function' /opt/app/src/ | xargs grep -n 'deprecated_function'
# Find files with TODO comments and count them per file
grep -rl 'TODO' src/ | xargs -I {} sh -c 'echo "$(grep -c TODO "$1") $1"' _ {} | sort -rn
# Search and replace across files (using sed)
grep -rl 'old-api.internal' /etc/nginx/ | xargs sed -i 's/old-api\.internal/new-api\.internal/g'
Combining with curl¶
# Health check all endpoints in parallel
cat urls.txt | xargs -P 10 -I {} curl -sf -m 5 -o /dev/null -w "%{http_code} %{url_effective}\n" {}
# Download a list of URLs
cat download-urls.txt | xargs -P 5 -I {} wget -q -P /data/downloads/ {}
# POST a batch of payloads to an API
ls payloads/*.json | xargs -P 4 -I {} curl -sf -X POST -H "Content-Type: application/json" -d @{} https://api.internal/ingest
Processing Generated Input with seq and echo¶
# Generate a sequence and process it
seq 1 100 | xargs -P 10 -I {} curl -sf "https://api.internal/page/{}"
# Process a comma-separated list
echo "web,api,worker,scheduler" | tr ',' '\n' | xargs -I {} kubectl rollout restart deployment/{}
# Generate test data
seq 1 1000 | xargs -I {} echo "INSERT INTO test_table VALUES ({}, 'row_{}');" | psql mydb
Production Examples¶
Rolling restart of Kubernetes deployments in a namespace¶
# Restart all deployments in the payments namespace
kubectl get deploy -n payments -o name | xargs -I {} kubectl rollout restart {} -n payments
# Wait for each to finish rolling out
kubectl get deploy -n payments -o name | \
xargs -I {} kubectl rollout status {} -n payments --timeout=300s
Parallel log collection from a fleet¶
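# Collect the last 100 lines of syslog from 50 servers, 10 at a time
# (sh -c makes the redirection run once per host; written directly on the
# xargs command line it would run once, in the calling shell)
cat servers.txt | xargs -P 10 -I {} sh -c \
'ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no "$1" "tail -100 /var/log/syslog" > "/tmp/fleet-logs/$1.log" 2>/dev/null' _ {}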
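A redirection written on the xargs command line is handled once by the calling shell, before xargs starts, so it cannot produce one file per item. The sketch below shows the per-item form with local stand-ins: echo replaces ssh, and web01 and web02 are hypothetical host names.

```shell
outdir=$(mktemp -d)

# The redirection lives inside the sh -c script, so it is evaluated once
# per item. "$1" is the item and "$2" is the output directory.
printf 'web01\nweb02\n' | xargs -P 2 -I {} sh -c \
  'echo "log from $1" > "$2/$1.log"' _ {} "$outdir"

ls "$outdir"
```

Passing the item as "$1" rather than embedding {} in the script text also keeps unusual characters in the input from being interpreted as shell code.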
Bulk S3 operations¶
# Delete all objects matching a prefix (handles thousands of keys)
aws s3api list-objects-v2 --bucket my-bucket --prefix old-data/ --query 'Contents[].Key' --output text | \
tr '\t' '\n' | xargs -P 8 -I {} aws s3 rm "s3://my-bucket/{}"
# Copy local files to S3 in parallel
find /data/exports -name '*.csv' -print0 | \
xargs -0 -P 8 -I {} aws s3 cp {} s3://data-lake/imports/
Database maintenance¶
# Reindex all tables in a PostgreSQL database
psql -d mydb -t -c "SELECT tablename FROM pg_tables WHERE schemaname='public'" | \
xargs -I {} psql -d mydb -c "REINDEX TABLE {};"
# Dump individual tables in parallel
psql -d mydb -t -c "SELECT tablename FROM pg_tables WHERE schemaname='public'" | \
xargs -P 4 -I {} pg_dump -d mydb -t {} -f /backup/tables/{}.sql
Container cleanup¶
# Remove all stopped containers
docker ps -aq --filter status=exited | xargs -r docker rm
# Remove dangling images
docker images -q --filter dangling=true | xargs -r docker rmi
# Kill containers matching a pattern
docker ps -q --filter name=test- | xargs -r docker kill
# Remove images created more than 30 days ago (GNU date computes the cutoff)
docker images --format '{{.ID}} {{.CreatedAt}}' | \
awk -v cutoff="$(date -d '30 days ago' +%F)" '$2 < cutoff {print $1}' | xargs -r docker rmi -f
Certificate expiry scanning¶
# Check SSL expiry for all domains in a list
cat domains.txt | xargs -P 10 -I {} bash -c '
expiry=$(echo | openssl s_client -servername "$1" -connect "$1:443" 2>/dev/null | \
openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)
echo "$1: ${expiry:-FAILED}"
' _ {}
Quick Reference¶
| Flag | Purpose | Example |
|---|---|---|
| -I {} | Replace {} with each input item | xargs -I {} mv {} /dest/ |
| -0 | Split on null bytes (pair with find -print0) | find . -print0 \| xargs -0 rm |
| -P N | Run N processes in parallel | xargs -P 8 -I {} gzip {} |
| -n N | Pass N arguments per invocation | xargs -n 100 rm |
| -L N | Pass N lines per invocation | xargs -L 1 du -sh |
| -r | Do nothing if stdin is empty (GNU) | xargs -r rm |
| -t | Print each command before running (trace) | xargs -t rm |
| -p | Prompt before each execution | xargs -p rm |
| --max-procs | Long form of -P | xargs --max-procs=4 |
| --max-args | Long form of -n | xargs --max-args=50 |
| -d '\n' | Use newline as delimiter (GNU) | xargs -d '\n' rm |
| --no-run-if-empty | Long form of -r | xargs --no-run-if-empty rm |
| -s N | Max command line length in bytes | xargs -s 4096 echo |
Key Defaults¶
- Delimiter: whitespace (spaces, tabs, newlines)
- Command if none given: /bin/echo
- Max arguments: limited by ARG_MAX (typically 2MB on Linux, check with getconf ARG_MAX)
- Parallelism: 1 (sequential)
- Empty input: runs the command once with no args (use -r to prevent)
Default trap: On GNU/Linux, xargs without -r runs the command even on empty input. On macOS/BSD, -r is the default. This means echo "" | xargs rm on Linux will run rm with no arguments (harmless but confusing), while the same command on macOS does nothing. Always use -r in portable scripts.
Common Patterns Cheat Sheet¶
# Safe file processing (the gold standard)
find <path> <predicates> -print0 | xargs -0 <command>
# Parallel processing
<list> | xargs -P <N> -I {} <command> {}
# Batch processing
<list> | xargs -n <batch_size> <command>
# Safe parallel file processing (all three combined)
find <path> <predicates> -print0 | xargs -0 -P <N> -I {} <command> {}
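# Dry run (print the commands that would run, without executing them)
<list> | xargs echo <command>
# Confirm each command interactively before it runs
<list> | xargs -t -p <command>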
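A non-interactive way to preview what xargs would run is to put echo in front of the real command. This is a sketch with made-up filenames; note that echo flattens the quoting, so the preview is for reading, not for copy-pasting back into a shell.

```shell
# Prefix the real command with echo: xargs builds the same argument list
# but only prints it.
preview=$(printf 'a.tmp\nb.tmp\n' | xargs echo rm -f)
echo "$preview"
# rm -f a.tmp b.tmp
```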