Portal | Level: L1: Foundations | Topics: Pipes & Redirection, Bash / Shell Scripting, Linux Fundamentals | Domain: Linux
Pipes & Redirection - Primer¶
Why This Matters¶
Every Unix command is a small program that reads input, processes it, and writes output. Pipes and redirection are the connective tissue that lets you compose these small programs into powerful data processing pipelines without writing code. They are not a convenience feature — they are the fundamental design pattern of Unix.
Who made it: The pipe concept was proposed by Doug McIlroy at Bell Labs, and Ken Thompson implemented it in Unix in 1973, famously overnight, after McIlroy's suggestion. McIlroy's famous philosophy: "Write programs that do one thing and do it well. Write programs to work together." The | character was chosen because it was rarely used in existing code. If you cannot fluently redirect output, split streams, and build pipelines, you cannot operate effectively on any Linux system.
File Descriptors: The Foundation¶
Every process has three standard file descriptors open at birth:
| FD | Name | Default | Purpose |
|---|---|---|---|
| 0 | stdin | Keyboard / terminal | Input to the program |
| 1 | stdout | Terminal screen | Normal output |
| 2 | stderr | Terminal screen | Error messages, diagnostics |
These are just numbers that the kernel uses to track open files. Everything in Unix is a file — terminals, pipes, sockets, actual files — and file descriptors are handles to those files.
# See the file descriptors of a running process
ls -la /proc/$$/fd
# 0 -> /dev/pts/0 (stdin — your terminal)
# 1 -> /dev/pts/0 (stdout — your terminal)
# 2 -> /dev/pts/0 (stderr — your terminal)
Output Redirection¶
Redirect stdout to a file¶
# Write stdout to a file (creates or truncates)
echo "hello" > output.txt
# Append stdout to a file
echo "world" >> output.txt
# Command output to file
ls -la /etc > etc_listing.txt
The > operator truncates the file first. If the file does not exist, it is created. If it exists, its contents are destroyed before the command runs.
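Because truncation happens before the command even starts, trying to filter a file "in place" with > destroys it. A quick demonstration in a scratch directory (demo.txt is a throwaway name):

```shell
# The shell truncates the target BEFORE launching the command,
# so the command reads an already-empty file.
cd "$(mktemp -d)"
printf 'banana\napple\ncherry\n' > demo.txt
sort demo.txt > demo.txt    # demo.txt is truncated first; sort sees an empty file
wc -c < demo.txt            # 0: the data is gone
```

Tools like sort -o file (write result back to the input file) and sponge from moreutils exist precisely to avoid this trap.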
Redirect stderr to a file¶
# Redirect stderr (fd 2) to a file
find / -name "*.conf" 2> errors.txt
# Append stderr
find / -name "*.conf" 2>> errors.txt
# Discard stderr entirely
find / -name "*.conf" 2>/dev/null
Redirect both stdout and stderr¶
# Redirect both to the same file (bash)
command &> output.txt # bash shorthand
command > output.txt 2>&1 # POSIX-compatible
# Redirect to separate files
command > stdout.txt 2> stderr.txt
# Append both to the same file
command >> output.txt 2>&1
command &>> output.txt # bash shorthand
The 2>&1 syntax means "redirect fd 2 to wherever fd 1 currently points." This is why order matters — it must come after > if you want both in the same file.
Gotcha: command 2>&1 > file.txt and command > file.txt 2>&1 are NOT the same. In the first, stderr goes to the terminal (where stdout was pointing when 2>&1 was evaluated), then stdout goes to the file. In the second, stdout goes to the file first, then stderr follows it there. Redirections are evaluated left to right.
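A small sketch makes the difference concrete. Here emit is a hypothetical helper that writes one line to stdout and one to stderr:

```shell
# Hypothetical helper: one line to stdout, one to stderr
emit() { echo out; echo err >&2; }

cd "$(mktemp -d)"
emit > both.txt 2>&1          # stdout -> file, then stderr -> same file
wc -l < both.txt              # 2: both lines captured

emit 2>&1 > only_out.txt      # stderr -> terminal (old stdout), then stdout -> file
wc -l < only_out.txt          # 1: "err" went to the terminal instead
```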
Input Redirection¶
# Read input from a file instead of keyboard
sort < unsorted.txt
# Combine input and output redirection
sort < unsorted.txt > sorted.txt
# Mail with body from file
mail -s "Report" admin@example.com < report.txt
Pipes¶
A pipe connects the stdout of one command to the stdin of the next. The kernel creates an in-memory buffer between them.
# Basic pipe: list files, filter for names containing ".log"
ls -la /var/log | grep '\.log'
# Multi-stage pipeline: find, filter, sort, count
cat access.log | grep "GET /api" | awk '{print $1}' | sort | uniq -c | sort -rn | head -10
# Pipes are concurrent — all stages run in parallel
# Data flows through the pipeline as it is produced
How Pipes Work Internally¶
When you write cmd1 | cmd2:
- The shell creates an anonymous pipe (a kernel buffer, typically 64KB on Linux)
- It forks cmd1 with stdout connected to the write end of the pipe
- It forks cmd2 with stdin connected to the read end of the pipe
- Both commands run concurrently
- When cmd1 writes faster than cmd2 reads, it blocks until the buffer drains
- When cmd2 reads faster than cmd1 writes, it blocks until data is available
This is producer-consumer concurrency, built into the kernel.
/dev/null and Special Files¶
# /dev/null — the bit bucket (discards everything written to it)
command > /dev/null 2>&1 # silence all output
grep "pattern" file 2>/dev/null # silence errors only
# /dev/zero — infinite stream of zero bytes
dd if=/dev/zero of=testfile bs=1M count=100 # create a 100MB test file
# /dev/urandom — infinite stream of pseudorandom bytes
dd if=/dev/urandom bs=32 count=1 2>/dev/null | base64 # generate a random token
head -c 16 /dev/urandom | xxd -p # 16 random hex bytes
# /dev/stdin, /dev/stdout, /dev/stderr — process self-references
echo "to stderr" > /dev/stderr
cat /dev/stdin # read from own stdin
Here Documents and Here Strings¶
Here Documents (<<)¶
A here document feeds a block of text as stdin to a command:
# Basic here document
cat <<EOF
Hello, ${USER}.
Today is $(date).
Your home is ${HOME}.
EOF
# Quoted delimiter prevents variable expansion
cat <<'EOF'
This $VARIABLE is not expanded.
Neither is $(this command).
EOF
# Indented here document (<<- strips leading TAB characters only; spaces
# are not stripped, and the closing delimiter may be tab-indented too)
if true; then
	cat <<-EOF
	This line can be indented with tabs.
	The tabs are stripped from the output.
	EOF
fi
Here documents are essential for:
- Generating config files in scripts
- Feeding multi-line input to commands like mysql, psql, ssh
- Inline test data in scripts
# Feed SQL to postgres
psql -U admin mydb <<EOF
SELECT count(*) FROM users WHERE created_at > now() - interval '1 day';
EOF
# Run commands on a remote host
ssh webserver01 <<'EOF'
systemctl status nginx
df -h /var/log
tail -5 /var/log/nginx/error.log
EOF
Here Strings (<<<)¶
A here string feeds a single string as stdin. Supported by bash, zsh, and ksh, but not POSIX sh.
# Feed a string to a command
grep "error" <<< "this is an error message"
# Use with read to split a string
IFS=: read -r user _ uid gid _ home shell <<< "root:x:0:0:root:/root:/bin/bash"
echo "User: ${user}, UID: ${uid}, Home: ${home}"
# Avoid echo | command pattern
# Instead of: echo "data" | base64
# Use: base64 <<< "data"
Process Substitution¶
Process substitution (<() and >()) creates a temporary named pipe that looks like a filename to the command receiving it. This lets you use command output where a filename is expected.
# Compare output of two commands (diff requires filenames)
diff <(ls /etc/nginx/sites-available/) <(ls /etc/nginx/sites-enabled/)
# Compare sorted output of two database queries
diff <(psql -c "SELECT id FROM users ORDER BY id" db1) \
<(psql -c "SELECT id FROM users ORDER BY id" db2)
# Feed multiple process outputs to a command
paste <(cut -d: -f1 /etc/passwd) <(cut -d: -f7 /etc/passwd)
# Use output substitution to write to multiple destinations
command | tee >(grep "ERROR" > errors.log) >(grep "WARN" > warnings.log) > /dev/null
Process substitution is a bash/zsh feature, not available in POSIX sh or dash.
Under the hood: Process substitution creates a /dev/fd/N file descriptor path (or a named pipe on systems without /dev/fd). When you write diff <(cmd1) <(cmd2), bash creates two pipes, runs cmd1 and cmd2 with their stdout connected to those pipes, and passes the paths /dev/fd/63 and /dev/fd/62 to diff as if they were filenames.
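You can see the substituted path yourself, since on Linux bash hands the command a /dev/fd path that behaves like a readable file:

```shell
echo <(true)                        # prints a path such as /dev/fd/63
cat <(printf 'one\ntwo\n') | wc -l  # the path reads like a file: 2 lines
```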
Named Pipes (FIFOs)¶
A named pipe is a persistent pipe in the filesystem. One process writes, another reads, and data flows between them.
# Create a named pipe
mkfifo /tmp/mypipe
# Terminal 1: read from pipe (blocks until data arrives)
cat /tmp/mypipe
# Terminal 2: write to pipe (blocks until reader is connected)
echo "data from another process" > /tmp/mypipe
# Clean up
rm /tmp/mypipe
Named pipes are useful for:
- Inter-process communication between unrelated processes
- Feeding data between long-running daemons
- Creating processing pipelines that survive across commands
# Example: persistent log filter
mkfifo /tmp/error_pipe
# Background: filter errors to a file
grep --line-buffered "ERROR" < /tmp/error_pipe > /var/log/errors_only.log &
# Foreground: application writes to the pipe
myapp > /tmp/error_pipe 2>&1
tee — Splitting Output¶
tee reads from stdin and writes to both stdout and one or more files simultaneously.
# Write to screen and file
make build 2>&1 | tee build.log
# Append instead of overwrite
command | tee -a logfile.txt
# Write to multiple files
command | tee file1.txt file2.txt file3.txt
# Use in pipelines: capture intermediate output
cat data.csv | tee raw_data.log | sort | tee sorted_data.log | uniq -c > final.txt
xargs — Stdin to Arguments¶
Many commands do not read stdin — they take arguments. xargs bridges the gap by converting stdin lines into command arguments.
# Delete all .tmp files found by find
find /tmp -name "*.tmp" -print0 | xargs -0 rm -f
# Run a command on each line of input
cat hosts.txt | xargs -I {} ssh {} "uptime"
# Parallel execution
cat urls.txt | xargs -P 4 -I {} curl -s -o /dev/null -w "%{url_effective}: %{http_code}\n" {}
# Batch arguments (pass 50 files at a time to grep)
find . -name '*.py' | xargs -n 50 grep "import"
| Flag | Purpose |
|---|---|
| -0 | Use null delimiter (pair with find -print0) |
| -I {} | Replace {} with each input line |
| -n N | Pass N arguments at a time |
| -P N | Run N parallel processes |
| -L 1 | Run command once per input line |
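The batching flags are easiest to see with plain echo, which is safe to run anywhere:

```shell
# -n 2: each echo invocation receives at most two arguments
printf '1\n2\n3\n4\n5\n' | xargs -n 2 echo
# 1 2
# 3 4
# 5

# -I {}: one invocation per input line, with {} substituted
printf 'alpha\nbeta\n' | xargs -I {} echo "host={}"
# host=alpha
# host=beta
```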
Command Substitution¶
Command substitution captures the stdout of a command and inserts it as text.
# Modern syntax: $()
current_date=$(date +%Y-%m-%d)
file_count=$(find . -name '*.py' | wc -l)
git_sha=$(git rev-parse --short HEAD)
# Legacy syntax: backticks (avoid — hard to nest, hard to read)
current_date=`date +%Y-%m-%d`
# Nesting works naturally with $()
echo "Kernel: $(uname -r) on $(hostname) ($(uname -m))"
# Nesting is painful with backticks (must escape inner backticks)
echo "Kernel: `uname -r` on `hostname`"
Always use $() syntax. Backticks are a legacy holdover that make code harder to read and impossible to nest cleanly.
Subshells and Pipes¶
Each segment of a pipeline runs in a subshell. This has critical implications for variable scope.
# Variables set in a pipeline segment are LOST after the pipeline
count=0
cat data.txt | while read -r line; do
count=$(( count + 1 ))
done
echo "${count}" # Prints 0 — the while loop ran in a subshell
# Fix: use process substitution to avoid the subshell
count=0
while read -r line; do
count=$(( count + 1 ))
done < <(cat data.txt)
echo "${count}" # Prints the correct count
# Fix: use lastpipe (bash 4.2+)
shopt -s lastpipe
count=0
cat data.txt | while read -r line; do
count=$(( count + 1 ))
done
echo "${count}" # Now correct — last pipe segment runs in current shell
PIPESTATUS and pipefail¶
PIPESTATUS¶
PIPESTATUS is a bash array containing the exit code of each command in the most recent pipeline.
false | true | false
echo "${PIPESTATUS[@]}" # 1 0 1
echo "${PIPESTATUS[0]}" # 1 (first command)
echo "${PIPESTATUS[1]}" # 0 (second command)
echo "${PIPESTATUS[2]}" # 1 (third command)
Without PIPESTATUS, $? only gives you the exit code of the last command in the pipeline. A failing first stage goes unnoticed.
pipefail¶
# Without pipefail: pipeline succeeds if the LAST command succeeds
false | true
echo $? # 0 — failure is hidden
# With pipefail: pipeline fails if ANY command fails
set -o pipefail
false | true
echo $? # 1 — failure is caught
Remember: The safe script header mnemonic: set -euo pipefail. Exit on error (-e), Undefined variables are errors (-u), pipefail catches failures in pipelines (-o pipefail). Memorize this as "EUO" and put it at the top of every bash script.
Every production script should use set -o pipefail. Without it, broken pipelines silently produce partial or corrupt output.
File Descriptor Manipulation¶
For advanced redirection, you can open, close, and duplicate file descriptors using exec:
# Open fd 3 for writing to a log file
exec 3> /var/log/myapp/debug.log
# Write to fd 3 throughout the script
echo "Starting process" >&3
do_work
echo "Work complete, exit code: $?" >&3
# Close fd 3
exec 3>&-
# Open fd 4 for reading
exec 4< /etc/hosts
while read -r line <&4; do
echo "Host: ${line}"
done
exec 4<&-
# Swap stdout and stderr (advanced)
command 3>&1 1>&2 2>&3 3>&-
File descriptor manipulation is primarily useful for:
- Logging to multiple destinations
- Separating different output streams
- Implementing progress indicators (data on stdout, progress on fd 3)
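The progress-indicator pattern can be sketched like this (produce is a hypothetical function; the caller decides where fd 3 goes):

```shell
# Hypothetical producer: records on stdout, progress messages on fd 3
produce() {
  for i in 1 2 3; do
    echo "record $i"              # data -> stdout
    echo "progress: $i/3" >&3     # progress -> fd 3
  done
}

cd "$(mktemp -d)"
produce 3>&2 > records.txt        # route progress to stderr, data to a file
wc -l < records.txt               # 3: only the data lines landed in the file
```

Because the two streams use different descriptors, a caller can capture the data while still watching progress live, or discard the progress entirely with 3>/dev/null.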
Putting It All Together¶
A real-world data processing pipeline:
#!/usr/bin/env bash
set -euo pipefail
# Process web server access logs:
# 1. Decompress rotated logs
# 2. Filter for API requests
# 3. Extract response times
# 4. Calculate statistics
{
# Current log
cat /var/log/nginx/access.log
# Rotated compressed logs
zcat /var/log/nginx/access.log.*.gz
} | grep -E 'GET /api/' \
| awk '{print $NF}' \
| sort -n \
| tee >(wc -l > /tmp/total_requests.txt) \
| awk '{
sum += $1; count++; values[count] = $1
} END {
print "Total requests:", count
print "Average:", sum/count, "ms"
print "Median:", values[int(count/2)]
print "P99:", values[int(count*0.99)]
print "Max:", values[count]
}'
This pipeline decompresses, filters, extracts, sorts, counts, and computes statistics, all without you writing a single temporary file by hand. Memory is not constant, though: sort buffers (and spills to its own temp files) as needed, and the final awk stores every value in order to compute the median and P99.
Wiki Navigation¶
Related Content¶
- Advanced Bash for Ops (Topic Pack, L1) — Bash / Shell Scripting, Linux Fundamentals
- Bash Exercises (Quest Ladder) (CLI) (Exercise Set, L0) — Bash / Shell Scripting, Linux Fundamentals
- Environment Variables (Topic Pack, L1) — Bash / Shell Scripting, Linux Fundamentals
- LPIC / LFCS Exam Preparation (Topic Pack, L2) — Bash / Shell Scripting, Linux Fundamentals
- Linux Ops (Topic Pack, L0) — Bash / Shell Scripting, Linux Fundamentals
- Linux Ops Drills (Drill, L0) — Bash / Shell Scripting, Linux Fundamentals
- Process Management (Topic Pack, L1) — Bash / Shell Scripting, Linux Fundamentals
- RHCE (EX294) Exam Preparation (Topic Pack, L2) — Bash / Shell Scripting, Linux Fundamentals
- Regex & Text Wrangling (Topic Pack, L1) — Bash / Shell Scripting, Linux Fundamentals
- Track: Foundations (Reference, L0) — Bash / Shell Scripting, Linux Fundamentals