Process Management - Street-Level Ops
Quick Diagnosis Commands
# Top processes by CPU
ps aux --sort=-%cpu | head -20
# Top processes by memory (RSS)
ps aux --sort=-rss | head -20
# Find all zombie processes
ps aux | awk '$8 ~ /^Z/'
# Find zombie parents
ps -eo pid,ppid,stat,comm | awk '$3 ~ /Z/ {print "Zombie PID:",$1,"Parent:",$2}'
# Find D-state (uninterruptible sleep) processes
ps aux | awk '$8 ~ /^D/'
# Process tree for a service
pstree -p $(pgrep -f "your-service" | head -1)
# Count open file descriptors for a process
ls /proc/$(pgrep -f "your-service" | head -1)/fd | wc -l
# Check what a process is doing RIGHT NOW
cat /proc/$(pgrep -f "your-service" | head -1)/wchan
# All threads for a process
ps -T -p $(pgrep -f "your-service" | head -1)
# Watch process state in real-time
watch -n1 'ps -eo pid,ppid,stat,%cpu,%mem,rss,comm --sort=-rss | head -30'
# Check whether a PID exists and can be signaled (signal 0 sends nothing; it does not prove the process is responsive)
kill -0 1234 2>/dev/null && echo "alive" || echo "dead"
# Find processes listening on a port
ss -tlnp | grep :8080
# Trace system calls (attach to running process)
strace -p 1234 -c # Summary of syscalls
strace -p 1234 -e trace=network # Network calls only
strace -p 1234 -e trace=file # File operations only
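When a host is busy, a per-state tally can be a quicker first look than scanning the full listing. The sketch below is one way to get it. The function name `summarize_states` is hypothetical, and it reads STAT values on stdin so it can be fed from `ps -eo stat=` (assuming procps `ps`).

```shell
# summarize_states: tally processes by the first letter of the STAT column.
# Hypothetical helper; reads one STAT value per line on stdin.
summarize_states() {
  awk 'NF { count[substr($1, 1, 1)]++ }
       END { for (s in count) print s, count[s] }' | sort
}

# Typical use on a live system (assumes procps ps):
#   ps -eo stat= | summarize_states
```

A growing Z or D count between two runs is usually the cue to move on to the zombie and D-state commands above.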
Gotcha: Zombie Accumulation in Containers
You deploy an application container and over time notice zombie processes (state Z, shown as <defunct> in ps) building up.
The problem: your application spawns child processes (maybe health checks, subprocesses for requests, shell commands). The children exit, but the parent never calls wait(), so they linger as zombies. On a normal Linux system, when a careless parent exits, its children are reparented to PID 1 (systemd), which reaps them. In a container, PID 1 is your application: it inherits every orphan in the container, and if it never calls wait(), nothing reaps them.
Solution:
# Option 1: Use tini
RUN apt-get update && apt-get install -y tini
ENTRYPOINT ["tini", "--"]
CMD ["python", "app.py"]
# Option 2: Docker --init flag
# docker run --init your-image
# Option 3: Use dumb-init
RUN pip install dumb-init
ENTRYPOINT ["dumb-init", "--"]
CMD ["python", "app.py"]
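The mechanism described above can be reproduced in a throwaway shell, which helps when checking that an init wrapper is actually reaping. This is a minimal sketch assuming Linux `/proc`; the variable names and the timings are arbitrary choices for the demo.

```shell
# Parent that never calls wait(): the subshell starts a short-lived child,
# records its PID, then replaces itself with `sleep`, which never reaps.
pidfile=$(mktemp)
( sleep 0.2 & echo $! > "$pidfile"; exec sleep 5 ) >/dev/null 2>&1 &
parent=$!
sleep 1                                   # give the child time to exit
child=$(cat "$pidfile")
# Read the state letter from /proc (Z means zombie)
child_state=$(awk '/^State:/ { print $2 }' "/proc/$child/status")
echo "child $child of parent $parent is in state $child_state"
# Killing the parent hands the zombie to init (or a subreaper), which reaps it
kill "$parent"; wait "$parent" 2>/dev/null || true
rm -f "$pidfile"
```

Running the same snippet inside a container started with and without an init such as tini is one way to compare what happens to the zombie after the parent is killed.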
Pattern: Hunting Zombies Systematically
When you see zombies, the workflow is:
# Step 1: Find zombies and their parents
ps -eo pid,ppid,stat,comm | awk '$3 ~ /Z/'
# PID PPID STAT COMMAND
# 14523 12001 Z+ defunct
# 14527 12001 Z+ defunct
# Step 2: Identify the parent
ps -p 12001 -o pid,comm,args
# PID COMMAND ARGS
# 12001 python python /app/worker.py
# Step 3: Check if parent is handling SIGCHLD (strace slows the target, so keep this brief)
strace -p 12001 -e trace=signal 2>&1 | grep -i chld
# Step 4: Fix the parent (or kill it to let init reap the zombies)
# If the parent is buggy and cannot be fixed immediately:
kill -TERM 12001 # Kill the parent; init adopts and reaps zombies
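With many zombies, Steps 1 and 2 are easier if the listing is grouped by parent first. A small sketch, assuming input in the same `pid ppid stat comm` column order used above; the helper name `zombies_by_parent` is hypothetical.

```shell
# zombies_by_parent: count zombies per parent PID, worst offender first.
# Hypothetical helper; expects `ps -eo pid,ppid,stat,comm` style rows on stdin.
# Output format: "<zombie count> <parent pid>"
zombies_by_parent() {
  awk '$3 ~ /^Z/ { n[$2]++ }
       END { for (p in n) print n[p], p }' | sort -rn
}

# Live use (assumes procps ps):
#   ps -eo pid=,ppid=,stat=,comm= | zombies_by_parent
```

The first line of output names the parent to inspect in Step 2.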
Gotcha: strace Changes Process Behavior
Attaching strace to a running process slows it down significantly — sometimes by 10-100x. In production, this can cause:
- Health check timeouts (process appears hung)
- Request timeouts (latency spikes)
- Cascading failures (load balancer marks node as unhealthy)
Safer alternatives:
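# Use perf trace for lower-overhead syscall tracing, stopped after 5 seconds
# (perf trace's --duration option filters events by length; it does not limit the run time)
timeout -s INT 5 perf trace -p 1234
# Use bpftrace for targeted investigation (low overhead)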
bpftrace -e 'tracepoint:syscalls:sys_enter_openat /pid == 1234/ { printf("%s\n", str(args->filename)); }'
# If you must use strace, limit scope and duration
timeout 10 strace -p 1234 -e trace=network -c # 10 seconds, summary only
Pattern: /proc Exploration Checklist
When investigating a misbehaving process, walk through /proc systematically:
PID=1234
# 1. What is it?
cat /proc/$PID/cmdline | tr '\0' ' '; echo
# 2. Where is it?
readlink /proc/$PID/cwd
readlink /proc/$PID/exe
# 3. What state is it in?
cat /proc/$PID/status | grep -E '^(Name|State|Pid|PPid|Threads|Vm)'
# 4. How much memory?
cat /proc/$PID/status | grep -E 'VmRSS|VmSize|VmSwap'
# VmRSS = actual physical memory used
# VmSize = virtual address space
# VmSwap = swapped out
# 5. How many file descriptors?
ls /proc/$PID/fd | wc -l
cat /proc/$PID/limits | grep "open files"
# Compare count against limit
# 6. What files are open?
ls -la /proc/$PID/fd | head -20
# Look for sockets, pipes, deleted files held open
# 7. What is it waiting on?
cat /proc/$PID/wchan # Kernel function
cat /proc/$PID/stack # Full kernel stack (needs root)
# 8. Network connections (this table covers the process's whole network namespace, not just this PID)
cat /proc/$PID/net/tcp | awk 'NR>1 {print $2, $3, $4}'
# Or the easier way:
ss -tnp | grep "pid=$PID"
# 9. Environment
cat /proc/$PID/environ | tr '\0' '\n' | sort
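Step 5 of the checklist asks you to compare the descriptor count against the limit by eye. The sketch below does the arithmetic. `fd_usage` is a hypothetical helper name, and it assumes Linux `/proc`, permission to read the target's fd directory, and a numeric soft limit.

```shell
# fd_usage: print "<open fds> <soft limit> <percent used>" for a PID.
# Hypothetical helper; assumes Linux /proc and read access to /proc/PID/fd.
fd_usage() {
  pid=$1
  open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
  # Field 4 of the "Max open files" row is the soft limit
  soft=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
  echo "$open $soft $(( open * 100 / soft ))"
}

# Example: fd_usage 1234
```

A percentage that climbs steadily across repeated runs points to a descriptor leak well before the process starts failing with "Too many open files".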
Gotcha: nohup Without Proper Redirection
You run a long task and log out:
# WRONG: output lands in nohup.out in the CWD (or $HOME if the CWD is not writable), with stderr mixed into the same file
nohup ./script.sh &
# After logout, you cannot find the output because CWD was /tmp
# or the disk fills because nohup.out grows unbounded
Correct pattern:
# Explicit redirection, both streams, specific log path
nohup ./script.sh > /var/log/script-$(date +%Y%m%d).log 2>&1 &
echo $! > /var/run/script.pid
# Even better — use systemd-run for ad-hoc tasks
systemd-run --unit=my-migration --remain-after-exit /opt/scripts/migrate.sh
# Check status
systemctl status my-migration
journalctl -u my-migration -f
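Where systemd-run is not available, the nohup pattern above can be wrapped so the redirection and the PID file are never forgotten. This is a sketch only: `run_detached` is a hypothetical name, and the log and PID file paths are whatever the caller passes in.

```shell
# run_detached: start a command under nohup with both streams in one log,
# stdin detached, and the PID recorded.
# Hypothetical helper; usage: run_detached LOGFILE PIDFILE COMMAND [ARGS...]
run_detached() {
  log=$1 pidfile=$2
  shift 2
  nohup "$@" >"$log" 2>&1 </dev/null &
  echo $! >"$pidfile"
}

# Example (paths are illustrative):
#   run_detached /var/log/script.log /var/run/script.pid ./script.sh
```

Redirecting stdin from /dev/null as well keeps the job from holding on to the terminal after logout.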
Pattern: Signal Chaining for Graceful Shutdown
Production services need to handle signals properly. Here is the pattern for wrapper scripts:
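#!/bin/bash
# Wrapper that forwards signals to the child process
# Start the actual application
/usr/bin/my-app --config /etc/my-app.conf &
APP_PID=$!
# Forward signals to the child
trap 'kill -TERM "$APP_PID" 2>/dev/null' SIGTERM
trap 'kill -HUP "$APP_PID" 2>/dev/null' SIGHUP
trap 'kill -INT "$APP_PID" 2>/dev/null' SIGINT
# Wait for the child to exit. A trapped signal makes `wait` return early,
# so keep waiting while the child is still alive.
wait "$APP_PID"
EXIT_CODE=$?
while kill -0 "$APP_PID" 2>/dev/null; do
  wait "$APP_PID"
  EXIT_CODE=$?
done
# Cleanup
rm -f /var/run/my-app.pid
exit $EXIT_CODE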
This is critical in containers where PID 1 must forward signals to the actual application.
Gotcha: File Descriptors Leaking on Deleted Files
A process opens a file, then the file gets deleted. The process still holds the file descriptor, and the disk space is NOT freed:
# Find deleted files still held open
lsof +L1
# or
find /proc/*/fd -ls 2>/dev/null | grep deleted
# Example output:
# lrwx------ 1 root root 64 Mar 15 ... /proc/1234/fd/5 -> /var/log/app.log (deleted)
# Check how much space is held (-L follows the fd link to the underlying file)
ls -lL /proc/1234/fd/5
# or
lsof +L1 | awk 'NR > 1 {sum += $7} END {print sum/1024/1024 " MB"}'
# To free the space without restarting the process,
# truncate the file through its descriptor (the data is discarded):
: > /proc/1234/fd/5
This is a common cause of "df says the disk is full, but du shows plenty of space."
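The behavior is easy to reproduce safely in a scratch file before trying the truncation on a production process. A minimal sketch, assuming Linux `/proc`; fd 9 and the 1 MiB size are arbitrary choices for the demo.

```shell
# Reproduce a deleted-but-open file, then reclaim the space through /proc.
self=${BASHPID:-$$}                       # PID of the shell running this snippet
tmp=$(mktemp)
exec 9>"$tmp"                             # the shell now holds the file open on fd 9
head -c 1048576 /dev/zero >&9             # write 1 MiB through the descriptor
rm -f "$tmp"                              # unlink: the name is gone, the blocks are not
link=$(readlink "/proc/$self/fd/9")       # link target now carries a "(deleted)" suffix
size_before=$(stat -L -c %s "/proc/$self/fd/9")
: > "/proc/$self/fd/9"                    # truncate via /proc, as in the fix above
size_after=$(stat -L -c %s "/proc/$self/fd/9")
exec 9>&-                                 # close the descriptor
echo "$link: $size_before -> $size_after bytes"
```

One caveat worth keeping in mind: a process that did not open its log in append mode keeps writing at its old offset after the truncation, which produces a sparse file rather than a small one.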
Pattern: Process Tree Analysis During Incidents
When a service misbehaves, map the full process tree:
# Get the service PID
SERVICE_PID=$(systemctl show -p MainPID my-service | cut -d= -f2)
# Full tree with PIDs and threads
pstree -pt $SERVICE_PID
# Resource usage per child
ps --ppid $SERVICE_PID -o pid,%cpu,%mem,rss,stat,wchan,comm
# Count children (thread/process explosion?)
ps --no-headers --ppid $SERVICE_PID | wc -l
# Current thread count of the main process (a snapshot, not a history of forks)
grep Threads /proc/$SERVICE_PID/status
If child count is growing without bound, the service has a fork leak. It will eventually hit the nproc ulimit and fail with "Resource temporarily unavailable."
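To tell a fork leak from a one-off burst, the child count has to be sampled over time. The sketch below does that straight from `/proc`, which also works on hosts without procps. Both function names are hypothetical, and the sample count and interval are caller-chosen.

```shell
# count_children: number of direct children of a PID, read from /proc.
# Hypothetical helper; assumes Linux /proc. grep -s hides processes that
# exit while the scan is running.
count_children() {
  grep -s -E "^PPid:[[:space:]]+$1\$" /proc/[0-9]*/status | wc -l
}

# watch_children: print "<time> <child count>" for a number of samples.
# Usage: watch_children PID SAMPLES INTERVAL_SECONDS
watch_children() {
  i=0
  while [ "$i" -lt "$2" ]; do
    echo "$(date +%T) $(count_children "$1")"
    i=$((i + 1))
    if [ "$i" -lt "$2" ]; then sleep "$3"; fi
  done
}

# Example: watch_children "$SERVICE_PID" 10 30
```

A count that only ever rises across samples is the pattern to worry about; a count that oscillates is more likely normal worker churn.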
Gotcha: SIGTERM to a Shell Script Does Not Kill Children
#!/bin/bash
# If you SIGTERM this script, the sleep continues running as an orphan
curl -s http://slow-api/data | process_response
sleep 300
do_more_work
The shell receives SIGTERM and dies. But commands running inside the shell (curl, sleep) continue as orphaned processes.
Fix:
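#!/bin/bash
# bash runs a trap only after the current foreground command finishes,
# so put long-running steps in the background and wait on them.
trap 'kill $(jobs -p) 2>/dev/null; exit 1' SIGTERM SIGINT
curl -s http://slow-api/data | process_response &
wait $!
sleep 300 &
wait $!
do_more_work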
Pattern: Identifying What Is Holding a Mount Busy
# Cannot unmount /mnt/data — "device is busy"
umount /mnt/data
# umount: /mnt/data: target is busy
# Find processes using this mount
fuser -mv /mnt/data
# USER PID ACCESS COMMAND
# /mnt/data: root 4521 ..c.. bash
# app 4890 F.... python
# Or with lsof
lsof +D /mnt/data
# Kill them gracefully, then unmount
fuser -k -TERM -m /mnt/data   # -m targets every process using the filesystem, not just the directory itself
sleep 2
umount /mnt/data
Power One-Liners
Intercept stdout/stderr of a running process
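A command consistent with the breakdown below might look like the following sketch. The flags are the ones the breakdown names, plus `-p` to attach. The wrapper name `tap_output` and the example PID are hypothetical.

```shell
# tap_output: show what a running process writes to stdout and stderr.
# Hypothetical wrapper; needs strace and ptrace permission on the target.
# Attaching slows the target, so keep sessions short on production hosts.
tap_output() {
  strace -ff -e trace=write -e write=1,2 -p "$1"
}

# Example: tap_output 1234
```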
Breakdown: -e trace=write shows only write syscalls. -e write=1,2 dumps the full data written to fd 1 (stdout) and fd 2 (stderr). -ff follows forks. The command attaches to an already-running process.
[!TIP] When to use: Process is misbehaving but wasn't started with logging. "What is this thing printing?"
Watch what files a command actually touches
strace -ff -e trace=file my_command 2>&1 | perl -ne 's/^[^"]+"(([^\\"]|\\[\\"nt])*)".*/$1/ && print'
Breakdown: strace -e trace=file traces all file-related syscalls (open, stat, access, unlink, etc.). The output is noisy — each syscall line includes the quoted file path among other args. The perl one-liner extracts just the first quoted string from each line (handling escaped quotes and special chars). Result: a clean list of every file the command touched.
[!TIP] When to use: "What files does this installer actually modify?" Debugging config loading order, understanding where a tool reads its settings from, verifying deployment scripts don't touch unexpected paths.
Kill the process locking a file
Breakdown: fuser identifies processes using a file/socket. -k sends SIGKILL to all of them. Use -i for interactive confirmation. Use without -k to just list PIDs.
[!TIP] When to use: Can't unmount a filesystem, can't delete a lock file, need to free a bound port.
Top processes by memory (no top/htop needed)
or with custom formatting:
[!TIP] When to use: Quick memory triage when top/htop aren't available or you need scriptable output.
Process tree with full command lines
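A command matching the flags in the breakdown below could be wrapped as in this sketch. `proc_forest` is a hypothetical name, and the BSD-style flag cluster assumes procps `ps`.

```shell
# proc_forest: full process tree with untruncated command lines.
# Hypothetical wrapper around the flags listed in the breakdown; assumes procps ps.
proc_forest() {
  ps awwfux
}

# Interactive use, with horizontal scrolling for long lines:
#   proc_forest | less -S
```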
Breakdown: a = all users, ww = unlimited width, f = forest (tree), u = user format, x = include no-tty. less -S enables horizontal scrolling for long lines.
[!TIP] When to use: Understanding process parent/child relationships, finding orphan processes.
List processes using the network
[!TIP] When to use: Security audit — what's listening? What's making outbound connections?
Quick Reference
- Deep Dive: Linux Process Scheduler