Strace

14 cards: 🟢 5 easy | 🟡 7 medium | 🔴 2 hard

🟢 Easy (5)

1. What does strace show?

System calls made by a process, their arguments, and return values. Every interaction between a program and the kernel (file opens, network connects, memory allocation) goes through syscalls — strace intercepts and logs them. Even a simple ls makes dozens of syscalls. Start with strace -e trace=file ls to see just the file-related ones.

2. Why can strace work without source code or knowledge of the programming language?

Because it observes the kernel system call interface, which all programs must use regardless of language. Whether the code is Python, Go, or a static C binary, it must ask the kernel to open files, send network packets, and allocate memory. strace hooks at this universal boundary, making it language-agnostic.
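
A minimal sketch of that boundary, assuming Linux and Python 3: opening a missing file fails with errno ENOENT, the same symbolic name strace prints for the failed call. The helper name `probe_open` and the file name are hypothetical.

```python
import errno
import os
import tempfile


def probe_open(path):
    """Try to open path read-only; return the symbolic errno name, or None on success."""
    try:
        fd = os.open(path, os.O_RDONLY)  # ends up as an open/openat syscall
    except OSError as exc:
        # errno.errorcode maps the numeric code back to the name strace shows.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return None


# Hypothetical file inside a fresh temporary directory, so it cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "missing.conf")
print(probe_open(missing))
```

Running a script like this under strace -e trace=file should show the matching failed call among the interpreter's own startup activity.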

3. What does the -f flag do in strace?

Follows child processes created by fork/clone, tracing them along with the parent. Without -f, you only see the parent's syscalls and miss what child processes do. Essential for debugging multi-process programs like web servers (Apache/Nginx), shell pipelines, or anything that spawns subprocesses.

Remember: "strace -p PID = attach to running process." Ctrl+C to detach. Add -f to follow child processes.

Gotcha: strace adds significant overhead — don't use on production for extended periods.

4. How do you attach strace to an already-running process?

Use strace -p PID. This attaches to the process without restarting it, which is ideal for debugging a hung service in production. You need root or the same UID as the target, and some distributions restrict attaching further through the kernel's ptrace_scope setting. Combine with -f to also trace child processes. Detach cleanly with Ctrl+C; the process continues running normally after you detach.

5. How do you save strace output to a file?

Use the -o flag: strace -o trace.log COMMAND. This separates strace output from the program's own stderr, making both easier to read. For multi-process traces, add -ff to create one file per PID: strace -ff -o trace COMMAND produces trace.1234, trace.1235, etc.

Remember: "strace -c = syscall summary." It counts calls, errors, and time per syscall. Great for finding which syscall dominates.
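
When only a saved log is available, the call and error counts can be approximated offline. This is a rough sketch, not a replacement for -c (it has no timing data); the function name `count_syscalls` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Optional PID prefix (written with -f -o), optional -t/-tt timestamp,
# then the syscall name immediately before "(".
CALL_RE = re.compile(r"^(?:\d+\s+)?(?:\d{2}:\d{2}:\d{2}(?:\.\d+)?\s+)?(\w+)\(")
ERROR_RE = re.compile(r"= -1 E[A-Z]+")


def count_syscalls(lines):
    """Return (calls, errors) Counters keyed by syscall name."""
    calls, errors = Counter(), Counter()
    for line in lines:
        match = CALL_RE.match(line)
        if not match:
            continue  # skips signal lines, exit markers, "resumed" fragments
        name = match.group(1)
        calls[name] += 1
        if ERROR_RE.search(line):
            errors[name] += 1
    return calls, errors


# Illustrative lines in the usual strace layout (not from a real run).
SAMPLE = r"""
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF"..., 832) = 832
openat(AT_FDCWD, "/etc/app.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
close(3) = 0
+++ exited with 0 +++
""".splitlines()

calls, errors = count_syscalls(SAMPLE)
for name, total in calls.most_common():
    print(f"{name:10} calls={total} errors={errors[name]}")
```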

🟡 Medium (7)

1. What does -e trace=file do in strace?

Restricts output to only file-related syscalls (open, stat, access, unlink, etc.), filtering out everything else. This dramatically reduces noise when debugging "file not found" or permission problems. Other useful filters: trace=network for socket issues, trace=process for fork/exec debugging.

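Remember: "-e trace=network shows socket syscalls (connect, accept, send, recv). -e trace=file shows syscalls that take a file name (open, stat, access, unlink)." read and write operate on file descriptors, so they are not part of trace=file; use trace=desc for those.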

2. What are two very useful error codes to grep for in strace output?

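ENOENT (no such file or directory) and EACCES (permission denied). Run strace -o trace.log COMMAND, then grep -E 'ENOENT|EACCES' trace.log. ENOENT reveals missing config files, libraries, or certificates. EACCES exposes permission problems on files, directories, or sockets. Expect some harmless ENOENT lines as well, because the dynamic loader and many programs probe several candidate paths before finding the right one. These two errors explain a large share of "it works on my machine" issues.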
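
The same filter can be sketched in Python when you also want the offending path pulled out of each line. The function name `failed_paths` and the sample lines are illustrative assumptions; the sketch takes the first quoted string on a line to be the path, which holds for common file syscalls but not for every call.

```python
import re

ERR_RE = re.compile(r"= -1 (ENOENT|EACCES)\b")
PATH_RE = re.compile(r'"([^"]*)"')  # first quoted argument, usually the path


def failed_paths(lines):
    """Return (error, path) pairs for lines that failed with ENOENT or EACCES."""
    hits = []
    for line in lines:
        err = ERR_RE.search(line)
        if not err:
            continue
        path = PATH_RE.search(line)
        hits.append((err.group(1), path.group(1) if path else None))
    return hits


# Illustrative lines in the usual strace layout (not from a real run).
SAMPLE = """\
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/app/config.yaml", O_RDONLY) = -1 ENOENT (No such file or directory)
access("/var/run/app.sock", R_OK|W_OK) = -1 EACCES (Permission denied)
connect(3, {sa_family=AF_INET, sin_port=htons(5432), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)
""".splitlines()

for error, path in failed_paths(SAMPLE):
    print(error, path)
```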

3. What does "connect(...) = -1 ECONNREFUSED" in strace output indicate?

The process tried to establish a network connection but the target host/port refused it — typically meaning nothing is listening on that port. Check: is the target service running? Is the port correct? Is a firewall blocking? This single strace line often pinpoints connectivity issues faster than reading application logs.

4. How can you identify what a stalled process is waiting on using strace?

Attach with strace -p PID and look at the last lines of output: a call that has not returned, or a tight loop of futex, poll, epoll_wait, read, or recvfrom, shows what the process is waiting on (a lock, I/O, or network data). Add -T to see time spent in each call; a 30-second read() on a socket tells you the remote end is slow. This is often faster than log analysis.

5. How do you add timestamps and per-call duration to strace output?

Use -tt for wall-clock timestamps with microsecond precision and -T for the time spent in each syscall (shown in angle brackets at the end of the line).
Example: strace -tt -T -o trace.log COMMAND. The -T output reveals which syscalls are slow; a connect() taking 5 seconds points to a network problem between the two hosts. Combine with grep or sort to find the slowest calls.
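
Ranking calls by their -T duration is easy to script. The sketch below assumes a log written with -T, where each completed call ends in `<seconds>`; the function name `slowest_calls` and the sample lines are hypothetical.

```python
import re

# -T appends the time spent in the call as "<seconds>" at the end of the line.
DURATION_RE = re.compile(r"<(\d+\.\d+)>\s*$")


def slowest_calls(lines, top=3):
    """Return up to `top` (seconds, line) pairs, slowest first."""
    timed = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match:
            timed.append((float(match.group(1)), line.strip()))
    timed.sort(key=lambda item: item[0], reverse=True)
    return timed[:top]


# Illustrative lines in the -tt -T layout (not from a real run).
SAMPLE = """\
14:02:31.100001 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3 <0.000021>
14:02:31.100210 connect(4, {sa_family=AF_INET, sin_port=htons(5432), sin_addr=inet_addr("10.0.0.5")}, 16) = 0 <5.003127>
14:02:36.103400 write(4, "hello", 5) = 5 <0.000034>
""".splitlines()

for seconds, line in slowest_calls(SAMPLE, top=2):
    print(f"{seconds:.6f}s  {line}")
```

Lines marked as unfinished carry no duration and are skipped, so a call that is still blocked will not appear in the ranking.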

6. What are the main trace filter categories in strace?

trace=file (file operations that take a path: open, stat, chmod), trace=network (network calls: socket, connect, sendto), trace=process (process lifecycle: fork, exec, exit), trace=signal (signal-related syscalls: kill, sigaction, sigprocmask), trace=memory (memory mapping: mmap, brk). These let you focus on specific problem domains and cut through noise. Newer strace releases prefer the %-prefixed spelling, such as trace=%file.

7. What does the execve syscall tell you in strace output?

It shows which program was executed, with what arguments and environment variables.
Example: execve("/usr/bin/python3", ["python3", "app.py"], [...]) reveals the exact binary and arguments. This is invaluable for debugging shell scripts, cron jobs, or systemd services that launch child programs with unexpected arguments or wrong paths.
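
To list every program a traced script launched, execve lines can be parsed out of a saved log. This is a simplified sketch: the function name `parse_execve` is hypothetical, and the pattern assumes no argument contains a literal `]`. strace truncates long strings by default, so raise the limit with -s if arguments look cut off.

```python
import re

# Binary path, then the bracketed argv list that follows it.
EXECVE_RE = re.compile(r'execve\("([^"]+)", \[(.*?)\]')
# One quoted argument, allowing backslash escapes inside it.
ARG_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_execve(line):
    """Return (binary, argv) for an execve line, or None for any other line."""
    match = EXECVE_RE.search(line)
    if not match:
        return None
    return match.group(1), ARG_RE.findall(match.group(2))


# Illustrative line in the usual strace layout (not from a real run).
line = 'execve("/usr/bin/python3", ["python3", "app.py"], 0x7ffd2c3a1b40 /* 23 vars */) = 0'
print(parse_execve(line))
```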

🔴 Hard (2)

1. What are the main limitations of strace?

Output can be huge and noisy for high-volume workloads. It only shows kernel boundary crossings, not language-level variables, function calls, or application logic. Tracing adds significant overhead via ptrace, because every traced syscall stops the process twice; slowdowns of 10-100x are commonly cited for syscall-heavy workloads. That overhead can perturb timing-sensitive programs and mask race conditions. For lower overhead, consider eBPF-based tools like bpftrace.

2. What is a practical strace debugging workflow?

1. Start narrow — filter to file or network calls (-e trace=file).
2. Log to a file with -o to separate from program output.
3. Grep for ENOENT, EACCES, ECONNREFUSED — the most common culprits.
4. Correlate with timestamps using -tt -T to find slow calls.
5. Widen the filter only if the narrow trace was inconclusive. Avoid unfiltered strace on busy processes — the output is overwhelming.
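
The narrow-then-widen progression can be sketched as a small command builder. The function name `strace_argv` and its parameters are hypothetical; it uses only the flags covered in these cards (-f, -tt, -T, -e trace=, -o) and builds the argument list without running anything.

```python
import shlex


def strace_argv(command, trace="file", log="trace.log", timing=False, follow=False):
    """Build an strace argument list; trace=None means no filter (widest trace)."""
    argv = ["strace"]
    if follow:
        argv.append("-f")              # also trace child processes
    if timing:
        argv += ["-tt", "-T"]          # timestamps plus per-call duration
    if trace:
        argv += ["-e", f"trace={trace}"]
    argv += ["-o", log]                # keep trace output out of stderr
    return argv + list(command)


# Steps 1-2: start narrow and log to a file.
narrow = strace_argv(["ls", "/tmp"])
# Steps 4-5: add timing, then drop the filter only if the narrow trace was inconclusive.
wide = strace_argv(["ls", "/tmp"], trace=None, timing=True)

print(shlex.join(narrow))
print(shlex.join(wide))
```

The resulting list can be passed to subprocess.run on a machine where strace is installed; step 3 is then a grep over the log file.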