Anti-Primer: Advanced Bash
Everything that can go wrong, will — and in this story, it does.
The Setup
A platform team is writing a critical deployment script the night before a major release. The script must orchestrate database migrations, service restarts, and health checks across 50 servers. The senior engineer is on vacation, leaving a mid-level dev to handle it solo.
The Timeline
Hour 0: Unquoted Variables Wreck Paths
The engineer uses `$DEPLOY_DIR` without quotes in an `rm -rf` command. The deadline was looming, and this seemed like the fastest path forward. The result: the path contains spaces, so word splitting turns it into several arguments and `rm` deletes unintended directories.
Footgun #1: Unquoted Variables Wreck Paths. Using `$DEPLOY_DIR` without quotes in an `rm -rf` command means word splitting on a path with spaces deletes unintended directories.
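A minimal sketch of the difference quoting makes, assuming a scratch directory in place of the real deploy root. The directory names and the `count_args` helper are hypothetical; the helper counts arguments so that the unsafe form can be shown without deleting anything.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical layout: a scratch directory stands in for the real deploy root.
workdir="$(mktemp -d)"
trap 'rm -rf -- "$workdir"' EXIT
mkdir -p "$workdir/my app/releases" "$workdir/my" "$workdir/app"
DEPLOY_DIR="$workdir/my app"

# Count arguments instead of deleting anything, to show what rm would receive.
count_args() { echo "$#"; }

# Unquoted: the shell splits the expansion at the space (ShellCheck SC2086).
# shellcheck disable=SC2086
unquoted_words="$(count_args $DEPLOY_DIR/releases)"
# Quoted: the path stays a single argument.
quoted_words="$(count_args "$DEPLOY_DIR/releases")"
echo "unquoted: $unquoted_words args, quoted: $quoted_words args"

# Safer delete: quoted, aborts if DEPLOY_DIR is unset or empty, and "--"
# stops a path that begins with "-" from being parsed as an option.
rm -rf -- "${DEPLOY_DIR:?DEPLOY_DIR must be set}/releases"
```

When the temporary directory path itself contains no spaces, the unquoted form hands `rm` two arguments (`.../my` and `app/releases`) while the quoted form hands it one. Running ShellCheck over the script would likely flag the unquoted expansion before it ever reached a server.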
Nobody notices yet. The engineer moves on to the next task.
Hour 1: Missing set -euo pipefail
The script has no error handling, so a failed `curl` health check is silently ignored. Under time pressure, the team chose speed over caution. The result: the deployment continues to the next phase with unhealthy services.
Footgun #2: Missing set -euo pipefail. The script has no error handling, so a failed `curl` health check is silently ignored and the deployment continues to the next phase with unhealthy services.
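One way the health check could be made to stop the deployment is sketched below. The function names `health_check` and `deploy_phase` and the URL are hypothetical, and the address is chosen so that the request fails. Note that strict mode alone may not be enough here: by default `curl` exits 0 on an HTTP 4xx or 5xx response, so the sketch also passes `--fail`.

```shell
#!/usr/bin/env bash
# -e: exit on an unhandled error; -u: unset variables are errors;
# -o pipefail: a pipeline fails if any stage fails; -E: functions inherit the ERR trap.
set -Eeuo pipefail
trap 'echo "deploy aborted near line $LINENO" >&2' ERR

# Hypothetical health check. --fail makes curl exit non-zero on HTTP errors,
# and --max-time bounds how long an unresponsive host can stall the script.
health_check() {
  curl --fail --silent --show-error --max-time 5 --output /dev/null "$1"
}

# Hypothetical deployment phase that refuses to proceed on a failed check.
deploy_phase() {
  local url="$1"
  if ! health_check "$url"; then
    echo "health check failed for $url; stopping" >&2
    return 1
  fi
  echo "healthy: $url"
}

# Demo against an address where nothing is expected to be listening.
if deploy_phase "http://127.0.0.1:1/health"; then
  phase_result="continued"
else
  phase_result="halted"
fi
echo "phase result: $phase_result"
```

The explicit `if` is deliberate: `set -e` does not apply to commands tested in a condition, so the failure is handled by the branch rather than by the shell exiting.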
The first mistake is still invisible, making the next shortcut feel justified.
Hour 2: Pipe Swallows Exit Codes
The script chains `grep | wc -l` to check readiness, but a pipeline's exit status is that of its last command, so only `wc` is checked. Nobody pushed back because the shortcut looked harmless in the moment. The result: the check reports success even when `grep` found zero matches.
Footgun #3: Pipe Swallows Exit Codes. Chaining `grep | wc -l` to check readiness only checks the exit code of `wc`, so the check reports success even when `grep` found zero matches.
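The sketch below contrasts the misleading check with three alternatives: reading `PIPESTATUS`, enabling `pipefail`, and testing `grep` directly. The log contents and variable names are hypothetical stand-ins for whatever the real readiness probe reads.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical status log in which no service has reported "ready" yet.
status_log="$(mktemp)"
trap 'rm -f -- "$status_log"' EXIT
printf 'web-1 starting\nweb-2 starting\n' > "$status_log"

# Misleading: without pipefail, the pipeline's status is wc's status.
grep 'ready' "$status_log" | wc -l > /dev/null
naive_status=$?

# PIPESTATUS holds every stage's status; read it right after the pipeline.
grep 'ready' "$status_log" | wc -l > /dev/null
grep_status="${PIPESTATUS[0]}"

# With pipefail, a failing stage makes the whole pipeline fail.
set -o pipefail
if grep 'ready' "$status_log" | wc -l > /dev/null; then
  pipefail_result="passed"
else
  pipefail_result="failed"
fi

# Simplest: skip the pipe and test grep directly.
if grep -q 'ready' "$status_log"; then
  readiness="ready"
else
  readiness="not ready"
fi

# prints: naive=0 grep=1 pipefail=failed readiness=not ready
echo "naive=$naive_status grep=$grep_status pipefail=$pipefail_result readiness=$readiness"
```

For a yes/no readiness question, `grep -q` in a condition is usually the clearest of the three, since no pipeline is involved at all.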
Pressure is mounting. The team is behind schedule and cutting more corners.
Hour 3: Subshell Variable Scope
The script sets a flag inside a `while read` loop that is fed by a pipe. Bash runs that loop in a subshell, so the variable never propagates back to the parent shell. The team had gotten away with similar shortcuts before, so nobody raised a flag. The result: the rollback condition never triggers even though the detection logic is correct.
Footgun #4: Subshell Variable Scope. Setting a flag inside a `while read` loop piped from another command means the variable never propagates, so the rollback condition never triggers even though the detection logic is correct.
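A minimal sketch of the scoping problem and one common fix, process substitution. The `report_status` function and its output format are hypothetical stand-ins for the command that feeds the loop.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the command that reports per-host status.
report_status() { printf 'web-1 ok\nweb-2 failed\nweb-3 ok\n'; }

# Broken: by default Bash runs each pipeline stage in a subshell, so the
# assignment inside the loop is lost when the pipeline finishes.
needs_rollback_piped=0
report_status | while read -r host state; do
  if [ "$state" = "failed" ]; then needs_rollback_piped=1; fi
done

# Fixed: process substitution feeds the loop through a redirection, so the
# loop body runs in the current shell and the assignment survives.
needs_rollback=0
while read -r host state; do
  if [ "$state" = "failed" ]; then needs_rollback=1; fi
done < <(report_status)

# prints: piped=0 fixed=1
echo "piped=$needs_rollback_piped fixed=$needs_rollback"
```

Both loops contain identical detection logic; only the way input reaches the loop differs, which is why this bug tends to survive code review.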
By hour 3, the compounding failures have reached critical mass. Pages fire. The war room fills up. The team scrambles to understand what went wrong while the system burns.
The Postmortem
Root Cause Chain
| # | Mistake | Consequence | Could Have Been Prevented By |
|---|---|---|---|
| 1 | Unquoted Variables Wreck Paths | Word splitting on a path with spaces deletes unintended directories | Primer: Variable quoting and shellcheck |
| 2 | Missing set -euo pipefail | Deployment continues to next phase with unhealthy services | Primer: Strict mode and error trapping |
| 3 | Pipe Swallows Exit Codes | Reports success even when grep found zero matches | Primer: `pipefail` and explicit exit code checks |
| 4 | Subshell Variable Scope | Rollback condition never triggers despite detection logic being correct | Primer: Process substitution and variable scoping rules |
Damage Report
- Downtime: 1-3 hours of server or service unavailability
- Data loss: Risk of filesystem corruption or configuration loss
- Customer impact: Service errors if the affected server hosts customer-facing workloads
- Engineering time to remediate: 6-12 engineer-hours for diagnosis, repair, and verification
- Reputation cost: Ops team confidence shaken; runbook updates required
What the Primer Teaches
- Footgun #1: Had the engineer read the primer's section on unquoted variables, they would have learned to quote variable expansions and to lint scripts with shellcheck.
- Footgun #2: Had the engineer read the primer's section on `set -euo pipefail`, they would have learned strict mode and error trapping.
- Footgun #3: Had the engineer read the primer's section on pipes and exit codes, they would have learned `pipefail` and explicit exit code checks.
- Footgun #4: Had the engineer read the primer's section on subshell variable scope, they would have learned process substitution and the variable scoping rules.
Cross-References
- Primer — The right way
- Footguns — The mistakes catalogued
- Street Ops — How to do it in practice