Anti-Primer: Linux Users And Permissions

Everything that can go wrong, will — and in this story, it does.

The Setup

A sysadmin is performing a critical users-and-permissions task on a production Linux server at 11 PM. The server hosts a customer-facing application with a strict SLA. The task was supposed to be routine, but conditions on the server are not what the runbook assumed.

The Timeline

Hour 0: Running as Root Unnecessarily

The sysadmin SSHs in as root directly instead of using sudo for specific commands. The deadline was looming, and this seemed like the fastest path forward. The result: a typo in a command path damages system files that a regular user could not have touched.

Footgun #1: Running as Root Unnecessarily. Logging in over SSH as root, instead of using sudo for specific commands, lets a typo in a command path damage system files that a regular user could not have touched.
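One way to catch this footgun before the late-night session is to check whether the server accepts direct root SSH logins at all. The following is a minimal sketch, not a complete audit: `root_login_setting` is a hypothetical helper name, and it assumes a single sshd_config-style file with no `Include` or `Match` overrides. On a real host, `sshd -T` reports the effective configuration and is the more reliable source.

```shell
# Sketch: print the PermitRootLogin value from an sshd_config-style file.
# root_login_setting is a hypothetical helper, not part of OpenSSH.
# Assumption: no Include files or Match blocks override the value.
root_login_setting() {
    config_file="$1"
    # Commented-out lines do not match because their first field starts with "#".
    # sshd uses the first value it sees for a keyword, so stop at the first match.
    awk 'tolower($1) == "permitrootlogin" { print tolower($2); exit }' "$config_file"
}

# Usage (conventional default path; adjust for your distribution):
#   root_login_setting /etc/ssh/sshd_config
# Empty output means the keyword is unset and the sshd default applies.
```

Any value other than `no` means root can still log in over SSH in some form, which is the condition the primer recommends removing.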

Nobody notices yet. The engineer moves on to the next task.

Hour 1: No Backup Before Change

The sysadmin modifies a configuration file in place without creating a backup copy first. Under time pressure, the team chose speed over caution. The result: the new configuration is wrong, the original content is lost, and the file has to be reconstructed by hand.

Footgun #2: No Backup Before Change. Modifying a configuration file in place without a backup copy means that when the new configuration is wrong, the original content is lost and has to be reconstructed by hand.
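The backup step this footgun skips takes one command. As a sketch under stated assumptions, the hypothetical `backup_file` helper below copies a file to a timestamped sibling before any edit; the naming scheme is an illustration, not a convention from the primer.

```shell
# Sketch: copy a file to a timestamped backup next to it and print the backup path.
# backup_file and the ".<timestamp>.bak" suffix are hypothetical choices.
backup_file() {
    target="$1"
    stamp="$(date +%Y%m%d-%H%M%S)"
    backup="${target}.${stamp}.bak"
    # -p preserves mode and timestamps (and ownership where permitted).
    cp -p -- "$target" "$backup" || return 1
    printf '%s\n' "$backup"
}

# Usage:
#   backup_file /path/to/app.conf    # then edit the original
```

A timestamped name avoids overwriting an earlier backup when the same file is edited twice in one night. Keeping configuration under version control, as the primer also suggests, gives the same safety with history.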

The first mistake is still invisible, making the next shortcut feel justified.

Hour 2: Ignoring Disk Space

The sysadmin does not check available disk space before a large operation. Nobody pushed back because the shortcut looked harmless in the moment. The result: the operation fills the filesystem to 100%, logs stop writing, and the application crashes.

Footgun #3: Ignoring Disk Space. Skipping the disk space check before a large operation lets the operation fill the filesystem to 100%, at which point logs stop writing and the application crashes.
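The check that was skipped here can be scripted so it runs before the large operation rather than relying on someone remembering `df -h`. This is a rough sketch: `available_kb` and `has_space_for` are hypothetical helper names, and the caller is assumed to know roughly how many kilobytes the operation will write.

```shell
# Sketch: refuse to start an operation unless the filesystem has enough free space.
# available_kb and has_space_for are hypothetical helpers.
available_kb() {
    # -P forces the POSIX one-line-per-filesystem layout; -k reports 1024-byte blocks.
    # Column 4 of that layout is the space available to unprivileged users.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

has_space_for() {
    path="$1"
    needed_kb="$2"
    avail="$(available_kb "$path")"
    # Fail closed if df produced nothing (bad path, unreadable mount).
    [ -n "$avail" ] && [ "$avail" -ge "$needed_kb" ]
}

# Usage:
#   has_space_for /var/lib/app 5242880 || { echo "not enough space" >&2; exit 1; }
```

Using `df -Pk` instead of `df -h` keeps the output numeric and stable, so a script can compare it against a threshold.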

Pressure is mounting. The team is behind schedule and cutting more corners.

Hour 3: Wrong Target in Destructive Command

The sysadmin runs a destructive command (rm, dd, mkfs) on the wrong device or path because of a typo. The team had gotten away with similar shortcuts before, so nobody raised a flag. The result: production data is destroyed, and recovery requires restoring from backup (if one exists).

Footgun #4: Wrong Target in Destructive Command. A typo sends a destructive command (rm, dd, mkfs) to the wrong device or path, production data is destroyed, and recovery requires restoring from backup (if one exists).
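`rm` has no `--dry-run` flag of its own, so the double-check has to come from a wrapper or from habit. The sketch below shows one possible guard: `guarded_rm`, the `CONFIRM` variable, and the protected-path list are all hypothetical, and it assumes `realpath` from GNU coreutils is installed. It only previews the deletion unless explicitly confirmed.

```shell
# Sketch: a delete wrapper that defaults to a dry run and refuses key system paths.
# guarded_rm, CONFIRM, and PROTECTED_PATHS are hypothetical names for illustration.
PROTECTED_PATHS="/ /etc /usr /var /home /boot"

guarded_rm() {
    target="$1"
    # Resolve symlinks and relative segments so the check sees the real path.
    resolved="$(realpath -- "$target")" || return 1
    for p in $PROTECTED_PATHS; do
        if [ "$resolved" = "$p" ]; then
            echo "refusing to delete protected path: $resolved" >&2
            return 1
        fi
    done
    if [ "${CONFIRM:-no}" != "yes" ]; then
        # Dry run by default: show what would be removed, delete nothing.
        echo "would remove: $resolved"
        return 0
    fi
    rm -r -- "$resolved"
}

# Usage:
#   guarded_rm ./old-releases              # preview only
#   CONFIRM=yes guarded_rm ./old-releases  # actually delete
```

Printing the resolved path makes a mistyped relative path or an unexpected symlink visible before anything is removed. The same preview-then-confirm pattern applies to `dd` and `mkfs`, where checking the device with `lsblk` first serves the same purpose.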

By hour 3, the compounding failures have reached critical mass. Pages fire. The war room fills up. The team scrambles to understand what went wrong while the system burns.

The Postmortem

Root Cause Chain

| # | Mistake | Consequence | Could Have Been Prevented By |
|---|---------|-------------|------------------------------|
| 1 | Running as Root Unnecessarily | A typo in a command path causes unintended damage to system files that a regular user could not have touched | Primer: Use sudo for specific commands; disable direct root SSH login |
| 2 | No Backup Before Change | The new configuration is wrong; the original content is lost; manual reconstruction required | Primer: Always cp file file.bak before editing; use version control for config files |
| 3 | Ignoring Disk Space | Operation fills the filesystem to 100%; logs stop writing; the application crashes | Primer: Check df -h before any operation that writes significant data |
| 4 | Wrong Target in Destructive Command | Production data is destroyed; recovery requires restoring from backup (if one exists) | Primer: Double-check targets for destructive commands; use --dry-run where available |

Damage Report

  • Downtime: 1-3 hours of server or service unavailability
  • Data loss: Risk of filesystem corruption or configuration loss
  • Customer impact: Service errors if the affected server hosts customer-facing workloads
  • Engineering time to remediate: 6-12 engineer-hours for diagnosis, repair, and verification
  • Reputation cost: Ops team confidence shaken; runbook updates required

What the Primer Teaches

  • Footgun #1: The primer's section on running as root unnecessarily would have taught the engineer to use sudo for specific commands and to disable direct root SSH login.
  • Footgun #2: The primer's section on backups before changes would have taught the engineer to always cp file file.bak before editing and to keep config files under version control.
  • Footgun #3: The primer's section on disk space would have taught the engineer to check df -h before any operation that writes significant data.
  • Footgun #4: The primer's section on destructive commands would have taught the engineer to double-check targets and to use --dry-run where available.

Cross-References