Quiz: On-Call & Incident Command
4 questions
L0 (1 question)
1. What is the role of an Incident Commander (IC) during a production outage?
Show answer
The IC coordinates the response: declares severity, opens the war room, assigns roles (comms lead, technical lead), sets investigation direction, approves risky actions, calls for status updates, and declares resolution. The IC does NOT debug the problem directly.
L1 (1 question)
1. What is the difference between SEV-1 and SEV-2 incidents, and how does the response differ?
Show answer
SEV-1 means major customer impact (service down or data loss) requiring all-hands, war room, and exec notification. SEV-2 means significant degradation for many users, requiring on-call team plus service owners. SEV-1 gets statuspage + customer email; SEV-2 gets a statuspage update. *Common mistake:* People often over-classify incidents as SEV-1 when they are really SEV-2 (degraded but not down).
L2 (1 question)
1. Your on-call rotation has 4 engineers and one of them gets paged 3x more than others. What systemic issues might cause this and how do you fix it?
Show answer
Likely causes: (1) noisy alerts for services they own, (2) alert routing rules that over-map to one person, (3) follow-the-sun gaps, (4) uneven service ownership. Fix: audit alert frequency per person, tune noisy alerts, rebalance service ownership across the team, ensure fair rotation scheduling.
L3 (1 question)
1. During a SEV-1, the IC assigns you as technical lead. You discover the root cause is a bad deploy, but rolling back will cause 5 minutes of additional downtime. The current state is degraded but serving 60% of traffic. What framework do you use to decide?