Portal | Level: L2: Operations | Topics: On-Call & Incident Command, Alerting Rules | Domain: DevOps & Tooling
On-Call
On-call is the practice of having designated engineers available to respond to production incidents outside of normal working hours. Effective on-call requires clear escalation policies, well-maintained runbooks, actionable alerts, and a culture that respects responder well-being.
Understanding when to page, how to triage severity, and how to hand off incidents are foundational skills for any engineer participating in an on-call rotation.
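As a rough illustration of "when to page" and "how to triage severity", the sketch below encodes one possible policy. It does not describe any specific tool or this wiki's own policy. The severity levels, the `actionable` flag, the 15- and 30-minute thresholds, and the tier names are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical levels for illustration; lower number = more urgent.
    SEV1 = 1  # e.g. customer-facing outage
    SEV2 = 2  # e.g. degraded service
    SEV3 = 3  # e.g. minor issue with no user impact


@dataclass(frozen=True)
class Alert:
    name: str
    severity: Severity
    actionable: bool  # can a responder do something about it right now?


def should_page(alert: Alert, business_hours: bool) -> bool:
    """Decide whether an alert should interrupt the on-call engineer.

    Assumed policy: non-actionable alerts never page, SEV1 always pages,
    SEV2 pages only during business hours, SEV3 never pages.
    """
    if not alert.actionable:
        return False
    if alert.severity == Severity.SEV1:
        return True
    if alert.severity == Severity.SEV2:
        return business_hours
    return False


def escalation_target(minutes_unacknowledged: int) -> str:
    """Pick who to notify based on how long a page has gone unacknowledged.

    The thresholds and tier names are assumptions, not a standard.
    """
    if minutes_unacknowledged < 15:
        return "primary"
    if minutes_unacknowledged < 30:
        return "secondary"
    return "incident-commander"
```

Keeping the paging decision separate from the escalation lookup means each rule can be reviewed and changed on its own, which also helps keep runbooks and alert definitions in sync with the policy.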
See Also
- Incident Triage
- Incident Psychology
- SRE Practices
- Alerting Rules
- Decision Tree: Should I Page Someone?
Pages that link here
- Alerting Rules - Skill Check
- Alerting Rules Drills
- Decision Tree: Should I Page Someone?
- Incident Command & On-Call
- Runbook Craft
- Runbook: Alert Storm (Flapping / Too Many Alerts)
- SRE Practices
- SRE Practices - Primer
- Symptoms: Alert Storm, Caused by Flapping Health Checks, Fix Is Probe Tuning
- The Psychology of Incidents
Wiki Navigation
Related Content
- Alerting Flashcards (CLI) (flashcard_deck, L1) — Alerting Rules
- Alerting Rules (Topic Pack, L2) — Alerting Rules
- Alerting Rules Drills (Drill, L2) — Alerting Rules
- Case Study: Alert Storm — Flapping Health Checks (Case Study, L2) — Alerting Rules
- Incident Command & On-Call (Topic Pack, L2) — On-Call & Incident Command
- On Call Flashcards (CLI) (flashcard_deck, L1) — On-Call & Incident Command
- Runbook Craft (Topic Pack, L1) — On-Call & Incident Command
- Runbook: Alert Storm (Flapping / Too Many Alerts) (Runbook, L2) — Alerting Rules
- SRE Practices (Topic Pack, L2) — Alerting Rules
- Skillcheck: Alerting Rules (Assessment, L2) — Alerting Rules