Observability

53 assets across 16 topics

Prometheus, Grafana, Loki, alerting, tracing, OpenTelemetry, SLOs


Core Skills (Tier 1)

Start here — these are the topics a mid-level engineer uses weekly.


All Content

📘 Topic Packs

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Alerting Rules | Alerting Rules, Prometheus | L2 | 4h | solid |
| Capacity Planning | Prometheus | L2 | 4h | solid |
| Continuous Profiling | Continuous Profiling | L2 | 4h | solid |
| Linux Logging | Logging | L1 | 4h | solid |
| Log Pipelines | Log Pipelines, Logging, Loki | L2 | 5h | solid |
| Mental Models (Core Concepts) | Observability Deep Dive | L0 | 4h | solid |
| Monitoring Fundamentals | Grafana, Monitoring Fundamentals, Prometheus | L1 | 5h | solid |
| Monitoring Migration (Legacy to Modern) | Grafana, Monitoring Fundamentals, Monitoring Migration | L2 | 5h | solid |
| Observability Deep Dive | Grafana, Loki, Prometheus | L2 | 6h | solid |
| On-Call | Alerting Rules | L2 | 1h | stub |
| OpenTelemetry | OpenTelemetry, Prometheus, Tracing | L2 | 5h | solid |
| Prometheus Deep Dive | Prometheus, Prometheus Deep Dive | L2 | 5h | solid |
| SLO Tooling | SLO Tooling | L2 | 4h | solid |
| SRE Practices | Alerting Rules | L2 | 5h | solid |
| Synthetic Monitoring | Synthetic Monitoring | L1 | 3h | solid |
| Tracing | Tracing | L1 | 2h | solid |
| perf Profiling | Tracing | L2 | 2h | solid |
| strace | Tracing | L1 | 2h | solid |

🏋️ Exercise Sets

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Incident Simulator (18 scenarios) (CLI) | Loki, Prometheus | L2 | 4h | solid |

🔬 Labs

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Lab: Loki No Logs (CLI) | Loki | L2 | 30m | solid |
| Lab: Prometheus Target Down (CLI) | Grafana, Prometheus | L2 | 30m | solid |

⚡ Drills

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Alerting Rules Drills | Alerting Rules, Prometheus | L2 | 30m | solid |
| LogQL Drills | Loki | L2 | 30m | solid |
| Observability Drills | Loki, Prometheus | L2 | 30m | solid |
| PromQL Drills | Prometheus | L2 | 30m | solid |

🎯 Scenarios

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Adversarial Interview Gauntlet (30 sequences) | Prometheus | L2 | 15h | solid |
| Interview: Loki Logs Disappeared | Loki | L2 | 15m | solid |
| Interview: Prometheus Target Down | Prometheus | L2 | 15m | solid |

📋 Runbooks

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Runbook: Alert Storm (Flapping / Too Many Alerts) | Alerting Rules, Prometheus | L2 | 15m | solid |
| Runbook: Grafana Dashboard Blank / No Data | Grafana | L1 | 15m | solid |
| Runbook: Log Pipeline Backpressure / Logs Not Appearing | Log Pipelines, Loki | L2 | 15m | solid |
| Runbook: Loki No Logs | Loki | L2 | 15m | solid |
| Runbook: Prometheus Target Down | Prometheus | L1 | 15m | solid |
| Runbook: Tempo No Traces | Tempo | L2 | 15m | solid |

📝 Assessments

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Skillcheck: Alerting Rules | Alerting Rules, Prometheus | L2 | 30m | solid |
| Skillcheck: Observability | Grafana, Loki, Prometheus | L2 | 30m | solid |

📎 References

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Observability Architecture | Grafana, Loki, Prometheus | L2 | 30m | solid |
| Track: Observability | Grafana, Loki, Prometheus | L2 | 10h | solid |

🔍 Case Studies

| Asset | Topics | Level | Time | Status |
| --- | --- | --- | --- | --- |
| Case Study: Alert Storm — Flapping Health Checks | Alerting Rules | L2 | 30m | solid |
| Case Study: Disk Full — Runaway Logs, Fix Is Loki Retention | Prometheus | L2 | 30m | solid |
| Case Study: Grafana Dashboard Empty — Prometheus Blocked by NetworkPolicy | Prometheus | L2 | 30m | solid |
| Ops Archaeology: The Alerts That Stopped Firing | Prometheus | L2 | 30m | solid |
| Ops Archaeology: The Slow Death Nobody Noticed | Prometheus | L2 | 30m | solid |