Prometheus
Domain: Observability | Tier: 1
Also known as: metrics, promql, service-monitor, alertmanager, monitoring, prometheusstack, datadog
📘 Topic Packs
| Asset | Level | Time | Status |
|---|---|---|---|
| Alerting Rules | L2 | 4h | solid |
| Capacity Planning | L2 | 4h | solid |
| Monitoring Fundamentals | L1 | 5h | solid |
| Monitoring Migration (Legacy to Modern) | L2 | 5h | solid |
| Observability Deep Dive | L2 | 6h | solid |
| OpenTelemetry | L2 | 5h | solid |
| Prometheus Deep Dive | L2 | 5h | solid |
🏋️ Exercise Sets
| Asset | Level | Time | Status |
|---|---|---|---|
| Incident Simulator (18 scenarios) (CLI) | L2 | 4h | solid |
🔬 Labs
| Asset | Level | Time | Status |
|---|---|---|---|
| Lab: Prometheus Target Down (CLI) | L2 | 30m | solid |
⚡ Drills
| Asset | Level | Time | Status |
|---|---|---|---|
| Alerting Rules Drills | L2 | 30m | solid |
| Observability Drills | L2 | 30m | solid |
| PromQL Drills | L2 | 30m | solid |
🎯 Scenarios
| Asset | Level | Time | Status |
|---|---|---|---|
| Adversarial Interview Gauntlet (30 sequences) | L2 | 15h | solid |
| Interview: Prometheus Target Down | L2 | 15m | solid |
📋 Runbooks
| Asset | Level | Time | Status |
|---|---|---|---|
| Runbook: Alert Storm (Flapping / Too Many Alerts) | L2 | 15m | solid |
| Runbook: Prometheus Target Down | L1 | 15m | solid |
📝 Assessments
| Asset | Level | Time | Status |
|---|---|---|---|
| Skillcheck: Alerting Rules | L2 | 30m | solid |
| Skillcheck: Observability | L2 | 30m | solid |
📎 References
| Asset | Level | Time | Status |
|---|---|---|---|
| Observability Architecture | L2 | 30m | solid |
| Track: Observability | L2 | 10h | solid |
🔍 Case Studies
| Asset | Level | Time | Status |
|---|---|---|---|
| Case Study: Disk Full — Runaway Logs, Fix Is Loki Retention | L2 | 30m | solid |
| Case Study: Grafana Dashboard Empty — Prometheus Blocked by NetworkPolicy | L2 | 30m | solid |
| Ops Archaeology: The Alerts That Stopped Firing | L2 | 30m | solid |
| Ops Archaeology: The Slow Death Nobody Noticed | L2 | 30m | solid |
Related Topics
- Grafana (7 shared assets)
- Loki (6 shared assets)
- Tempo (4 shared assets)
- Alerting Rules (4 shared assets)
- Linux Fundamentals (3 shared assets)
- Monitoring Fundamentals (2 shared assets)
- Kubernetes Core (2 shared assets)
- Monitoring Migration (1 shared asset)
- Capacity Planning (1 shared asset)
- SRE Practices (1 shared asset)