Observability Domain
Domain guide: a curated learning path through all observability content, covering monitoring, alerting, logging, and tracing.
Topics
| Resource | Level | Description |
|---|---|---|
| Monitoring Fundamentals | L0 | Core concepts: metrics, logs, traces, SLIs/SLOs |
| Alerting Rules | L1 | Writing and tuning alerts that don't page at 3 AM |
| Log Pipelines | L1 | Collecting, shipping, and querying logs at scale |
| Observability Deep Dive | L2 | Cardinality, tail sampling, correlation strategies |
| OpenTelemetry | L2 | OTel SDK, Collector, auto-instrumentation |
| Monitoring Migration | L2 | Moving from Nagios/Zabbix to Prometheus/Grafana |
Runbooks
| Resource | Description |
|---|---|
| Prometheus Target Down | Targets have dropped out of scraping |
| Loki No Logs | Log pipeline is silent |
| Tempo No Traces | Traces not arriving |
Practice
| Resource | Type |
|---|---|
| Observability Cheatsheet | Quick reference |
| Observability Architecture Guide | Full architecture walkthrough |
| Skillcheck: Observability | Self-assessment |
| Skillcheck: Alerting Rules | Self-assessment |
Where to Start
Begin with Monitoring Fundamentals to build vocabulary, then work through Alerting Rules and Log Pipelines. Observability Deep Dive and OpenTelemetry are L2 topics and assume you have already run a Prometheus + Grafana stack.