Observability Domain

Domain guide: browse all observability content organized as a learning sequence.

Curated path through monitoring, alerting, logging, and tracing content.

Topics

Resource                   Level  Description
Monitoring Fundamentals    L0     Core concepts: metrics, logs, traces, SLIs/SLOs
Alerting Rules             L1     Writing and tuning alerts that don't page at 3 AM
Log Pipelines              L1     Collecting, shipping, and querying logs at scale
Observability Deep Dive    L2     Cardinality, tail sampling, correlation strategies
OpenTelemetry              L2     OTel SDK, Collector, auto-instrumentation
Monitoring Migration       L2     Moving from Nagios/Zabbix to Prometheus/Grafana

Runbooks

Resource                   Description
Prometheus Target Down     When targets drop from scrape
Loki No Logs               Log pipeline is silent
Tempo No Traces            Traces not arriving

Practice

Resource                           Type
Observability Cheatsheet           Quick reference
Observability Architecture Guide   Full architecture walkthrough
Skillcheck: Observability          Self-assessment
Skillcheck: Alerting Rules         Self-assessment

Where to Start

Begin with Monitoring Fundamentals to build vocabulary, then work through Alerting Rules and Log Pipelines. Observability Deep Dive and OpenTelemetry are L2 topics and assume you have already run a Prometheus + Grafana stack.