GrokDevOps Wiki

Observability Domain

Domain guide: a curated path through all monitoring, alerting, logging, and tracing content, organized as a learning sequence.

Topics

Resource	Level	Description
Monitoring Fundamentals	L0	Core concepts: metrics, logs, traces, SLIs/SLOs
Alerting Rules	L1	Writing and tuning alerts that don't page at 3 AM
Log Pipelines	L1	Collecting, shipping, and querying logs at scale
Observability Deep Dive	L2	Cardinality, tail sampling, correlation strategies
OpenTelemetry	L2	OTel SDK, Collector, auto-instrumentation
Monitoring Migration	L2	Moving from Nagios/Zabbix to Prometheus/Grafana

Runbooks

Resource	Description
Prometheus Target Down	When targets drop from scrape
Loki No Logs	Log pipeline is silent
Tempo No Traces	Traces not arriving

Practice

Resource	Type
Observability Cheatsheet	Quick reference
Observability Architecture Guide	Full architecture walkthrough
Skillcheck: Observability	Self-assessment
Skillcheck: Alerting Rules	Self-assessment

Where to Start

Begin with Monitoring Fundamentals to build vocabulary, then work through Alerting Rules and Log Pipelines. Observability Deep Dive and OpenTelemetry are L2 topics that assume you have already run a Prometheus + Grafana stack.
