Portal | Level: L2: Operations | Topics: Prometheus, Grafana, Loki, Tempo | Domain: Observability

Track: Observability

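This track covers the Prometheus, Loki, Tempo, and Grafana stack: Prometheus for metrics, Loki for logs, Tempo for traces, and Grafana for viewing all three.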

Goals

  • Understand the three pillars (metrics, logs, traces)
  • Configure and troubleshoot Prometheus scrape targets
  • Debug the log pipeline (Promtail -> Loki -> Grafana)
  • Understand trace propagation (OpenTelemetry -> Tempo)
  • Use Grafana for correlation across signals
  • Know SLI/SLO concepts and alerting fundamentals
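
The SLI/SLO goal above comes down to a small piece of arithmetic, sketched here in Python. The function names and the request counts are hypothetical and chosen only for illustration. The SLI is the ratio of good events to total events, and the error budget is whatever failure ratio the SLO target still allows.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Fraction of events that met the success criterion (the SLI)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Share of the error budget still unspent, as a fraction of the budget.

    1.0 means untouched, 0.0 means fully spent, negative means the SLO is breached.
    """
    budget = 1.0 - slo_target  # failure ratio the SLO allows
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    burned = 1.0 - sli  # failure ratio actually observed
    return 1.0 - burned / budget


# Hypothetical numbers: 1,000,000 requests, 400 failures, 99.9% target.
sli = availability_sli(good_events=999_600, total_events=1_000_000)
remaining = error_budget_remaining(sli, slo_target=0.999)
print(f"SLI={sli:.4%} budget remaining={remaining:.0%}")
```

With these numbers the observed failure ratio (0.04%) uses about 40% of the allowed 0.1%, so roughly 60% of the budget is left. An alert on budget burn would typically fire well before the remaining share reaches zero.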

Prerequisites

  • Kubernetes concepts: Service, Deployment, DaemonSet
  • make deploy-all has completed (observability stack running)
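
One way to confirm the "observability stack running" prerequisite is to inspect the JSON form of the pod list. The sketch below assumes the input has the shape of `kubectl get pods -n monitoring -o json` output. The helper name and the pod names in the sample are hypothetical.

```python
def not_ready_pods(pod_list: dict) -> list[str]:
    """Return names of pods that are not Running with every container ready.

    Note: pods from finished Jobs (phase Succeeded) are also reported here,
    which may or may not be what you want.
    """
    problems = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        all_ready = bool(containers) and all(c.get("ready", False) for c in containers)
        if status.get("phase") != "Running" or not all_ready:
            problems.append(pod["metadata"]["name"])
    return problems


# Hypothetical sample with the same shape as the kubectl JSON output.
sample = {
    "items": [
        {"metadata": {"name": "prometheus-0"},
         "status": {"phase": "Running", "containerStatuses": [{"ready": True}]}},
        {"metadata": {"name": "loki-0"},
         "status": {"phase": "Running", "containerStatuses": [{"ready": False}]}},
        {"metadata": {"name": "tempo-0"},
         "status": {"phase": "Pending"}},
    ]
}
print(not_ready_pods(sample))
```

Against a live cluster, the sample dictionary could be replaced with `json.load(sys.stdin)` and the kubectl output piped in. An empty result suggests the stack is up.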

Primary Path (12 steps)

  1. Read: training/library/skillchecks/observability.skillcheck.md — three pillars mental model
  2. Read: devops/docs/observability.md — stack architecture
  3. Run: kubectl get pods -n monitoring — verify stack is running
  4. Run: kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80 — access Grafana
  5. Study: devops/observability/values/values-prometheus.yaml — Prometheus config
  6. Study: ServiceMonitor: kubectl get servicemonitor -n grokdevops -o yaml
  7. Lab: training/interactive/runtime-labs/lab-runtime-03-observability-target-down/ — break/fix Prometheus target
  8. Read: training/library/runbooks/prometheus_target_down.md — triage procedure
  9. Lab: training/interactive/runtime-labs/lab-runtime-04-loki-no-logs/ — break/fix log pipeline
  10. Read: training/library/runbooks/observability/loki_no_logs.md — log pipeline triage
  11. Read: training/library/runbooks/observability/tempo_no_traces.md — tracing triage
  12. Study: training/knowledge_architecture/commands/observability_debugging_flow.md — decision tree
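
As a companion to the target-down lab and runbook in the steps above, the following sketch filters a Prometheus targets response for unhealthy scrape targets. It assumes the payload comes from the Prometheus HTTP API endpoint `/api/v1/targets`. The helper name, job names, and addresses in the sample are hypothetical and are not taken from the lab itself.

```python
def down_targets(targets_response: dict) -> list[dict]:
    """Extract job, instance, and last error for every active target that is not up."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return [
        {
            "job": t.get("labels", {}).get("job", ""),
            "instance": t.get("labels", {}).get("instance", ""),
            "error": t.get("lastError", ""),
        }
        for t in active
        if t.get("health") != "up"  # health is "up", "down", or "unknown"
    ]


# Hypothetical payload shaped like a /api/v1/targets response.
sample_response = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "node-exporter", "instance": "10.0.0.5:9100"},
             "health": "up", "lastError": ""},
            {"labels": {"job": "demo-app", "instance": "10.0.0.9:8080"},
             "health": "down", "lastError": "connection refused"},
        ],
        "droppedTargets": [],
    },
}

for target in down_targets(sample_response):
    print(f'{target["job"]} {target["instance"]}: {target["error"]}')
```

A target that shows up here with an error usually points at the pod, port, or metrics path. A target missing from the response altogether more likely points at service discovery, for example a ServiceMonitor whose selector does not match the Service labels.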

Optional Deepening

