Troubleshooting
Application Issues
Pod stuck in CrashLoopBackOff
kubectl logs -n grokdevops deploy/grokdevops --previous
kubectl describe pod -n grokdevops -l app.kubernetes.io/name=grokdevops
Common causes:
- Image not imported into k3s (docker save <image> | sudo k3s ctr images import -)
- Wrong image tag in values file
- Port conflict
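The first cause above can be checked before redeploying. The helpers below are a minimal sketch: the function names are made up for illustration, and any image reference you pass (for example `grokdevops:dev`) is a hypothetical placeholder for the tag in your values file.

```shell
# Succeeds if an image reference containing "$1" exists in k3s's containerd store.
# The reference is supplied by the caller; nothing here is taken from the project.
image_in_k3s() {
  # -q prints image references only; grep -F matches the string literally.
  sudo k3s ctr images ls -q | grep -F -- "$1"
}

# Imports a locally built Docker image into k3s (same pipeline as in the list above).
import_into_k3s() {
  docker save "$1" | sudo k3s ctr images import -
}
```

A possible invocation is `image_in_k3s grokdevops:dev || import_into_k3s grokdevops:dev`, which imports the image only when it is missing.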
/metrics returns empty or errors
Verify prometheus-client is installed in the container:
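One way to do this is to try importing the library inside the running container. This is a sketch, assuming the image has a `python` interpreter on its PATH (adjust to `python3` or a virtualenv path if it does not); the function name is illustrative only.

```shell
# Assumption: the container image provides `python` on PATH.
# Namespace and deployment name match the kubectl commands earlier on this page.
check_prometheus_client() {
  kubectl exec -n grokdevops deploy/grokdevops -- \
    python -c 'import prometheus_client; print("prometheus_client import OK")'
}
```

If the import fails with `ModuleNotFoundError`, the package is missing from the image and needs to be added to its dependencies.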
Test the endpoint directly:
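A port-forward plus `curl` is usually enough for this. The sketch below assumes the application serves `/metrics` on container port 8000, which is a guess and not taken from the project; pass the real port as the first argument.

```shell
# Assumption: /metrics is served on container port 8000 unless another port is passed as $1.
probe_metrics() {
  port="${1:-8000}"
  # Forward a local port to the deployment in the background.
  kubectl port-forward -n grokdevops deploy/grokdevops "${port}:${port}" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2  # give the tunnel a moment to come up
  # -f makes curl exit non-zero on HTTP errors instead of printing an error page.
  out=$(curl -fsS "http://localhost:${port}/metrics")
  status=$?
  kill "$pf_pid" 2>/dev/null
  # Show only the first lines; a healthy endpoint prints "# HELP" / "# TYPE" lines.
  printf '%s\n' "$out" | head -n 20
  return "$status"
}
```

A non-zero exit status means the request failed, which separates "endpoint unreachable" from "endpoint reachable but empty".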
Observability Issues
ServiceMonitor not picked up by Prometheus
- Check the ServiceMonitor exists:
- Verify Prometheus is configured to watch all namespaces:
  The install script sets serviceMonitorSelectorNilUsesHelmValues=false to match all ServiceMonitors.
- Check Service labels match ServiceMonitor selector:
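The three checks above can be sketched as one helper. It assumes the application's Service and ServiceMonitor live in the `grokdevops` namespace used earlier on this page; the function name is illustrative.

```shell
# Namespace defaults to the one used in the kubectl commands earlier on this page.
check_servicemonitor() {
  ns="${1:-grokdevops}"

  echo "== ServiceMonitors in ${ns} =="
  kubectl get servicemonitor -n "$ns"

  echo "== serviceMonitorSelector per Prometheus resource =="
  # An empty selector ({}) is what the chart renders when
  # serviceMonitorSelectorNilUsesHelmValues=false, so all ServiceMonitors match.
  kubectl get prometheus -A \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.serviceMonitorSelector}{"\n"}{end}'

  echo "== Service labels vs ServiceMonitor matchLabels =="
  kubectl get svc -n "$ns" --show-labels
  kubectl get servicemonitor -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.selector.matchLabels}{"\n"}{end}'
}
```

Every key and value under `matchLabels` must appear among the labels of the Service for Prometheus to scrape it.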
Loki not receiving logs
- Check Promtail is running:
- Check Promtail logs:
- Verify Loki endpoint:
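These steps can be sketched as follows. Two assumptions are baked in: the `monitoring` namespace is a hypothetical default (pass your own as the first argument), and Promtail pods are assumed to carry the `app.kubernetes.io/name=promtail` label. The Service name `loki` and port 3100 come from the data source list on this page.

```shell
# Assumption: Promtail and Loki run in "monitoring" unless a namespace is passed as $1.
check_promtail() {
  ns="${1:-monitoring}"
  kubectl get pods -n "$ns" -l app.kubernetes.io/name=promtail
  # Recent log lines; look for push errors or connection refusals.
  kubectl logs -n "$ns" -l app.kubernetes.io/name=promtail --tail=50
}

# Queries Loki's readiness endpoint through a temporary port-forward.
check_loki_ready() {
  ns="${1:-monitoring}"
  kubectl port-forward -n "$ns" svc/loki 3100:3100 >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2  # give the tunnel a moment to come up
  curl -fsS http://localhost:3100/ready
  status=$?
  kill "$pf_pid" 2>/dev/null
  return "$status"
}
```

If Promtail is running and Loki reports ready, the Promtail client URL in its configuration is the next thing to compare against the Loki Service name.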
Grafana can't connect to data sources
Verify the data source URLs in Grafana match the service names:
- Prometheus: http://kube-prometheus-stack-prometheus:9090
- Loki: http://loki:3100
- Tempo: http://tempo:3200
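To confirm that the Services behind these URLs exist, a lookup by name across all namespaces can help. This is a sketch; the function name is illustrative.

```shell
# Prints each expected data source Service together with its namespace.
# A name that produces no line was not found in any namespace.
list_datasource_services() {
  for name in kube-prometheus-stack-prometheus loki tempo; do
    kubectl get svc -A --field-selector "metadata.name=${name}" --no-headers
  done
}
```

The short host names in the URLs above only resolve when Grafana runs in the same namespace as the Service; otherwise the host needs the form `<service>.<namespace>.svc`.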
Helm Issues
Helm template rendering fails
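A common way to narrow down rendering failures is to lint the chart and then render it locally with debug output. The sketch below takes the chart directory and values file as arguments because their paths are not stated on this page; the release name `grokdevops` and the function name are assumptions.

```shell
# Assumption: the caller supplies the chart directory ($1) and values file ($2).
debug_chart() {
  chart_dir="$1"
  values_file="$2"
  # Static checks for malformed chart metadata and templates.
  helm lint "$chart_dir" -f "$values_file" || return 1
  # Render locally without contacting the cluster; --debug prints extra detail on failure.
  helm template grokdevops "$chart_dir" -f "$values_file" --debug
}
```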
CRDs not found (ServiceMonitor)
The kube-prometheus-stack must be installed before deploying with ServiceMonitor enabled:
# Install observability stack first
./devops/scripts/install-observability.sh
# Then deploy application
./devops/scripts/deploy-local.sh
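Whether the CRD is already registered can be checked before deploying. A minimal sketch; the function name is illustrative, and `servicemonitors.monitoring.coreos.com` is the CRD name registered by the Prometheus Operator.

```shell
# Succeeds only if the ServiceMonitor CRD is registered in the cluster.
servicemonitor_crd_present() {
  kubectl get crd servicemonitors.monitoring.coreos.com >/dev/null 2>&1
}
```

For example, `servicemonitor_crd_present || ./devops/scripts/install-observability.sh` installs the observability stack only when the CRD is missing.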
k3s Issues
k3s service won't start
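The service state and its recent logs are usually the first things to look at. This sketch assumes k3s was installed with the standard installer on a systemd host, so the unit is named `k3s`; the function name is illustrative.

```shell
# Assumption: k3s runs as a systemd unit named "k3s".
k3s_diagnose() {
  sudo systemctl status k3s --no-pager
  # Most recent log lines from the unit; startup errors usually appear here.
  sudo journalctl -u k3s -n 50 --no-pager
}
```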
kubectl commands fail
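A frequent cause is that `kubectl` is not pointed at the k3s kubeconfig. The sketch below assumes the default k3s location, `/etc/rancher/k3s/k3s.yaml`, which is readable only by root unless k3s was configured otherwise; the function name is illustrative.

```shell
# Assumption: default k3s kubeconfig path; pass another path as $1 if yours differs.
use_k3s_kubeconfig() {
  export KUBECONFIG="${1:-/etc/rancher/k3s/k3s.yaml}"
  # A node list confirms that kubectl can reach the API server.
  kubectl get nodes
}
```

If this fails with a permission error, `sudo k3s kubectl get nodes` uses the bundled client with root access and can confirm whether the cluster itself is healthy.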
CI Issues
Shellcheck failures
Fix shell script issues locally:
# Install shellcheck
apt-get install shellcheck # or brew install shellcheck
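# Run on all scripts (find recurses in any shell; devops/**/*.sh only recurses in bash with globstar enabled)
find devops -name '*.sh' -exec shellcheck {} +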