# Lab 7: Pod Debugging
| Field | Value |
|---|---|
| Tier | 2 — Kubernetes Core |
| Estimated Time | 45 minutes |
| Prerequisites | k3s cluster, kubectl |
| Auto-Grade | Yes |
## Scenario

You inherit a Kubernetes namespace from a departing colleague. It contains nine pod specifications that were supposed to form a working microservices stack, and every one is broken in a different way. One fails to pull its image because the tag is wrong. One crashes immediately with an OOM kill. Two have misconfigured probes that cause endless restart loops. Two reference Secrets or ConfigMaps that do not exist, one runs the wrong entrypoint command, one is stuck in Pending because no node matches its selector, and one fails because of its security context.

Your manager wants all nine pods green by end of day. Diagnose each failure using `kubectl describe`, `kubectl logs`, and `kubectl get events`, then fix the underlying issue. When you are done, no pod should be restarting, crash-looping, or stuck in Pending.
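A typical first pass over the namespace looks like this (the pod name `cart` is a placeholder; the lab's actual pod names will differ):

```shell
# Survey the namespace: the STATUS and RESTARTS columns point at the failure class
kubectl get pods -n lab-pod-debug

# Drill into one pod: the Events section at the bottom names the exact error
kubectl describe pod cart -n lab-pod-debug

# Container output, including the previous (crashed) instance if it restarted
kubectl logs cart -n lab-pod-debug --previous

# Namespace-wide event stream, most recent last
kubectl get events -n lab-pod-debug --sort-by=.lastTimestamp
```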
## Objectives
- Fix the pod with ImagePullBackOff (wrong image tag)
- Fix the pod with OOMKilled (memory limit too low)
- Fix the pod with failing liveness probe (wrong port)
- Fix the pod with failing readiness probe (wrong path)
- Fix the pod referencing a missing Secret
- Fix the pod referencing a missing ConfigMap
- Fix the pod with wrong command (entrypoint error)
- Fix the pod stuck in Pending (missing node selector label)
- Fix the pod with security context issues (read-only root filesystem)
- All 9 pods are Running and Ready (not restarting)
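To track progress toward the last objective, list pods whose phase is not `Running` (note this misses pods that are Running but not Ready, so also keep an eye on the READY column):

```shell
# Pods that have not even reached the Running phase (Pending, Failed, ...)
kubectl get pods -n lab-pod-debug --field-selector=status.phase!=Running

# Watch the whole namespace converge toward all-green
kubectl get pods -n lab-pod-debug -w
```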
## Setup

The setup creates the namespace `lab-pod-debug` containing nine broken pod specifications.
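For orientation, a pod broken in the liveness-probe style looks roughly like this (a hypothetical spec for illustration; the lab's actual manifests, names, and images will differ):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web               # hypothetical pod name
  namespace: lab-pod-debug
spec:
  containers:
    - name: web
      image: nginx:1.25
      ports:
        - containerPort: 80
      livenessProbe:
        httpGet:
          path: /
          port: 8080      # bug: nginx listens on 80, so the probe always
                          # fails and the kubelet restarts the container
                          # in an endless loop
```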
## Hints

Hint 1: Diagnosing pod issues

Start with `kubectl get pods -n lab-pod-debug` to see the state of every pod, then use `kubectl describe pod <name> -n lab-pod-debug` and `kubectl logs <name> -n lab-pod-debug` to dig into a specific one.

Hint 2: ImagePullBackOff

Check the image name and tag in the pod spec. Use `kubectl describe` to see the exact pull error. Common fix: correct the tag to a valid version.

Hint 3: OOMKilled

The container needs more memory. Increase `.resources.limits.memory`. Check `kubectl describe pod` for the OOM event and the last known memory usage.

Hint 4: Missing Secret/ConfigMap

Create the missing resource. Check the pod spec for the expected name and keys, then create it with `kubectl create secret generic` or `kubectl create configmap`.

Hint 5: Pending pods

`kubectl describe pod` will show scheduling failures. If it says the node selector didn't match, either add the label to a node or remove the selector from the pod.

## Grading
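Since the lab is auto-graded on pod health, you can pre-check your work against the same condition the objectives describe, every pod Ready and not restarting:

```shell
# Succeeds only once every pod in the namespace reports the Ready condition
kubectl wait --for=condition=Ready pods --all -n lab-pod-debug --timeout=120s

# Double-check that nothing is still accumulating restarts
kubectl get pods -n lab-pod-debug -o wide
```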
## Solution

See the `solution/` directory for a fix for each broken pod.