Drill: Debug a Pod Stuck in Pending State
Goal
Systematically diagnose why a pod is stuck in Pending state by checking scheduling constraints, resources, and events.
Setup
- kubectl configured with cluster access
- A pod that is in Pending state (or create one with impossible resource requests to practice)
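
If no Pending pod is at hand, a practice pod can be created along these lines. This is a minimal sketch: the helper name `create_unschedulable_pod`, the default pod name `pending-drill`, the `default` namespace, and the request of 1000 CPU cores are all assumptions chosen so that no node is likely to fit the pod.

```shell
# Hypothetical helper: create a pod whose CPU request no node can satisfy.
# Usage: create_unschedulable_pod [pod-name] [namespace]
create_unschedulable_pod() {
  kubectl apply -n "${2:-default}" -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${1:-pending-drill}
spec:
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
      resources:
        requests:
          cpu: "1000"   # 1000 cores; assumed to exceed every node in the cluster
EOF
}
```

Calling `create_unschedulable_pod` with no arguments uses the assumed defaults; pass a pod name and namespace to override them.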
Commands
Confirm the pod is Pending:
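
One way to run this check, sketched as a small shell helper. The name `pod_phase` is hypothetical and the namespace is assumed to default to `default`. Plain `kubectl get pod <pod-name>` shows the same information in its STATUS column.

```shell
# Hypothetical helper: print only the pod's phase (expected here: Pending).
# Usage: pod_phase <pod-name> [namespace]
pod_phase() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.status.phase}'
}
```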
Check pod events for scheduling failures:
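
A possible form of this step, again as a hypothetical helper (`pod_events` is an illustrative name, and the `default` namespace is an assumption). Scheduling failures normally appear as `FailedScheduling` entries in the Events section.

```shell
# Hypothetical helper: show events recorded for one pod.
# Usage: pod_events <pod-name> [namespace]
pod_events() {
  # The Events section at the end of the output carries scheduler messages.
  kubectl describe pod "$1" -n "${2:-default}"
  # The same events as a list, filtered to this pod and sorted by last-seen time.
  kubectl get events -n "${2:-default}" \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}
```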
Check node resources available:
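
A sketch of how node capacity might be inspected. The helper name `node_capacity` is hypothetical. The scheduler works from resource requests rather than live usage, so the "Allocated resources" section of `kubectl describe nodes` is usually more telling than `kubectl top nodes` (which also needs metrics-server installed).

```shell
# Hypothetical helper: compare what nodes can offer with what is already requested.
# Usage: node_capacity
node_capacity() {
  # Allocatable CPU and memory per node.
  kubectl get nodes -o custom-columns='NODE:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
  # Per-node detail, including the "Allocated resources" summary of current requests.
  kubectl describe nodes
}
```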
Check if node selectors or affinities are too restrictive:
kubectl get pod <pod-name> -o jsonpath='{.spec.nodeSelector}'
kubectl get pod <pod-name> -o yaml | grep -A 10 affinity
Check for taints that may prevent scheduling:
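
One way to list taints across all nodes, as a minimal sketch (`node_taints` is a hypothetical name). Nodes without taints print an empty second column.

```shell
# Hypothetical helper: print each node name followed by its taints.
# Usage: node_taints
node_taints() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
}
```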
Check tolerations on the pod:
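
This can follow the same pattern as the node selector check. The helper name `pod_tolerations` and the `default` namespace are assumptions.

```shell
# Hypothetical helper: print the tolerations declared in the pod spec.
# Usage: pod_tolerations <pod-name> [namespace]
pod_tolerations() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.tolerations}'
}
```

Compare the output with the node taints: every `NoSchedule` taint on a candidate node needs a matching toleration.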
Check if PersistentVolumeClaims are bound:
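
A sketch of a PVC check, assuming the claims live in the pod's namespace (`pvc_status` is a hypothetical name, and the namespace defaults to `default`).

```shell
# Hypothetical helper: list the claims a pod references, then their binding status.
# Usage: pvc_status <pod-name> [namespace]
pvc_status() {
  # Claim names referenced by the pod's volumes (empty if the pod uses no PVCs).
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'
  echo
  # STATUS should read Bound for every claim the pod uses.
  kubectl get pvc -n "${2:-default}"
}
```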
Check resource quotas in the namespace:
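
A possible form of the quota check (`namespace_quota` is a hypothetical name). The output lists Used against Hard limits for each quota in the namespace.

```shell
# Hypothetical helper: show quota usage versus limits for a namespace.
# Usage: namespace_quota [namespace]
namespace_quota() {
  kubectl describe resourcequota -n "${1:-default}"
}
```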
What to Look For
- `Insufficient cpu` or `Insufficient memory` in events means no node has enough capacity
- `node(s) didn't match Pod's node affinity/selector` means label or affinity mismatch
- `node(s) had taints that the pod didn't tolerate` means taint/toleration mismatch
- `persistentvolumeclaim not found` or `unbound` means storage is not available
- Quota exceeded means the namespace resource quota is full
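
The messages listed above can be matched mechanically. The following is a rough sketch, not an exhaustive classifier: the function name `classify_pending_reason` and the category labels are hypothetical, and the patterns cover only the strings named in this section (exact wording can vary between Kubernetes versions).

```shell
# Hypothetical helper: map an event message to one of the causes listed above.
# Usage: classify_pending_reason "<event message>"
classify_pending_reason() {
  case "$1" in
    *"Insufficient cpu"*|*"Insufficient memory"*) echo "capacity" ;;
    *"didn't match Pod's node affinity/selector"*) echo "affinity-or-selector" ;;
    *"had taints that the pod didn't tolerate"*) echo "taint" ;;
    *persistentvolumeclaim*|*unbound*) echo "storage" ;;
    *[Qq]uota*) echo "quota" ;;
    *) echo "unknown" ;;
  esac
}
```

Messages to classify could come from, for example, `kubectl get events --field-selector reason=FailedScheduling`.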
Common Mistakes
- Only checking pod events and not checking node-level capacity
- Forgetting to check PVC binding status for stateful pods
- Not checking for taints on newly added nodes
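- Overlooking namespace resource quotas, which block pod creation at admission; the failure then shows up in the owning controller's events (for example a ReplicaSet) rather than on the pod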
Cleanup
No cleanup needed if you are inspecting an existing pod. Delete any test pods you created.
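
Deleting a practice pod might look like this. The helper name `delete_practice_pod` is hypothetical and the namespace is assumed to default to `default`.

```shell
# Hypothetical helper: remove a pod created for this drill.
# Usage: delete_practice_pod <pod-name> [namespace]
delete_practice_pod() {
  kubectl delete pod "$1" -n "${2:-default}"
}
```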