Anti-Primer: Kubernetes Pods And Scheduling

Everything that can go wrong, will — and in this story, it does.

The Setup

A developer is deploying a memory-intensive machine learning inference service to a shared Kubernetes cluster. They need GPU nodes but the cluster has limited GPU capacity. The deployment must be live by end of day for a demo.

The Timeline

Hour 0: No Resource Requests

With the deadline looming, the developer deploys the pods without resource requests to 'let Kubernetes figure it out'; it seems like the fastest path forward. The result: pods land on nodes with insufficient memory and are OOMKilled repeatedly, and the scheduler has no information to make good placement decisions.

Footgun #1: No Resource Requests — without requests, the scheduler places pods blindly; they land on memory-starved nodes and are OOMKilled in a loop.
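The fix is a few lines of YAML. A minimal sketch of a container spec with requests and limits — the image name and the resource numbers here are illustrative placeholders, not measured values; requests should come from observed usage:

```yaml
# Pod spec fragment: requests tell the scheduler what the pod needs,
# limits cap what it may consume. Numbers below are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: inference
spec:
  containers:
    - name: inference
      image: registry.example.com/ml/inference:v1  # placeholder image
      resources:
        requests:
          memory: "8Gi"        # reflect actual observed usage
          cpu: "2"
          nvidia.com/gpu: 1
        limits:
          memory: "8Gi"
          nvidia.com/gpu: 1    # GPU (extended) resources must be set as limits
```

With a memory request in place, the scheduler only considers nodes with that much allocatable memory, so the OOMKill loop never starts.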

Nobody notices yet. The engineer moves on to the next task.

Hour 1: Wrong Node Selector

Under time pressure, the team chooses speed over caution: the deployment uses a nodeSelector label that matches no GPU node in the cluster. The result: pods stay Pending indefinitely with 'no nodes available' events, and the engineer concludes the cluster itself is broken.

Footgun #2: Wrong Node Selector — a nodeSelector that matches no node leaves pods Pending forever; the scheduler reports 'no nodes available' and never recovers on its own.
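A sketch of a selector that points at real GPU nodes — the label key and value below are assumptions (clusters label GPU nodes differently); confirm what your nodes actually carry with kubectl get nodes --show-labels before depending on it:

```yaml
# Pod spec fragment: nodeSelector must match labels that actually exist
# on the target nodes. A typo'd key or value schedules nothing, silently.
spec:
  nodeSelector:
    nvidia.com/gpu.present: "true"  # assumed label; verify on your cluster
```

A selector is an exact string match on both key and value, so "True" vs "true" or a missing label prefix is enough to strand every pod in Pending.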

The first mistake is still invisible, making the next shortcut feel justified.

Hour 2: Anti-Affinity Blocks Scheduling

Next, the developer adds requiredDuringSchedulingIgnoredDuringExecution pod anti-affinity to a cluster with too few matching nodes. Nobody pushes back; the shortcut looks harmless in the moment. The result: the third replica cannot schedule, because every eligible node already hosts one replica.

Footgun #3: Anti-Affinity Blocks Scheduling — hard (required) anti-affinity on a small node pool caps the replica count at the node count; the extra replicas sit Pending.
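The safer variant is soft anti-affinity. A minimal sketch, assuming the pods carry an app: inference label (an assumption; use whatever labels your pod template sets):

```yaml
# Pod template fragment: preferred (soft) anti-affinity spreads replicas
# across nodes when possible, but still schedules when nodes run out.
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: inference  # assumed pod label
            topologyKey: kubernetes.io/hostname
```

With preferred rather than required, the scheduler treats spreading as a scoring preference instead of a hard constraint, so the third replica lands on an already-occupied node rather than staying Pending.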

Pressure is mounting. The team is behind schedule and cutting more corners.

Hour 3: Init Container Image Pull Failure

An init container references an image in a private registry, but the pod has no imagePullSecrets. The team has gotten away with similar shortcuts before, so nobody raises a flag. The result: pods stick in Init:ImagePullBackOff, the main container never starts, and the error is easy to miss.

Footgun #4: Init Container Image Pull Failure — a private-registry image without imagePullSecrets leaves pods in Init:ImagePullBackOff; the main container never starts.
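A sketch of the fix — the secret name and image are assumptions; the secret must be a docker-registry-type Secret living in the same namespace as the pod:

```yaml
# Pod spec fragment: imagePullSecrets is set once at the pod level and
# covers init containers as well as regular containers.
spec:
  imagePullSecrets:
    - name: private-registry-creds  # assumed secret name
  initContainers:
    - name: fetch-model
      image: registry.example.com/ml/model-fetcher:v1  # placeholder image
  containers:
    - name: inference
      image: registry.example.com/ml/inference:v1      # placeholder image
```

Because init containers fail before the main container ever starts, kubectl get pods shows only Init:ImagePullBackOff; kubectl describe pod is where the actual pull error surfaces.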

By hour 3, the compounding failures have reached critical mass. Pages fire. The war room fills up. The team scrambles to understand what went wrong while the system burns.

The Postmortem

Root Cause Chain

1. No Resource Requests
   Consequence: pods land on nodes with insufficient memory, are OOMKilled repeatedly, and the scheduler has no information to make good decisions.
   Prevented by: always set resource requests reflecting actual usage.

2. Wrong Node Selector
   Consequence: pods stay Pending forever with 'no nodes available' events; the engineer thinks the cluster is broken.
   Prevented by: verify node labels before setting selectors (kubectl get nodes --show-labels).

3. Anti-Affinity Blocks Scheduling
   Consequence: the third replica cannot schedule because all available nodes already have one replica.
   Prevented by: use preferredDuringSchedulingIgnoredDuringExecution anti-affinity when node count is limited.

4. Init Container Image Pull Failure
   Consequence: pods stuck in Init:ImagePullBackOff; the main container never starts; the error is easy to miss.
   Prevented by: verify imagePullSecrets are configured for all private-registry images.
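All four fixes can live in one manifest. A sketch of the corrected Deployment — names, labels, registry paths, and resource numbers are illustrative assumptions, not the incident's actual values:

```yaml
# Deployment sketch combining the four fixes from the root cause chain.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inference
  template:
    metadata:
      labels:
        app: inference
    spec:
      imagePullSecrets:                      # fix for Footgun #4
        - name: private-registry-creds       # assumed secret name
      nodeSelector:                          # fix for Footgun #2 (label verified first)
        nvidia.com/gpu.present: "true"       # assumed label
      affinity:                              # fix for Footgun #3: soft anti-affinity
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: inference
                topologyKey: kubernetes.io/hostname
      containers:
        - name: inference
          image: registry.example.com/ml/inference:v1  # placeholder image
          resources:                         # fix for Footgun #1
            requests:
              memory: "8Gi"                  # from observed usage
              cpu: "2"
              nvidia.com/gpu: 1
            limits:
              memory: "8Gi"
              nvidia.com/gpu: 1
```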

Damage Report

  • Downtime: 2-4 hours of pod-level or cluster-wide disruption
  • Data loss: Risk of volume data loss if StatefulSets were affected
  • Customer impact: Intermittent 5xx errors, dropped connections, or full service outage
  • Engineering time to remediate: 10-20 engineer-hours for incident response, rollback, and postmortem
  • Reputation cost: On-call fatigue; delayed feature work; possible SLA breach notification

What the Primer Teaches

  • Footgun #1 (No Resource Requests): always set resource requests that reflect actual usage.
  • Footgun #2 (Wrong Node Selector): verify node labels before setting selectors; use kubectl get nodes --show-labels.
  • Footgun #3 (Anti-Affinity Blocks Scheduling): use preferredDuringSchedulingIgnoredDuringExecution anti-affinity when node count is limited.
  • Footgun #4 (Init Container Image Pull Failure): verify imagePullSecrets are configured for every private-registry image, init containers included.

Cross-References