What Happens When You `kubectl apply`
- lesson
- kubernetes-api-server
- etcd
- scheduler
- kubelet
- container-runtime
- networking
- probes
---

# What Happens When You `kubectl apply`

Topics: Kubernetes API server, etcd, scheduler, kubelet, container runtime, networking, probes
Level: L1–L2 (Foundations → Operations)
Time: 60–90 minutes
Prerequisites: None (Kubernetes concepts explained as we go)
The Mission¶
You type `kubectl apply -f deployment.yaml` and press Enter. Kubernetes says:

deployment.apps/myapp created

Sixty seconds later, three pods are running, each with a unique IP, health checks passing, and traffic flowing. In those 60 seconds, at least 7 different components collaborated through a pattern borrowed from industrial control theory. None of them talked to each other directly: they all watched a shared database and reacted to changes.
This lesson follows kubectl apply from YAML to running pod through every component,
explaining what each one does and how they coordinate.
The Architecture in 30 Seconds¶
                  ┌─────────────┐
kubectl ────────→ │ API Server  │ ←──── Controllers
                  └──────┬──────┘       (watch + react)
                         │
                   ┌─────┴─────┐
                   │   etcd    │  (source of truth)
                   └───────────┘
                         ↑
        ┌────────────────┼────────────────┐
        │                │                │
  ┌─────┴─────┐     ┌────┴────┐      ┌────┴────┐
  │ Scheduler │     │ kubelet │      │ kubelet │   (per-node)
  └───────────┘     └────┬────┘      └────┬────┘
                         │                │
                  ┌──────┴─────┐   ┌──────┴─────┐
                  │ containerd │   │ containerd │  (container runtime)
                  └────────────┘   └────────────┘
Everything works through the control loop pattern: components watch the API server for changes, compare desired state to actual state, and take action to close the gap. Nobody gives orders — everyone reacts to the shared truth in etcd.
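The control-loop pattern can be sketched in a few lines of Python. This is a toy model (in-memory lists instead of a real API server), not actual controller code:

```python
def reconcile(desired_replicas, actual_pods):
    """One iteration of a control loop: compare desired state to actual
    state and return the actions needed to close the gap."""
    diff = desired_replicas - len(actual_pods)
    if diff > 0:
        return [("create", i) for i in range(diff)]          # too few: create
    if diff < 0:
        return [("delete", p) for p in actual_pods[:-diff]]  # too many: delete
    return []  # converged; nothing to do

# Controllers run this repeatedly: watch, diff, act, repeat.
print(reconcile(3, []))               # → [('create', 0), ('create', 1), ('create', 2)]
print(reconcile(3, ["a", "b", "c"]))  # → []
```

The key property is that the loop is level-triggered: it doesn't matter how the state got wrong, only what the gap is right now.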
Name Origin: Kubernetes is Greek for "helmsman" or "pilot." The project's internal codename was "Seven," a reference to Seven of Nine from Star Trek, which was itself a reference to the Borg: Google's internal predecessor to Kubernetes was literally called Borg. The logo, a ship's wheel with seven spokes, is a nod to that "Seven" codename. The naming wasn't subtle.
Trivia: "k8s" is a numeronym — 8 letters between "k" and "s." Same pattern as "i18n" (internationalization) and "l10n" (localization). It was adopted because "kubernetes" is long to type and hard to spell.
Step 1: kubectl Sends YAML to the API Server¶
When you run kubectl apply -f deployment.yaml, kubectl:
- Reads the YAML file
- Converts it to JSON (the API server speaks JSON, not YAML)
- Sends an HTTP request to the API server
# See what kubectl actually sends (dry-run, no changes)
kubectl apply -f deployment.yaml --dry-run=server -o yaml
# See the raw HTTP request
kubectl apply -f deployment.yaml -v=8
# → I0322 14:23:01 round_trippers.go:463]
# POST https://api-server:6443/apis/apps/v1/namespaces/default/deployments
# Request Body: {"apiVersion":"apps/v1","kind":"Deployment",...}
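Conceptually, kubectl's work at this step is small enough to sketch in Python. The dict below stands in for a parsed deployment.yaml, using the names from the example output above (real kubectl also does client-side defaulting and apply bookkeeping):

```python
import json

manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "myapp", "namespace": "default"},
    "spec": {"replicas": 3},
}

# Build the request the way kubectl does: a JSON body and a RESTful path
# derived from apiVersion + kind + namespace.
body = json.dumps(manifest)
url = (f"https://api-server:6443/apis/{manifest['apiVersion']}"
       f"/namespaces/{manifest['metadata']['namespace']}/deployments")
print("POST", url)
# → POST https://api-server:6443/apis/apps/v1/namespaces/default/deployments
```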
Under the Hood: Kubernetes chose YAML for human-readability and comment support. JSON doesn't support comments, which makes it terrible for configuration that humans edit. But the API server internally works entirely with JSON — your YAML is converted at the kubectl layer before it hits the wire.
Step 2: The API Server Validates and Stores¶
The API server is the only component that talks to etcd. Everything else goes through it.
When the Deployment arrives:
Request arrives
→ Authentication: who is this? (certificates, tokens, OIDC)
→ Authorization: can they do this? (RBAC check)
→ Admission control:
→ Mutating webhooks: modify the request (inject sidecars, add labels)
→ Validating webhooks: reject bad requests (policy enforcement)
→ Schema validation: does the YAML match the Deployment spec?
→ Write to etcd: store the desired state
→ Return 201 Created to kubectl
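The pipeline can be sketched as a chain of checks, each of which may reject or modify the request. Everything below is a toy stand-in (real authn/authz is RBAC-driven and admission runs as webhooks over HTTPS):

```python
ETCD = {}  # stand-in for the etcd key-value store

def handle_create(user, obj):
    # AuthN + AuthZ: who is this, and may they create Deployments?
    if user not in ("alice",):
        return 403, None
    # Mutating admission: modify the object (e.g., inject a label).
    obj.setdefault("metadata", {}).setdefault("labels", {})["injected"] = "true"
    # Validating admission / schema check: reject malformed objects.
    if obj.get("kind") != "Deployment" or "name" not in obj["metadata"]:
        return 400, None
    # Persist desired state; nothing is running yet.
    key = f"/registry/deployments/default/{obj['metadata']['name']}"
    ETCD[key] = obj
    return 201, key  # "201 Created" goes back to kubectl

status, key = handle_create("alice", {"kind": "Deployment", "metadata": {"name": "myapp"}})
```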
Name Origin: etcd stands for "distributed /etc": the /etc directory where Unix stores configuration files, plus a "d" for distributed. It's a distributed key-value store that provides strong consistency via Raft consensus. Every Kubernetes object (every Pod, Service, ConfigMap, Secret) lives in etcd.
At this point, nothing has happened yet. No pod is running. The Deployment object exists in etcd as desired state. The system needs to make reality match.
Step 3: The Deployment Controller Creates a ReplicaSet¶
The Deployment controller is one of many controllers running in the
kube-controller-manager. It watches the API server for Deployment objects.
Deployment controller sees: new Deployment "myapp" with replicas: 3
Deployment controller creates: ReplicaSet "myapp-7d8f9c4b5f" with replicas: 3
Why a ReplicaSet and not Pods directly? Because the Deployment manages rolling updates. When you change the image tag, the Deployment creates a new ReplicaSet and scales it up while scaling the old one down. The ReplicaSet is the unit of "this exact version with this exact config."
Step 4: The ReplicaSet Controller Creates Pods¶
The ReplicaSet controller watches ReplicaSets. It sees a new one with replicas: 3 and
0 matching pods. It creates 3 Pod objects in etcd.
ReplicaSet controller sees: ReplicaSet wants 3 pods, has 0
ReplicaSet controller creates: Pod myapp-7d8f9c4b5f-abc12
Pod myapp-7d8f9c4b5f-def34
Pod myapp-7d8f9c4b5f-ghi56
These Pod objects exist in etcd with spec.nodeName: "" — they haven't been assigned to
a node yet. They're in Pending state.
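The ReplicaSet controller's reconcile step can be sketched like this; the 5-character suffix imitates, but is not exactly, the generated-name suffix the API server appends:

```python
import random
import string

def pods_to_create(rs_name, want, have):
    """Return names for the Pod objects needed to reach `want` replicas."""
    def suffix():
        return "".join(random.choices(string.ascii_lowercase + string.digits, k=5))
    return [f"{rs_name}-{suffix()}" for _ in range(want - len(have))]

new_pods = pods_to_create("myapp-7d8f9c4b5f", want=3, have=[])
# Each Pod is written to etcd with spec.nodeName unset → status Pending.
```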
Mental Model: Think of Kubernetes like a hiring pipeline. The Deployment is the job posting ("we need 3 engineers"). The ReplicaSet is a specific batch of hires ("these 3, with this exact job description"). The Pods are individual hires. The Scheduler is HR assigning them to offices. The kubelet is the office manager making sure they show up.
Step 5: The Scheduler Assigns Nodes¶
The scheduler watches for Pods with no nodeName. For each unassigned Pod, it runs a
two-phase algorithm:
Phase 1 — Filtering: Which nodes can run this Pod?
- Does the node have enough CPU and memory (based on `requests`)?
- Does it match any `nodeSelector` or `nodeAffinity` rules?
- Do any taints on the node prevent this Pod (unless the Pod tolerates them)?
- Is the node healthy and schedulable?
Phase 2 — Scoring: Of the eligible nodes, which is best?
- Spread pods across zones/nodes (topology spreading)
- Prefer nodes with less resource pressure
- Co-locate or separate from specific pods (affinity/anti-affinity)
- Prefer nodes that already have the image cached
The scheduler writes spec.nodeName on the Pod. This is the only thing it does — it
doesn't start anything.
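Both phases fit in a short sketch. The node data and scoring weights below are invented for illustration; the real scheduler composes many plugins in each phase:

```python
nodes = [
    {"name": "node-1", "free_mem_gb": 2,  "schedulable": True,  "has_image": False},
    {"name": "node-2", "free_mem_gb": 8,  "schedulable": True,  "has_image": True},
    {"name": "node-3", "free_mem_gb": 16, "schedulable": False, "has_image": True},
]

def schedule(mem_request_gb):
    # Phase 1 (Filtering): drop nodes that cannot run the Pod at all.
    feasible = [n for n in nodes
                if n["schedulable"] and n["free_mem_gb"] >= mem_request_gb]
    if not feasible:
        return None  # Pod stays Pending; a FailedScheduling event is recorded
    # Phase 2 (Scoring): rank the survivors and pick the best.
    def score(n):
        return n["free_mem_gb"] + (5 if n["has_image"] else 0)
    return max(feasible, key=score)["name"]  # written to spec.nodeName

print(schedule(4))    # → node-2 (node-1 too small, node-3 unschedulable)
print(schedule(999))  # → None (the "Insufficient memory" case below)
```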
# See scheduling decisions
kubectl get events --sort-by=.metadata.creationTimestamp
# → Successfully assigned default/myapp-7d8f9c4b5f-abc12 to node-2
# Why is a pod stuck in Pending?
kubectl describe pod myapp-xxx
# → Events:
# → Warning FailedScheduling 0/3 nodes are available:
# → 3 Insufficient memory
Step 6: The Kubelet Starts the Container¶
The kubelet runs on every node. It watches the API server for Pods assigned to its node.
When it sees a new Pod for its node:
- Pull the image (if not cached locally)
- Create the pod sandbox — a network namespace shared by all containers in the pod
- Call the CNI plugin — sets up networking (veth pair, IP address, routes)
- Mount volumes — PVCs, ConfigMaps, Secrets, emptyDir
- Run init containers — sequential, each must complete before the next starts
- Run app containers — parallel by default
- Start health probes — readiness and liveness checks
# Watch the kubelet's progress
kubectl get events -w
# → Pulling image "myapp:v1"
# → Successfully pulled image "myapp:v1" in 3.2s
# → Created container myapp
# → Started container myapp
The kubelet doesn't directly create containers. It calls the Container Runtime Interface
(CRI) — usually containerd — which calls the low-level runtime (runc) to create the
namespaced, cgroup-limited process.
Under the Hood: `runc` is the OCI-compliant runtime that actually creates the container. It calls `clone()` with namespace flags, sets up cgroups, mounts the filesystem (OverlayFS), drops capabilities, applies seccomp filters, then calls `execve()` with your entrypoint. It does this in about 100 milliseconds.
Step 7: Networking — The Pod Gets an IP¶
The CNI (Container Network Interface) plugin gives the Pod its own IP address. Unlike Docker's port-mapping model, Kubernetes mandates flat networking: every Pod gets a routable IP, and all Pods can reach each other without NAT.
# See pod IPs
kubectl get pods -o wide
# → NAME IP NODE
# → myapp-7d8f9c4b5f-abc12 10.244.1.15 node-2
# → myapp-7d8f9c4b5f-def34 10.244.2.23 node-3
# → myapp-7d8f9c4b5f-ghi56 10.244.1.16 node-2
The Service provides a stable IP that load-balances across these Pod IPs:
kubectl get svc myapp
# → NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
# → myapp ClusterIP 10.96.45.12 <none> 80/TCP
This ClusterIP (10.96.45.12) is virtual — nothing actually listens on it. Instead,
kube-proxy programs iptables (or IPVS/eBPF) rules on every node that intercept packets
to 10.96.45.12 and DNAT them to one of the Pod IPs.
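What those rules accomplish can be modeled in a few lines. The Service and Pod addresses come from the example output above; the random pick mirrors iptables mode's probabilistic endpoint selection (IPVS supports other algorithms):

```python
import random

ENDPOINTS = {
    "10.96.45.12:80": ["10.244.1.15:8000", "10.244.2.23:8000", "10.244.1.16:8000"],
}

def dnat(packet_dst):
    """Rewrite a ClusterIP destination to one real Pod endpoint."""
    backends = ENDPOINTS.get(packet_dst)
    if not backends:
        return packet_dst            # not a Service VIP: leave the packet alone
    return random.choice(backends)   # iptables mode: pick a random endpoint
```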
Step 8: Health Checks Gate Traffic¶
The Pod is running, but is it ready? Kubernetes has three types of probes:
| Probe | Purpose | What happens on failure |
|---|---|---|
| Startup | "Has the app finished initializing?" | Keep checking (don't run other probes yet) |
| Readiness | "Can the app handle requests?" | Remove from Service endpoints (no traffic) |
| Liveness | "Is the app still alive?" | Restart the container |
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 15
  periodSeconds: 20
Until the readiness probe passes, the Pod exists but receives no traffic. This is how zero-downtime deployments work: new Pods must prove they're healthy before old Pods are removed.
Gotcha: A liveness probe that checks the database ("am I healthy?" → "can I query the DB?") will restart your pod when the database is slow. The pod is fine — it's the database that's struggling. Now you have pod restarts + database load from reconnections. Liveness probes should check "is this process fundamentally stuck?" not "are my dependencies healthy."
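The different consequences of each probe type can be summarized in a small sketch (booleans instead of real HTTP checks; the kubelet's actual logic also tracks failure thresholds and restart backoff):

```python
def on_probe_failure(probe_type, pod):
    if probe_type == "readiness":
        pod["in_endpoints"] = False  # stop sending traffic; do NOT restart
    elif probe_type == "liveness":
        pod["restarts"] += 1         # kill and recreate the container
    elif probe_type == "startup":
        pass                         # keep waiting; other probes stay paused
    return pod

pod = {"in_endpoints": True, "restarts": 0}
on_probe_failure("readiness", pod)
# The pod keeps running but drops out of the Service's endpoint list.
```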
The Complete Flow — One Picture¶
[1] kubectl apply -f deployment.yaml
→ YAML → JSON → HTTP POST to API server
[2] API server validates
→ AuthN → AuthZ (RBAC) → Admission webhooks → Schema validation
→ Store Deployment in etcd
[3] Deployment controller (watches Deployments)
→ Creates ReplicaSet
[4] ReplicaSet controller (watches ReplicaSets)
→ Creates 3 Pods (Pending, no node assigned)
[5] Scheduler (watches unassigned Pods)
→ Filter nodes → Score nodes → Assign Pod to node (write spec.nodeName)
[6] Kubelet on assigned node (watches Pods for its node)
→ Pull image → Create sandbox → CNI networking → Mount volumes
→ Run init containers → Run app containers
[7] Kube-proxy (watches Services + Endpoints)
→ Programs iptables/IPVS rules for Service → Pod routing
[8] Readiness probe passes
→ Pod added to Endpoints → Traffic flows
Time from kubectl apply to traffic flowing: typically 30-90 seconds (dominated by image
pull and readiness probe initial delay).
Rolling Updates: What Happens When You Change the Image¶
The Deployment controller:
- Creates a new ReplicaSet (for v2)
- Scales up the new RS by 1 (now: 3 old + 1 new)
- Waits for the new Pod's readiness probe to pass
- Scales down the old RS by 1 (now: 2 old + 1 new)
- Repeats until all replicas are new (0 old + 3 new)
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1        # How many extra pods during update (1 = 4 total at peak)
    maxUnavailable: 0  # How many can be down (0 = no downtime)
# Watch a rolling update
kubectl rollout status deployment/myapp
# → Waiting for deployment "myapp" rollout to finish: 1 out of 3 new replicas updated...
# Undo if something goes wrong
kubectl rollout undo deployment/myapp
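The five steps above can be simulated to see the replica counts at each stage. This is a sketch of the bookkeeping only; the real controller waits for readiness between steps:

```python
def rolling_update(replicas, max_surge):
    """Return (old, new) replica counts after each surge-then-scale-down step,
    assuming maxUnavailable=0 so old Pods only go away once new ones are Ready."""
    old, new, steps = replicas, 0, []
    while old > 0:
        new += min(max_surge, replicas - new)  # scale up the new ReplicaSet
        # ...readiness probe must pass here before continuing...
        old -= 1                               # then scale down the old one
        steps.append((old, new))
    return steps

print(rolling_update(3, 1))  # → [(2, 1), (1, 2), (0, 3)]
```

Peak pod count is replicas + maxSurge (here 4), and at no point are fewer than 3 Ready pods serving traffic.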
Flashcard Check¶
Q1: Where does the Deployment object live after kubectl apply?
In etcd, accessed through the API server. The API server is the only component that talks to etcd directly.
Q2: What does the scheduler actually do?
It watches for Pods with no `nodeName`, runs filtering (which nodes can?) and scoring (which is best?), and writes `spec.nodeName`. It doesn't start anything.
Q3: Readiness probe fails. What happens?
The Pod is removed from Service endpoints. No traffic is routed to it. The Pod keeps running — it's not restarted (that's the liveness probe's job).
Q4: Why does a Deployment create a ReplicaSet instead of Pods directly?
Because rolling updates need two sets of Pods simultaneously (old version + new version). The ReplicaSet is the unit of "this exact version."
Q5: kube-proxy manages the ClusterIP. Is there a process listening on that IP?
No. The ClusterIP is virtual. kube-proxy programs iptables/IPVS rules that intercept packets and DNAT them to real Pod IPs.
Q6: Pod is stuck in Pending. What should you check first?
`kubectl describe pod` and look at Events for scheduling failures. Common causes: insufficient CPU/memory, no matching nodes for nodeSelector, untoleratable taints.
Exercises¶
Exercise 1: Watch the chain (hands-on)¶
# In one terminal, watch events
kubectl get events -w
# In another terminal, create a deployment
kubectl create deployment test --image=nginx --replicas=2
# Watch the events and identify each step:
# - Deployment created
# - ReplicaSet created
# - Pods created (Pending)
# - Pods scheduled
# - Image pulled
# - Containers started
# Clean up
kubectl delete deployment test
Exercise 2: Break the scheduler (hands-on)¶
Create a Pod that can't be scheduled:
apiVersion: v1
kind: Pod
metadata:
  name: unschedulable
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "999Gi"  # More than any node has
Apply it, check kubectl describe pod unschedulable, see the scheduling failure event.
Then delete it.
Exercise 3: The decision (think)¶
A kubectl apply succeeded but the Pod is stuck in various states. What's wrong?
- Pod status: `Pending` for 5 minutes, no events about scheduling
- Pod status: `ContainerCreating` for 3 minutes
- Pod status: `Running` but `0/1 Ready`
- Pod status: `CrashLoopBackOff`
- Pod status: `ImagePullBackOff`
Answers

1. **Scheduler can't find a node.** Check `kubectl describe pod` for scheduling events. Usually: insufficient resources, unsatisfiable nodeSelector, or all nodes tainted.
2. **Image pull or volume mount is slow/failing.** Check events for image pull progress. Could also be a volume (PVC) that can't bind, or a CNI plugin issue.
3. **Readiness probe failing.** The container is running but not passing its health check. Check `kubectl logs pod` and `kubectl describe pod` for probe failure events.
4. **App crashes immediately after starting.** Check `kubectl logs pod --previous` for the crash output. Common: missing env var, wrong config path, port conflict.
5. **Can't pull the image.** Wrong image name, auth failure (imagePullSecrets), or registry unreachable. Check `kubectl describe pod` for the pull error details.

Cheat Sheet¶
Debugging Flow¶
| Symptom | Command | What to look for |
|---|---|---|
| Pending | `kubectl describe pod` | Scheduling events |
| ContainerCreating | `kubectl describe pod` | Image pull, volume mount |
| CrashLoopBackOff | `kubectl logs --previous` | App crash output |
| Running but not Ready | `kubectl describe pod` | Probe failure events |
| ImagePullBackOff | `kubectl describe pod` | Registry auth, image name |
Useful Commands¶
| Task | Command |
|---|---|
| Watch events | kubectl get events -w --sort-by=.metadata.creationTimestamp |
| Rollout status | kubectl rollout status deployment/NAME |
| Rollback | kubectl rollout undo deployment/NAME |
| Force re-schedule | kubectl delete pod NAME (controller recreates) |
| Check scheduler | kubectl describe pod NAME \| grep -A5 Events |
Takeaways¶
- **Everything is a control loop.** No component gives orders. They watch etcd (through the API server) and react to changes. Desired state in, actual state converges.
- **The API server is the only door to etcd.** Every `kubectl` command, every controller, every kubelet goes through the API server. It handles authentication, authorization, validation, and admission control.
- **The scheduler only assigns; it doesn't start.** It picks a node and writes `spec.nodeName`. The kubelet on that node does the actual work.
- **Readiness probes gate traffic.** Until a Pod passes readiness, it exists but receives nothing. This is how zero-downtime deploys work.
- **Rolling updates use two ReplicaSets.** Old and new versions coexist briefly. The Deployment controller orchestrates the gradual swap.
Related Lessons¶
- Connection Refused — what goes wrong at the Kubernetes Service layer
- Out of Memory — Kubernetes resource limits and the OOM killer
- The Hanging Deploy — process lifecycle inside containers