Node Maintenance - Street-Level Ops¶
Real-world workflows for cordoning, draining, patching, and upgrading Kubernetes nodes safely.
Pre-Flight Checks¶
# See node status and versions
kubectl get nodes -o wide
# NAME STATUS VERSION INTERNAL-IP OS-IMAGE KERNEL-VERSION
# worker-01 Ready v1.28.3 10.0.1.21 Ubuntu 22.04.3 LTS 5.15.0-91
# worker-02 Ready v1.28.3 10.0.1.22 Ubuntu 22.04.3 LTS 5.15.0-91
# worker-03 Ready v1.28.3 10.0.1.23 Ubuntu 22.04.3 LTS 5.15.0-91
# Check PDB headroom before starting
kubectl get pdb -A
# NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
# api-pdb 2 N/A 1 30d
# If ALLOWED DISRUPTIONS is 0, the drain will HANG (not fail) until headroom opens
# Check what pods are on the target node
kubectl get pods -A --field-selector spec.nodeName=worker-03 -o wide
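The ALLOWED DISRUPTIONS gotcha is worth automating. A minimal sketch, assuming a helper named `pdb_headroom_ok` (hypothetical, not a kubectl feature): it reads whitespace-separated `disruptionsAllowed` values on stdin and fails if any is zero, so a script can bail out before a drain hangs.

```shell
# Hypothetical helper: fail fast when any PDB has zero allowed disruptions.
# Feed it the disruptionsAllowed values, e.g.:
#   kubectl get pdb -A -o jsonpath='{range .items[*]}{.status.disruptionsAllowed}{" "}{end}' \
#     | pdb_headroom_ok || exit 1
pdb_headroom_ok() {
  local v
  for v in $(cat); do
    if [ "${v}" -eq 0 ]; then
      echo "a PDB has 0 allowed disruptions - drain would hang" >&2
      return 1
    fi
  done
  return 0
}
```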
Cordon, Drain, Uncordon¶
# Step 1: Mark node unschedulable (no new pods)
kubectl cordon worker-03
# node/worker-03 cordoned
kubectl get node worker-03
# worker-03 Ready,SchedulingDisabled <none> 45d v1.28.3
# Step 2: Dry-run the drain first
kubectl drain worker-03 --ignore-daemonsets --delete-emptydir-data --dry-run=client
# pod/myapp-abc123 would be evicted
# pod/worker-xyz789 would be evicted
# Step 3: Execute the drain
kubectl drain worker-03 \
--ignore-daemonsets \
--delete-emptydir-data \
--grace-period=120 \
--timeout=300s
# Step 4: Perform maintenance (SSH to node, patch, reboot, etc.)
# Step 5: Uncordon
kubectl uncordon worker-03
# node/worker-03 uncordoned
# Verify pods are rescheduling
kubectl get pods -A -o wide | grep worker-03
Remember the node maintenance mnemonic C-D-M-U: Cordon (stop scheduling), Drain (evict pods), Maintain (patch/reboot), Uncordon (resume scheduling). Note that kubectl drain already cordons the node implicitly, but an explicit cordon up front lets you inspect the node and dry-run the drain before any eviction starts.
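The Cordon and Drain steps can be wrapped so the dry-run is never skipped. A sketch using the same flags as above; `safe_drain` is a hypothetical helper name, and Maintain/Uncordon stay manual on purpose so a failed drain never gets silently uncordoned.

```shell
# Hypothetical wrapper: cordon, dry-run the drain, then drain for real.
# Stops at the first failure so a half-drained node is visible, not hidden.
safe_drain() {
  local node="$1"
  kubectl cordon "${node}" || return 1
  kubectl drain "${node}" --ignore-daemonsets --delete-emptydir-data \
    --dry-run=client || return 1
  kubectl drain "${node}" --ignore-daemonsets --delete-emptydir-data \
    --grace-period=120 --timeout=300s
}
# safe_drain worker-03   # then maintain, then: kubectl uncordon worker-03
```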
OS Patching¶
# After cordon and drain, SSH to the node
ssh worker-03
# Apply OS updates
apt-get update && apt-get upgrade -y
# If kernel was updated, reboot
reboot
# Wait for node to rejoin cluster (from control plane)
kubectl get nodes -w
# worker-03 NotReady ...
# worker-03 Ready ...
# Uncordon
kubectl uncordon worker-03
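The "if kernel was updated" step can be made explicit: Debian/Ubuntu write a marker file after updates that require a reboot. A small sketch; the marker path is parameterized only so the function is testable, the standard location is /var/run/reboot-required.

```shell
# Reboot only when the package manager says one is needed.
# On Debian/Ubuntu, update-notifier writes /var/run/reboot-required
# after kernel/libc updates.
reboot_required() {
  local marker="${1:-/var/run/reboot-required}"
  [ -f "${marker}" ]
}
# if reboot_required; then reboot; fi
```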
Kubelet Upgrade¶
# On the node (after cordon + drain):
apt-get update
# Unhold first - the install fails while the packages are pinned
apt-mark unhold kubelet kubectl
apt-get install -y kubelet=1.29.0-1.1 kubectl=1.29.0-1.1
apt-mark hold kubelet kubectl
systemctl daemon-reload
systemctl restart kubelet
# Verify
systemctl status kubelet
journalctl -u kubelet --no-pager -n 20
# From control plane: check version
kubectl get node worker-03
# worker-03 Ready v1.29.0
kubectl uncordon worker-03
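One check worth scripting before the uncordon: per the Kubernetes version-skew policy, the kubelet's minor version must not be newer than the API server's. `skew_ok` is a hypothetical helper that does pure string comparison on versions like v1.29.0, so it can run anywhere.

```shell
# Hypothetical skew check: kubelet minor version must be <= apiserver minor.
skew_ok() {
  local kubelet_minor apiserver_minor
  kubelet_minor=$(echo "$1" | cut -d. -f2)
  apiserver_minor=$(echo "$2" | cut -d. -f2)
  [ "${kubelet_minor}" -le "${apiserver_minor}" ]
}
# Usage against a live cluster:
# skew_ok "$(kubectl get node worker-03 -o jsonpath='{.status.nodeInfo.kubeletVersion}')" \
#         "$(kubectl version -o json | jq -r '.serverVersion.gitVersion')"
```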
Fix Stuck Drains¶
# Find what is blocking the drain
kubectl get pods -A --field-selector spec.nodeName=worker-03
# Check for standalone pods (no controller — drain refuses to evict these)
kubectl get pods -A --field-selector spec.nodeName=worker-03 -o json | \
jq -r '.items[] | select(.metadata.ownerReferences == null) | .metadata.namespace + "/" + .metadata.name'
# Force evict standalone pods (--force bypasses the safety check for controller-less pods)
kubectl drain worker-03 --ignore-daemonsets --delete-emptydir-data --force
# If a pod is stuck terminating (finalizer or termination grace period)
kubectl delete pod stuck-pod -n production --grace-period=0 --force
# Check for finalizers blocking deletion
kubectl get pod stuck-pod -n production -o jsonpath='{.metadata.finalizers}'
# Remove stuck finalizer (last resort)
kubectl patch pod stuck-pod -n production -p '{"metadata":{"finalizers":null}}'
Gotcha:
--grace-period=0 --force on a pod does NOT kill the container immediately — it removes the pod from the API server. If the kubelet on that node is unreachable, the container keeps running. You will have a ghost container consuming resources until the node comes back. Verify with docker ps or crictl ps on the node after it recovers.
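Checking finalizers pod by pod gets tedious; the per-pod jsonpath above can be generalized. A sketch (`pods_with_finalizers` is a hypothetical name, and jq is assumed available) that reads `kubectl get pods ... -o json` on stdin and prints every pod still carrying finalizers:

```shell
# List namespace/name of pods that still have finalizers set.
# Pipe in: kubectl get pods -A --field-selector spec.nodeName=worker-03 -o json
pods_with_finalizers() {
  jq -r '.items[]
    | select((.metadata.finalizers // []) | length > 0)
    | .metadata.namespace + "/" + .metadata.name'
}
```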
Rolling Maintenance Script¶
#!/usr/bin/env bash
set -euo pipefail

NODES=$(kubectl get nodes -l role=worker -o jsonpath='{.items[*].metadata.name}')

for node in ${NODES}; do
  echo "=== Maintaining ${node} ==="
  # Pre-flight: abort if any PDB has zero allowed disruptions (the drain would hang)
  if kubectl get pdb -A -o jsonpath='{range .items[*]}{.status.disruptionsAllowed}{" "}{end}' \
      | grep -qw 0; then
    echo "A PDB has 0 allowed disruptions; aborting before ${node}" >&2
    exit 1
  fi
  kubectl cordon "${node}"
  kubectl drain "${node}" --ignore-daemonsets --delete-emptydir-data --timeout=300s
  # ssh exits nonzero when the reboot drops the connection, hence || true
  ssh "${node}" 'apt-get update && apt-get upgrade -y && reboot' || true
  echo "Waiting for ${node} to rejoin..."
  sleep 30  # let the reboot take effect so the node does not still look Ready
  until kubectl get node "${node}" 2>/dev/null | grep -q " Ready"; do
    sleep 10
  done
  kubectl uncordon "${node}"
  echo "=== ${node} complete ==="
  # Let pods reschedule before moving to the next node
  sleep 60
done

echo "All nodes maintained."
Control Plane Node Maintenance¶
# For control plane nodes with etcd, extra care needed
# Check etcd cluster health first
ETCDCTL_API=3 etcdctl endpoint health --cluster \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
# Never drain more than one control plane node at a time
# Ensure etcd has quorum (2/3 or 3/5) before proceeding
Under the hood: etcd requires a strict majority for quorum: 2/3, 3/5, or 4/7 nodes. Losing quorum makes the entire cluster read-only — no new pods, no deployments, no config changes. With a 3-node control plane, losing one node is survivable; losing two is a cluster-down event. Always verify endpoint health --cluster before touching a control plane node.
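The quorum arithmetic in that note generalizes: a cluster of N members needs floor(N/2)+1 healthy members for writes and tolerates floor((N-1)/2) failures. As shell arithmetic, with hypothetical helper names:

```shell
# Quorum math for an etcd cluster of N members.
etcd_quorum()          { echo $(( $1 / 2 + 1 )); }   # members needed for writes
etcd_fault_tolerance() { echo $(( ($1 - 1) / 2 )); } # members you can lose
```

This is why control planes use odd sizes: going from 3 to 4 members raises quorum from 2 to 3 without improving fault tolerance.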
Verify After Maintenance¶
# Check node is Ready and schedulable
kubectl get node worker-03
# Check node conditions
kubectl describe node worker-03 | grep -A10 "Conditions:"
# MemoryPressure False
# DiskPressure False
# PIDPressure False
# Ready True
# Verify DaemonSet pods are back
kubectl get pods -A --field-selector spec.nodeName=worker-03 | grep -i daemon
# Check cluster-wide pod health
kubectl get pods -A | grep -v Running | grep -v Completed
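The condition checks above can be collapsed into one scriptable pass/fail. A sketch (`node_healthy` is a hypothetical helper, and jq is assumed available) that reads `kubectl get node <name> -o json` on stdin and exits nonzero unless Ready is True and every *Pressure condition is False:

```shell
# Pass/fail node health check from `kubectl get node <name> -o json`.
node_healthy() {
  jq -e '
    ([.status.conditions[] | select(.type == "Ready") | .status] == ["True"])
    and
    ([.status.conditions[] | select(.type | endswith("Pressure")) | .status]
      | all(. == "False"))
  ' > /dev/null
}
# kubectl get node worker-03 -o json | node_healthy && echo "worker-03 healthy"
```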