Kubernetes Ops — Trivia & Interesting Facts¶
Surprising, historical, and little-known facts about Kubernetes operations.
etcd backup is the most important ops task and the most commonly skipped¶
Losing etcd data means losing the entire cluster state — every deployment, service, secret, configmap, and custom resource. Despite this, a 2022 survey by Spectro Cloud found that nearly 40% of self-managed Kubernetes clusters had no automated etcd backup. Managed services (EKS, GKE, AKS) handle etcd backup transparently, which is arguably the strongest reason to use managed Kubernetes.
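A minimal backup sketch, assuming etcdctl v3 and the default kubeadm certificate paths (the endpoint, paths, and backup directory are placeholders to adapt):

```shell
# Take a point-in-time snapshot of etcd (paths assume a kubeadm cluster)
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-$(date +%Y%m%d-%H%M%S).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot before trusting it (newer etcd versions move
# this subcommand to etcdutl)
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-*.db --write-out=table
```

Running this from a cron job or Kubernetes CronJob, and shipping the snapshot off the node, closes the gap the survey describes.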
kubectl apply stores the last-applied configuration as an annotation¶
Every time you kubectl apply a manifest, the full JSON representation is stored as the kubectl.kubernetes.io/last-applied-configuration annotation on the resource. This annotation is how kubectl determines three-way merge behavior (what you had, what you want, what exists). For large resources, this annotation can add 10-50 KB of metadata to every object, and it is visible to anyone with read access to the resource.
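You can inspect the stored annotation directly; `my-app` here is a hypothetical Deployment name:

```shell
# View the last-applied manifest kubectl stored on a Deployment
kubectl get deployment my-app -o json \
  | jq -r '.metadata.annotations["kubectl.kubernetes.io/last-applied-configuration"]'

# Or use kubectl's built-in viewer
kubectl apply view-last-applied deployment/my-app
```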
Namespace deletion can hang forever due to finalizers¶
Deleting a namespace triggers deletion of all resources within it. If any resource has a finalizer that cannot be resolved (broken operator, external dependency), the namespace enters a Terminating state and stays there indefinitely. The standard fix involves manually patching the namespace's finalizer list via the /finalize API endpoint — a command so commonly needed that it appears in every Kubernetes admin's notes.
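The standard fix looks like this, with `stuck-ns` as a placeholder namespace name (note that this abandons whatever cleanup the finalizer was guarding, so check for orphaned external resources afterward):

```shell
# Clear the finalizers of a namespace stuck in Terminating
kubectl get namespace stuck-ns -o json \
  | jq '.spec.finalizers = []' \
  | kubectl replace --raw "/api/v1/namespaces/stuck-ns/finalize" -f -
```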
Resource requests are for scheduling; limits are for enforcement — and they serve different masters¶
Requests tell the scheduler how much capacity to reserve. Limits tell the kernel (via cgroups) when to throttle or kill. Setting requests equal to limits (Guaranteed QoS) wastes resources but provides predictability. Setting requests much lower than limits (Burstable QoS) improves utilization but enables noisy-neighbor problems. Getting this ratio right is arguably the hardest resource management challenge in Kubernetes.
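A minimal Burstable sketch (values are illustrative); setting requests equal to limits would make this Guaranteed instead:

```yaml
resources:
  requests:        # reserved by the scheduler
    cpu: 250m
    memory: 256Mi
  limits:          # enforced by the kernel via cgroups
    cpu: "1"       # 4x the request: Burstable QoS, better utilization
    memory: 512Mi  # exceeding this gets the container OOM-killed
```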
The Kubernetes API server is stateless — etcd is the only source of truth¶
The API server itself stores nothing. Every API server instance is identical and interchangeable. You can have one, three, or ten API servers behind a load balancer, and they all read from and write to the same etcd cluster. This statelessness makes the API server horizontally scalable and easy to replace — but it means etcd performance directly determines API server responsiveness.
Kubernetes upgrades skip versions at your peril¶
Kubernetes supports upgrading one minor version at a time (e.g., 1.27 to 1.28, never 1.27 to 1.29 directly). Skipping versions can cause API incompatibilities, stored resource version mismatches, and control plane components that cannot negotiate protocols. Each upgrade requires running kubeadm upgrade plan (or the managed equivalent), updating control plane components, then rolling worker nodes. The entire process for a 100-node cluster typically takes 2-4 hours.
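For a kubeadm cluster, the 1.27-to-1.28 step sketched above looks roughly like this (node name and patch version are placeholders):

```shell
# On the first control plane node
kubeadm upgrade plan                  # shows available target versions
kubeadm upgrade apply v1.28.0         # upgrades control plane components

# Then roll each worker node, one at a time
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
# upgrade the kubelet/kubeadm packages on the node, restart kubelet, then:
kubectl uncordon node-1
```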
Secret encryption at rest is not enabled by default¶
Kubernetes Secrets are stored in etcd as base64-encoded plaintext by default — not encrypted. Anyone with direct etcd access can read every secret. Encryption at rest requires configuring an EncryptionConfiguration on the API server, which adds AES-CBC or AES-GCM encryption for Secret resources. Managed Kubernetes services typically enable this by default, but self-managed clusters must configure it explicitly.
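A minimal EncryptionConfiguration sketch, passed to the API server via `--encryption-provider-config`:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32-byte key>
      - identity: {}   # fallback so existing unencrypted secrets stay readable
```

Existing Secrets are only re-encrypted when written, so a one-time `kubectl get secrets -A -o json | kubectl replace -f -` is needed after enabling it.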
The kubelet garbage collector deletes old containers and images automatically¶
Kubelet runs automatic garbage collection: containers from terminated pods are removed after 1 minute (minimum), and unused images are deleted when disk usage exceeds 85% (the default high threshold). If disk usage keeps climbing past the hard eviction threshold (by default, under 10% of the filesystem free), kubelet starts evicting pods. These thresholds are configurable, and understanding them prevents the "where did my debug container go" mystery and the "node disk full, all pods evicted" emergency.
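The relevant knobs live in the KubeletConfiguration; a sketch with the default values spelled out:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 85   # start deleting unused images above this
imageGCLowThresholdPercent: 80    # stop deleting once usage drops below this
evictionHard:
  nodefs.available: "10%"         # evict pods when free disk falls below 10%
```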
Rolling updates have a surge and unavailability budget that most people leave at defaults¶
The default Deployment rolling update strategy allows 25% maxSurge (extra pods above desired count) and 25% maxUnavailable (pods that can be down simultaneously). For a 4-replica deployment, this means 1 extra pod and 1 unavailable pod during rollout. Teams running latency-sensitive workloads often set maxSurge to 50% and maxUnavailable to 0, ensuring full capacity throughout the rollout at the cost of temporarily higher resource usage.
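The percentage-to-count arithmetic can be sketched in a few lines; `rollout_budget` is a hypothetical helper mirroring how the Deployment controller resolves percentages (maxSurge rounds up, maxUnavailable rounds down):

```python
import math

def rollout_budget(replicas: int, max_surge: float, max_unavailable: float):
    """Resolve percentage maxSurge/maxUnavailable into absolute pod counts,
    the way the Deployment controller does: surge rounds up, unavailable
    rounds down."""
    surge = math.ceil(replicas * max_surge)
    unavailable = math.floor(replicas * max_unavailable)
    return surge, unavailable

# Defaults (25% / 25%) on a 4-replica Deployment
print(rollout_budget(4, 0.25, 0.25))   # (1, 1)

# Latency-sensitive settings: 50% surge, zero unavailable
print(rollout_budget(4, 0.50, 0.0))    # (2, 0)
```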
kubectl diff shows what would change before you apply¶
kubectl diff -f manifest.yaml performs a server-side dry run and shows the exact diff between the live resource and the proposed change. This is invaluable for reviewing changes before applying, especially for complex resources where a YAML diff against the local file would miss defaulted fields. Despite its usefulness, many operators learn about kubectl diff years into their Kubernetes careers.
Priority classes determine which pods survive under resource pressure¶
PriorityClasses assign integer priorities (up to 1,000,000,000 for user-defined classes) to pods. When a higher-priority pod cannot be scheduled, the scheduler preempts (evicts) lower-priority pods to make room, and kubelet also considers priority when ranking pods for eviction under node pressure. The system-critical priority classes (system-cluster-critical and system-node-critical) are reserved for control plane components. Without proper PriorityClasses, a monitoring DaemonSet can be evicted to make room for a batch job — leaving you blind during an incident.
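A minimal PriorityClass sketch (the name and description are hypothetical); pods opt in via `priorityClassName` in their spec:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: monitoring-critical      # hypothetical name
value: 900000000                 # high, but below the reserved system classes
globalDefault: false
description: "Keeps observability DaemonSets from losing out to batch work."
```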
The Kubernetes API deprecation policy gives you 12+ months of warning¶
When a GA API version is deprecated, it remains functional for at least 12 months or 3 Kubernetes releases (whichever is longer); deprecated beta versions get at least 9 months or 3 releases. The kubectl convert plugin and tools like pluto detect deprecated API versions in manifests. Despite this generous warning period, API deprecations (like the batch/v1beta1 to batch/v1 CronJob migration) still catch teams off guard because their Helm charts, operators, or GitOps manifests reference old versions.
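Both tools mentioned above are run from the command line; the manifest paths here are placeholders:

```shell
# Scan a directory of manifests for deprecated or removed API versions
# (pluto is a third-party CLI from Fairwinds)
pluto detect-files -d ./manifests

# Rewrite an old manifest to a current API version
kubectl convert -f cronjob-old.yaml --output-version batch/v1
```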
HPA Trivia¶
HPA v1 only supported CPU — custom metrics took 4 years to reach GA¶
The original HPA (autoscaling/v1) could only scale on CPU utilization. Custom metrics support did not reach GA (autoscaling/v2) until Kubernetes 1.23 in December 2021.
HPA uses a ratio formula, not thresholds¶
The HPA does not use "scale up when CPU > 80%" logic. It uses: desiredReplicas = ceil(currentReplicas * (currentMetricValue / targetMetricValue)). This proportional approach means HPA can scale multiple replicas at once.
The 10% tolerance band prevents flapping¶
HPA has a built-in tolerance of 0.1 (10%). If the calculated ratio is within 10% of 1.0, no scaling action is taken. Configurable via --horizontal-pod-autoscaler-tolerance.
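The ratio formula and the tolerance band combine into one small decision function; `hpa_desired_replicas` is a hypothetical sketch of that logic, not the controller's actual code:

```python
import math

def hpa_desired_replicas(current: int, metric: float, target: float,
                         tolerance: float = 0.1) -> int:
    """Sketch of the HPA scaling decision: a proportional ratio with a
    tolerance band (default 10%) that suppresses small corrections."""
    ratio = metric / target
    if abs(ratio - 1.0) <= tolerance:
        return current          # within tolerance: no scaling action
    return math.ceil(current * ratio)

print(hpa_desired_replicas(4, 90, 50))   # 8 -> one step scales out by 4 pods
print(hpa_desired_replicas(4, 53, 50))   # 4 -> ratio 1.06, inside the band
```

Note how the first call doubles the replica count in a single step: proportional scaling, not incremental thresholds.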
KEDA extends HPA to scale on 60+ event sources including scale-to-zero¶
KEDA (Kubernetes Event-Driven Autoscaling), a CNCF project, extends HPA to scale on Kafka consumer lag, RabbitMQ queue depth, Prometheus queries, and more. KEDA can also scale to zero replicas — something standard HPA deliberately does not support.
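A ScaledObject sketch for the Kafka-lag case (all names, the broker address, and the thresholds are hypothetical):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-consumer-scaler      # hypothetical names throughout
spec:
  scaleTargetRef:
    name: order-consumer           # the Deployment to scale
  minReplicaCount: 0               # scale-to-zero when the topic is idle
  maxReplicaCount: 20
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.default.svc:9092
        consumerGroup: order-consumers
        topic: orders
        lagThreshold: "50"         # target lag per replica
```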
Probe Trivia¶
Startup probes were added because initialDelaySeconds was a terrible workaround¶
Before Kubernetes 1.18 (March 2020), the only way to handle slow-starting containers was a large initialDelaySeconds on the liveness probe. Set it to 120 seconds to cover worst-case startup, and every restart also waits 120 seconds before failure detection begins, even when the container comes up quickly.
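With a startup probe, the liveness probe does not run until startup succeeds, so the slow-start budget no longer delays steady-state detection. A sketch (path, port, and timings are illustrative):

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10       # fast failure detection once started
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30    # up to 30 * 4s = 120s allowed for startup
  periodSeconds: 4
```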
exec probes fork a process inside the container at a non-trivial cost¶
An exec probe runs a command by forking a new process inside the container. If it runs every 10 seconds and the command takes 2 seconds to complete, the probe is executing 20% of the time, paying process-creation cost on every run. HTTP and TCP probes are performed by kubelet externally with virtually zero container overhead.
Probes run from kubelet, not from the network — bypassing service mesh rules¶
All probe requests originate from kubelet and hit the pod's IP directly, so they do not pass through kube-proxy or Service virtual IPs, and NetworkPolicies generally cannot block them because traffic from a pod's own node is exempt. Since kubelet is not a mesh member and cannot present a client certificate, strict mTLS enforcement breaks plain probes; this is why meshes like Istio rewrite probes to keep them working.
Container restart counts never reset unless the pod is deleted¶
A container restart increments the restart count shown in kubectl get pods, and liveness-driven restarts happen only after failureThreshold consecutive probe failures (3 by default), not a single miss. The counter persists for the pod's lifetime and never resets unless the pod is deleted. A pod showing 47 restarts might have been stable for days, but the number looks alarming.