Portal | Level: L2: Operations | Topics: HPA / Autoscaling | Domain: Kubernetes
Scenario: HPA Not Scaling Under Load¶
The Prompt¶
"We have an HPA configured for our deployment with target CPU at 50%. Load has been high for 15 minutes but the pod count hasn't changed. The HPA shows `<unknown>/50%` for CPU. Why isn't it scaling?"
Initial Report¶
Customer Slack message: "Our app is crawling under load. We set up autoscaling weeks ago but it's stuck at 2 pods. The HPA page shows `<unknown>` for current CPU. Please help ASAP."
Constraints¶
- Time pressure: You have 15 minutes before the next escalation. Users are experiencing degraded response times.
- Limited access: You do not have access to modify kube-system components directly; changes to metrics-server require a ticket to the platform team.
Observable Evidence¶
- Dashboard: Response latency p99 has tripled over the past 30 minutes. Pod count remains flat at 2.
- HPA output: `kubectl get hpa` shows `TARGETS: <unknown>/50%`, `REPLICAS: 2`, `MINPODS: 2`, `MAXPODS: 10`.
- Logs: `kubectl top pods` returns `error: Metrics API not available` or shows no data. Metrics-server pod logs may show TLS handshake errors.
Expected Investigation Path¶
# 1. Check HPA status
kubectl get hpa -n grokdevops
kubectl describe hpa grokdevops -n grokdevops
# 2. Check if metrics-server is running
kubectl get pods -n kube-system | grep metrics-server
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
# 3. Check if deployment has resource requests
kubectl get deploy grokdevops -n grokdevops -o jsonpath='{.spec.template.spec.containers[0].resources}'
# 4. Check actual pod CPU usage
kubectl top pods -n grokdevops
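If step 2 surfaces TLS handshake errors between metrics-server and the kubelet (common on k3s and kind, which use self-signed kubelet certs), the usual remediation is adding `--kubelet-insecure-tls` to the metrics-server args. Given the access constraint above, this change goes through the platform team; a sketch of the relevant Deployment fragment (surrounding args are illustrative defaults, not pulled from this cluster):

```yaml
# Hypothetical metrics-server Deployment fragment (kube-system).
# --kubelet-insecure-tls skips verification of the kubelet's
# self-signed serving certificate (k3s/kind setups).
spec:
  template:
    spec:
      containers:
        - name: metrics-server
          args:
            - --cert-dir=/tmp
            - --kubelet-preferred-address-types=InternalIP
            - --kubelet-insecure-tls
```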
Strong Answer¶
"The <unknown> in the HPA status is the key signal — it means the HPA can't read CPU metrics. There are three common causes: metrics-server isn't installed or isn't healthy, the deployment doesn't have CPU resource requests defined (HPA needs requests to calculate percentage utilization), or metrics haven't propagated yet after a recent install. I'd first verify metrics-server is running and healthy. Then I'd check that the deployment spec includes resources.requests.cpu. Without requests, the HPA has no baseline to calculate percentage against. On k3s specifically, metrics-server might need --kubelet-insecure-tls due to self-signed certs."
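The missing-requests cause is worth making concrete: the HPA computes utilization as current usage divided by `resources.requests.cpu`, so with no request the percentage is undefined and the target reads `<unknown>`. A minimal Deployment fragment showing the fix (container name and values are illustrative assumptions, not taken from this cluster):

```yaml
# Illustrative container spec fragment; name and values are assumptions.
spec:
  template:
    spec:
      containers:
        - name: grokdevops
          resources:
            requests:
              cpu: 250m   # HPA baseline: 50% target fires above 125m average
            limits:
              cpu: 500m
```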
Common Traps¶
- Blaming the load generator — the issue is metrics, not load
- Not knowing that HPA requires resource requests — this is a fundamental concept
- Forgetting metrics-server is a separate component — it's not part of the control plane by default
- Not mentioning the cooldown period — HPA has stabilization windows (default 5 min for scale-down)
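The stabilization-window trap above is visible and tunable in an `autoscaling/v2` HPA spec. A sketch matching this scenario's min/max and 50% target (field values other than those are illustrative defaults):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: grokdevops
  namespace: grokdevops
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: grokdevops
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # scale up immediately (the default)
    scaleDown:
      stabilizationWindowSeconds: 300   # default 5-minute guard against flapping
```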
Practice and Links¶
- Lab: `training/interactive/runtime-labs/lab-runtime-02-hpa-live-scaling/`
- Runbook: `training/library/runbooks/kubernetes/hpa_not_scaling.md`
- Drills: `training/library/drills/kubectl_drills.md` — Drill 5 (HPA status), Drill 6 (top pods)
Wiki Navigation¶
Related Content¶
- Incident Simulator (18 scenarios) (CLI) (Exercise Set, L2) — HPA / Autoscaling
- Kubernetes Exercises (Quest Ladder) (CLI) (Exercise Set, L1) — HPA / Autoscaling
- Kubernetes Ops (Production) (Topic Pack, L2) — HPA / Autoscaling
- Lab: HPA Live Scaling (CLI) (Lab, L1) — HPA / Autoscaling
- Runbook: HPA Not Scaling (Runbook, L2) — HPA / Autoscaling
- Runbook: HPA Thrashing (Rapid Scale Up/Down) (Runbook, L2) — HPA / Autoscaling
- Skillcheck: Kubernetes (Assessment, L1) — HPA / Autoscaling