Level: L2: Operations | Topics: HPA / Autoscaling | Domain: Kubernetes

Scenario: HPA Not Scaling Under Load

The Prompt

"We have an HPA configured for our deployment with target CPU at 50%. Load has been high for 15 minutes but the pod count hasn't changed. The HPA shows <unknown>/50% for CPU. Why isn't it scaling?"

Initial Report

Customer Slack message: "Our app is crawling under load. We set up autoscaling weeks ago but it's stuck at 2 pods. The HPA page shows <unknown> for current CPU. Please help ASAP."

Constraints

  • Time pressure: You have 15 minutes before the next escalation. Users are experiencing degraded response times.
  • Limited access: You do not have access to modify kube-system components directly; changes to metrics-server require a ticket to the platform team.

Observable Evidence

  • Dashboard: Response latency p99 has tripled over the past 30 minutes. Pod count remains flat at 2.
  • HPA output: kubectl get hpa shows TARGETS: <unknown>/50%, REPLICAS: 2, MINPODS: 2, MAXPODS: 10.
  • Logs: kubectl top pods either fails with "error: Metrics API not available" or returns no data. The metrics-server pod logs may show TLS handshake errors.

Expected Investigation Path

# 1. Check HPA status
kubectl get hpa -n grokdevops
kubectl describe hpa grokdevops -n grokdevops

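# 2. Check that metrics-server is running and that the Metrics API is registered and responding
kubectl get pods -n kube-system | grep metrics-server
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes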

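# 3. Check that every container in the deployment sets a CPU request
#    (one container without a request is enough to leave the pod's utilization undefined)
kubectl get deploy grokdevops -n grokdevops -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.resources.requests.cpu}{"\n"}{end}'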

# 4. Check actual pod CPU usage
kubectl top pods -n grokdevops
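
The order of these checks can be sketched as a small decision helper. This is an illustrative sketch, not part of the runbook: the function name classify_hpa and its result labels are hypothetical, and it only classifies values you have already collected with the commands above rather than calling kubectl itself.

```shell
# classify_hpa TARGETS METRICS_API CPU_REQUEST   (hypothetical helper)
#   TARGETS      TARGETS column from `kubectl get hpa`, e.g. "<unknown>/50%"
#   METRICS_API  "ok" if the raw Metrics API query in step 2 succeeded, anything else otherwise
#   CPU_REQUEST  the container's resources.requests.cpu from step 3, or "" if unset
classify_hpa() {
  targets=$1
  metrics_api=$2
  cpu_request=$3

  case "$targets" in
    "<unknown>"*) ;;                         # HPA cannot read the metric; keep digging
    *) echo "metrics-visible"; return 0 ;;   # HPA has a value; look elsewhere
  esac

  if [ "$metrics_api" != "ok" ]; then
    echo "metrics-api-unavailable"           # metrics-server missing or unhealthy
    return 0
  fi
  if [ -z "$cpu_request" ]; then
    echo "missing-cpu-request"               # no baseline for a percentage
    return 0
  fi
  echo "metrics-not-yet-available"           # e.g. shortly after an install
}

# Example with the values from this scenario (Metrics API failing):
classify_hpa "<unknown>/50%" "fail" "200m"
```

The 200m request in the example is a made-up value. Checking the Metrics API before the requests mirrors the investigation path: if the API is down, the HPA stays at <unknown> whatever the deployment spec says.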

Strong Answer

"The <unknown> in the HPA status is the key signal — it means the HPA can't read CPU metrics. There are three common causes: metrics-server isn't installed or isn't healthy, the deployment doesn't have CPU resource requests defined (HPA needs requests to calculate percentage utilization), or metrics haven't propagated yet after a recent install. I'd first verify metrics-server is running and healthy. Then I'd check that the deployment spec includes resources.requests.cpu. Without requests, the HPA has no baseline to calculate percentage against. On k3s specifically, metrics-server might need --kubelet-insecure-tls due to self-signed certs."

Common Traps

  • Blaming the load generator — the issue is metrics, not load
  • Not knowing that HPA requires resource requests — this is a fundamental concept
  • Forgetting metrics-server is a separate component — it's not part of the control plane by default
  • Not mentioning the cooldown period: the HPA applies stabilization windows (default 300 seconds for scale-down, 0 seconds for scale-up)
  • Lab: training/interactive/runtime-labs/lab-runtime-02-hpa-live-scaling/
  • Runbook: training/library/runbooks/kubernetes/hpa_not_scaling.md
  • Drills: training/library/drills/kubectl_drills.md — Drill 5 (HPA status), Drill 6 (top pods)
