
Runbook: HPA Not Scaling

Symptoms

  • HPA shows <unknown>/50% for CPU
  • Pod count stays at minReplicas despite high load
  • kubectl describe hpa shows "unable to get metrics"

Fast Triage

kubectl get hpa -n grokdevops
kubectl describe hpa grokdevops -n grokdevops
kubectl top pods -n grokdevops
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/grokdevops/pods

Likely Causes (ranked)

  1. metrics-server not installed — HPA needs metrics API
  2. No resource requests defined — HPA percentage-based scaling requires CPU requests
  3. metrics-server not ready — recently installed, needs ~60s
  4. Wrong HPA target — HPA points to wrong deployment name
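
Causes 1 and 3 can be ruled in or out in one pass by querying the metrics APIService and the metrics-server Deployment directly. A quick sketch, assuming metrics-server runs in kube-system (its usual namespace):

```shell
# Is the metrics API registered and Available? AVAILABLE=False points to cause 1 or 3
kubectl get apiservice v1beta1.metrics.k8s.io
# Is the metrics-server Deployment present with ready replicas?
kubectl -n kube-system get deploy metrics-server
# Recent logs often surface TLS or kubelet-connectivity errors
kubectl -n kube-system logs deploy/metrics-server --tail=20
```

If the APIService is Available but TARGETS still shows `<unknown>`, move on to cause 2 (missing CPU requests) or cause 4 (wrong target).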

Evidence Interpretation

What bad looks like:

NAME         REFERENCE               TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
grokdevops   Deployment/grokdevops   <unknown>/50%   1         5         1          10m
  • <unknown> in TARGETS means the HPA cannot read metrics; usually metrics-server is missing or the Deployment has no CPU requests defined.
  • kubectl describe hpa will show a Conditions section with messages like "unable to get metrics for resource cpu" or "missing request for cpu".
  • Once metrics flow, the percentage shown is (actual CPU usage / CPU request) * 100. It is relative to the request, not to node capacity.

Fix Steps

  1. Install metrics-server if missing:
    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
    # For k3s with self-signed certs:
    kubectl patch deployment metrics-server -n kube-system --type=json \
      -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
    
  2. Ensure deployment has CPU requests:
    kubectl get deploy grokdevops -n grokdevops \
      -o jsonpath='{.spec.template.spec.containers[0].resources.requests.cpu}'
    
  3. Wait 60s for metrics to propagate, then check again:
    kubectl get hpa -n grokdevops
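
If step 2 returns nothing (no CPU request set), one way to add one is a JSON patch against the first container. A sketch only: 100m is an assumed placeholder value, and the container index and resource shape may differ in your Deployment:

```shell
# Add a 100m CPU request to container 0 so the HPA has a denominator
kubectl patch deployment grokdevops -n grokdevops --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/resources","value":{"requests":{"cpu":"100m"}}}]'
```

Note that changing resources triggers a rolling restart of the Deployment's pods.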
    

Verification

kubectl get hpa -n grokdevops  # TARGETS should show actual%/target%
kubectl top pods -n grokdevops  # should return CPU/memory values

Cleanup

None needed.

Unknown Unknowns

  • HPA percentage = (actual CPU / requested CPU) * 100. If you request 100m and use 80m, HPA sees 80% — not a percentage of node CPU.
  • Metrics-server scrapes kubelets every 15 seconds; there is always a delay before HPA sees current load.
  • Scale-down cooldown is 5 minutes by default (--horizontal-pod-autoscaler-downscale-stabilization); don't assume scaling down is broken if it feels slow.
  • Custom metrics (e.g., request rate) require a metrics adapter like Prometheus Adapter — the built-in HPA only handles CPU and memory natively.
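
The first bullet's arithmetic drives the whole scaling decision: the HPA's documented formula is desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric). A minimal sketch with assumed example numbers (not from a live cluster):

```shell
# Assumed example: 2 replicas running at 80% of their CPU request, target 50%
current_replicas=2
current_utilization=80   # percent of the CPU *request* in use
target_utilization=50
# Integer ceiling of current_replicas * current_utilization / target_utilization
desired=$(( (current_replicas * current_utilization + target_utilization - 1) / target_utilization ))
echo "$desired"   # 4: the HPA would scale from 2 to 4 replicas
```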

[!WARNING] HPA requires metrics-server, but a missing or broken metrics-server produces no error pods and no obvious failure — the HPA simply shows <unknown> in TARGETS and silently does nothing. Always verify metrics-server is running and healthy before debugging HPA logic.

Pitfalls

  • Forgetting to install metrics-server — HPA silently fails with <unknown> instead of an error pod.
  • Not defining resource requests — without a CPU request, HPA has no denominator for percentage calculation and cannot function.
  • Expecting instant scaling — HPA evaluates every 15s, metrics lag by 15s, and pods take time to start. Budget at least 60s before seeing new replicas.

See Also

  • training/library/guides/observability.md (HPA section)
  • training/interactive/runtime-labs/lab-runtime-02-hpa-live-scaling/
  • training/interview-scenarios/02-hpa-not-scaling.md
  • training/interactive/incidents/scenarios/hpa-not-scaling.sh
