Answer Key: The Pods That Won't Schedule

The System

An e-commerce marketplace with a catalog service that needs to scale with traffic:

[Traffic] --> [catalog-service (3/6 desired replicas)]
                    |
              HPA: target 70% CPU, current 87%
              desired: 6 replicas, actual: 3
                    |
              ResourceQuota: compute-quota
              CPU requests: 1.5/2 (75% used)
              Memory requests: 1.5Gi/2Gi (75% used)
                    |
              [Nodes: 4 nodes, 22-34% utilized]
              Cluster has 16 allocatable CPU cores
              Cluster autoscaler: "no unschedulable pods"

The HPA sees high CPU and wants to scale out. The namespace quota blocks new pod creation. The cluster autoscaler does not see the problem because pods are rejected before they enter the scheduling queue.
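The admission-time arithmetic can be sketched in plain shell. This mirrors the check the ResourceQuota admission plugin performs (used + requested must not exceed hard); the numbers follow this scenario, where a fourth 500m pod still fits under the 2-core quota and the fifth is the one rejected. Variable names are illustrative, not real kubectl output:

```shell
# Quota admission sketch: the same comparison the ResourceQuota
# admission plugin makes before a pod ever reaches the scheduler.
hard_m=2000        # requests.cpu hard limit: 2 cores, in millicores
used_m=2000        # 4 pods * 500m — the 4th replica still fit
request_m=500      # the 5th catalog-service pod's CPU request

if [ $((used_m + request_m)) -gt "$hard_m" ]; then
  decision="rejected: exceeded quota"   # pod is never created, never Pending
else
  decision="admitted"                   # pod would enter the scheduling queue
fi
echo "$decision"
```

Because the rejection happens at admission, there is no Pending pod for the cluster autoscaler to react to.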

What's Broken

Root cause: The namespace ResourceQuota (compute-quota) limits CPU requests to 2 cores total. Each catalog-service pod requests 500m. With 3 pods running (1.5 CPU used), only 500m remains — enough for 1 more pod. The HPA wants 6 replicas (3 CPU needed), which exceeds the 2 CPU quota.

Why this is insidious:

  1. Pods are never created — the quota admission controller rejects the pod creation request at the API server level, before the scheduler even sees it
  2. No Pending pods — the cluster autoscaler triggers on Pending pods with unschedulable conditions, but here there are no Pending pods
  3. Node resources look plentiful — 22-34% CPU utilization across 4 nodes; naive triage says "we have capacity"
  4. The HPA keeps trying — it correctly calculates the need for 6 replicas and keeps attempting to scale, generating repeated "exceeded quota" events
  5. The service is degraded — 87% CPU on 3 pods means requests are slow, but nothing is crashing

Key clue: The event message "exceeded quota: compute-quota, requested: cpu=500m, used: cpu=1500m, limited: cpu=2" gives the exact diagnosis. But you need to know that quota rejection prevents pod creation entirely, not just scheduling.

The Fix

Immediate (increase the quota)

kubectl patch resourcequota compute-quota -n marketplace \
  --type='merge' -p '{
    "spec": {
      "hard": {
        "requests.cpu": "5",
        "requests.memory": "5Gi",
        "limits.cpu": "10",
        "limits.memory": "10Gi"
      }
    }
  }'

Permanent

  1. Right-size the quota for HPA max replicas:

    # max replicas (10) * per-pod requests (500m CPU, 512Mi memory) + headroom
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: compute-quota
      namespace: marketplace
    spec:
      hard:
        requests.cpu: "6"       # 10 pods * 500m + buffer
        requests.memory: 6Gi    # 10 pods * 512Mi + buffer
        limits.cpu: "12"
        limits.memory: 12Gi
        pods: "20"
    
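The sizing comments above can be checked with quick shell arithmetic. The replica count and per-pod request come from this scenario; the one-core buffer is an illustrative headroom choice:

```shell
# Required requests.cpu quota = HPA maxReplicas * per-pod CPU request + headroom.
max_replicas=10
cpu_per_pod_m=500          # 500m per catalog-service pod
buffer_m=1000              # illustrative 1-core headroom, in millicores
required_m=$((max_replicas * cpu_per_pod_m + buffer_m))
echo "requests.cpu: $((required_m / 1000))"   # 6 cores, matching the quota above
```

The same arithmetic applies to memory: 10 pods at 512Mi is 5Gi, plus buffer gives the 6Gi above.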

  2. Add quota usage alerting:

    # Prometheus alert rule
    - alert: NamespaceQuotaNearLimit
      expr: |
        kube_resourcequota{type="used"} / ignoring(type) kube_resourcequota{type="hard"} > 0.8
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Namespace {{ $labels.namespace }} quota {{ $labels.resource }} is {{ $value | humanizePercentage }} utilized"
    

  3. Document the relationship between HPA max replicas, per-pod resource requests, and namespace quotas.
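That relationship can also be enforced rather than only documented. A sketch of a CI-style guardrail, assuming the HPA max, per-pod request, and quota value are extracted from the manifests (the variable names and exit convention here are assumptions):

```shell
# Fail fast if the namespace quota cannot cover the HPA at max scale.
hpa_max=10                 # HPA maxReplicas
cpu_request_m=500          # per-pod CPU request, in millicores
quota_cpu_m=6000           # requests.cpu from the quota manifest, in millicores

needed_m=$((hpa_max * cpu_request_m))
if [ "$needed_m" -gt "$quota_cpu_m" ]; then
  echo "FAIL: HPA max needs ${needed_m}m CPU but quota allows ${quota_cpu_m}m"
  exit 1
fi
echo "OK: quota covers HPA max scale (${needed_m}m <= ${quota_cpu_m}m)"
```

Run against the original values (quota of 2000m), this check would have failed long before the incident.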

Verification

# Check quota has headroom
kubectl describe resourcequota compute-quota -n marketplace

# Verify HPA can scale
kubectl get hpa -n marketplace
# REPLICAS should increase toward 6

# Watch pods scaling up
kubectl get pods -n marketplace -w

# Confirm CPU utilization drops
kubectl top pods -n marketplace

Artifact Decoder

| Artifact | What It Revealed | What Was Misleading |
| --- | --- | --- |
| CLI Output | Events show "exceeded quota" — the actual error; HPA wants 6 but has 3 | kubectl top nodes shows 22-34% utilization — looks like plenty of capacity |
| Metrics | Quota used: 1.5/2 CPU; desired 6 replicas but only 3 running | Cluster has 16 allocatable CPU cores — the bottleneck is not cluster capacity |
| IaC Snippet | Quota allows 2 CPU requests; each pod needs 500m; HPA max is 10 (needs 5 CPU) | HPA config looks reasonable (70% target, 3-10 range); quota looks reasonable in isolation |
| Log Lines | ReplicaSet controller confirms quota rejection; cluster autoscaler sees nothing | "no unschedulable pods found" from cluster autoscaler makes it seem like scaling is not needed |

Skills Demonstrated

  • Understanding the difference between ResourceQuota rejection and scheduling failure
  • Recognizing that cluster autoscaler is blind to quota-blocked pods
  • Calculating quota requirements from HPA configuration and pod resource requests
  • Understanding the Kubernetes admission control pipeline (quota check before scheduling)
  • Correlating HPA desired state with actual pod count to identify the gap

Prerequisite Topic Packs