Answer Key: The Pods That Won't Schedule

The System

An e-commerce marketplace with a catalog service that needs to scale with traffic:

[Traffic] --> [catalog-service (3/6 desired replicas)]
                    |
              HPA: target 70% CPU, current 87%
              desired: 6 replicas, actual: 3
                    |
              ResourceQuota: compute-quota
              CPU requests: 1.5/2 (75% used)
              Memory requests: 1.5Gi/2Gi (75% used)
                    |
              [Nodes: 4 nodes, 22-34% utilized]
              Cluster has 16 allocatable CPU cores
              Cluster autoscaler: "no unschedulable pods"

The HPA sees high CPU and wants to scale out. The namespace quota blocks new pod creation. The cluster autoscaler does not see the problem because pods are rejected before they enter the scheduling queue.
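The admission-time arithmetic can be sketched in plain shell. This mirrors the check the ResourceQuota admission plugin performs (used + requested must not exceed hard); the numbers follow this scenario, where a fourth 500m pod still fits under the 2-core quota and the fifth is the one rejected. Variable names are illustrative, not real kubectl output:

```shell
# Quota admission sketch: the same comparison the ResourceQuota
# admission plugin makes before a pod ever reaches the scheduler.
hard_m=2000        # requests.cpu hard limit: 2 cores, in millicores
used_m=2000        # 4 pods * 500m — the 4th replica still fit
request_m=500      # the 5th catalog-service pod's CPU request

if [ $((used_m + request_m)) -gt "$hard_m" ]; then
  decision="rejected: exceeded quota"   # pod is never created, never Pending
else
  decision="admitted"                   # pod would enter the scheduling queue
fi
echo "$decision"
```

Because the rejection happens at admission, there is no Pending pod for the cluster autoscaler to react to.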

What's Broken

Root cause: The namespace ResourceQuota (compute-quota) limits CPU requests to 2 cores total. Each catalog-service pod requests 500m. With 3 pods running (1.5 CPU used), only 500m remains — enough for 1 more pod. The HPA wants 6 replicas (3 CPU needed), which exceeds the 2 CPU quota.

Why this is insidious:

  1. Pods are never created — the quota admission controller rejects the pod creation request at the API server level, before the scheduler even sees it
  2. No Pending pods — the cluster autoscaler triggers on Pending pods with unschedulable conditions, but here there are no Pending pods
  3. Node resources look plentiful — 22-34% CPU utilization across 4 nodes; naive triage says "we have capacity"
  4. The HPA keeps trying — it correctly calculates the need for 6 replicas and keeps attempting to scale, generating repeated "exceeded quota" events
  5. The service is degraded — 87% CPU on 3 pods means requests are slow, but nothing is crashing

Key clue: The event message "exceeded quota: compute-quota, requested: cpu=500m, used: cpu=1500m, limited: cpu=2" gives the exact diagnosis. But you need to know that quota rejection prevents pod creation entirely, not just scheduling.

The Fix

Immediate (increase the quota)

kubectl patch resourcequota compute-quota -n marketplace \
  --type='merge' -p '{
    "spec": {
      "hard": {
        "requests.cpu": "5",
        "requests.memory": "5Gi",
        "limits.cpu": "10",
        "limits.memory": "10Gi"
      }
    }
  }'

Permanent

  1. Right-size the quota for HPA max replicas:

    # max replicas (10) * per-pod requests (500m CPU, 512Mi memory) + headroom
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: compute-quota
      namespace: marketplace
    spec:
      hard:
        requests.cpu: "6"       # 10 pods * 500m + buffer
        requests.memory: 6Gi    # 10 pods * 512Mi + buffer
        limits.cpu: "12"
        limits.memory: 12Gi
        pods: "20"
    
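The sizing comments above can be checked with quick shell arithmetic. The replica count and per-pod request come from this scenario; the one-core buffer is an illustrative headroom choice:

```shell
# Required requests.cpu quota = HPA maxReplicas * per-pod CPU request + headroom.
max_replicas=10
cpu_per_pod_m=500          # 500m per catalog-service pod
buffer_m=1000              # illustrative 1-core headroom, in millicores
required_m=$((max_replicas * cpu_per_pod_m + buffer_m))
echo "requests.cpu: $((required_m / 1000))"   # 6 cores, matching the quota above
```

The same arithmetic applies to memory: 10 pods at 512Mi is 5Gi, plus buffer gives the 6Gi above.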

  2. Add quota usage alerting:

    # Prometheus alert rule
    - alert: NamespaceQuotaNearLimit
      expr: |
        kube_resourcequota{type="used"} / ignoring(type) kube_resourcequota{type="hard"} > 0.8
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Namespace {{ $labels.namespace }} quota {{ $labels.resource }} is {{ $value | humanizePercentage }} utilized"
    

  3. Document the relationship between HPA max replicas, per-pod resource requests, and namespace quotas.
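That relationship can also be enforced rather than only documented. A sketch of a CI-style guardrail, assuming the HPA max, per-pod request, and quota value are extracted from the manifests (the variable names and exit convention here are assumptions):

```shell
# Fail fast if the namespace quota cannot cover the HPA at max scale.
hpa_max=10                 # HPA maxReplicas
cpu_request_m=500          # per-pod CPU request, in millicores
quota_cpu_m=6000           # requests.cpu from the quota manifest, in millicores

needed_m=$((hpa_max * cpu_request_m))
if [ "$needed_m" -gt "$quota_cpu_m" ]; then
  echo "FAIL: HPA max needs ${needed_m}m CPU but quota allows ${quota_cpu_m}m"
  exit 1
fi
echo "OK: quota covers HPA max scale (${needed_m}m <= ${quota_cpu_m}m)"
```

Run against the original values (quota of 2000m), this check would have failed long before the incident.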

Verification

# Check quota has headroom
kubectl describe resourcequota compute-quota -n marketplace

# Verify HPA can scale
kubectl get hpa -n marketplace
# REPLICAS should increase toward 6

# Watch pods scaling up
kubectl get pods -n marketplace -w

# Confirm CPU utilization drops
kubectl top pods -n marketplace

Artifact Decoder

| Artifact | What It Revealed | What Was Misleading |
| --- | --- | --- |
| CLI Output | Events show "exceeded quota" — the actual error; HPA wants 6 but has 3 | kubectl top nodes shows 22-34% utilization — looks like plenty of capacity |
| Metrics | Quota used: 1.5/2 CPU; desired 6 replicas but only 3 running | Cluster has 16 allocatable CPU cores — the bottleneck is not cluster capacity |
| IaC Snippet | Quota allows 2 CPU requests; each pod needs 500m; HPA max is 10 (needs 5 CPU) | HPA config looks reasonable (70% target, 3-10 range); quota looks reasonable in isolation |
| Log Lines | ReplicaSet controller confirms quota rejection; cluster autoscaler sees nothing | "no unschedulable pods found" from cluster autoscaler makes it seem like scaling is not needed |

Skills Demonstrated

  • Understanding the difference between ResourceQuota rejection and scheduling failure
  • Recognizing that cluster autoscaler is blind to quota-blocked pods
  • Calculating quota requirements from HPA configuration and pod resource requests
  • Understanding the Kubernetes admission control pipeline (quota check before scheduling)
  • Correlating HPA desired state with actual pod count to identify the gap

Prerequisite Topic Packs