Solution

Triage

  1. Check deployment status:
    kubectl get deployment notification-service -n staging
    kubectl describe deployment notification-service -n staging
    
  2. Check the ReplicaSet events (this is where quota errors appear):
    kubectl get rs -n staging -l app=notification-service
    kubectl describe rs <replicaset-name> -n staging
    
  3. Inspect the ResourceQuota:
    kubectl describe quota -n staging
    
  4. Check if the pod spec has resource requests/limits:
    kubectl get deployment notification-service -n staging -o jsonpath='{.spec.template.spec.containers[*].resources}'
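The describe output from step 3 can also be reduced to just the two maps that matter. A sketch, assuming the quota object is named staging-quota (as in the admission error quoted below):

```shell
# Print only the used and hard maps from the quota status.
kubectl get resourcequota staging-quota -n staging \
  -o jsonpath='{.status.used}{"\n"}{.status.hard}{"\n"}'
```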
    

Root Cause

The staging namespace has a ResourceQuota that sets hard limits on CPU and memory requests, and the existing workloads already consume most of it. The new deployment requests 3 replicas at 500m CPU and 512Mi memory each (1500m CPU and 1536Mi memory in total), but the remaining headroom is only 400m CPU and 512Mi memory. The CPU headroom alone is insufficient for even a single replica.

The ReplicaSet controller attempts to create pods but the admission controller rejects them with: exceeded quota: staging-quota, requested: requests.cpu=500m,requests.memory=512Mi, used: requests.cpu=3600m,requests.memory=7680Mi, limited: requests.cpu=4000m,requests.memory=8Gi.
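The arithmetic can be sanity-checked with a quick shell sketch. The values below are hardcoded from this incident's admission error; in practice, read them from `kubectl describe quota`:

```shell
# Headroom check using the figures from the admission error above.
# Values are illustrative; substitute your own quota numbers.
replicas=3
cpu_per_pod_m=500          # requests.cpu per replica, in millicores
cpu_hard_m=4000            # quota hard limit (requests.cpu=4000m)
cpu_used_m=3600            # already consumed by other workloads

needed_m=$((replicas * cpu_per_pod_m))
headroom_m=$((cpu_hard_m - cpu_used_m))
echo "needed=${needed_m}m headroom=${headroom_m}m"

if [ "$needed_m" -gt "$headroom_m" ]; then
  echo "insufficient quota: shrink requests, scale down neighbors, or raise the quota"
fi
```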

Fix

Option 1: Increase the quota (if capacity allows):

kubectl patch resourcequota staging-quota -n staging -p '{"spec":{"hard":{"requests.cpu":"6","requests.memory":"12Gi"}}}'
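If the quota is managed declaratively (as the Common Traps section warns it should be), make the same change in the manifest and apply that instead of patching live. A sketch of the patched quota, assuming it lives in your IaC repo:

```yaml
# Sketch of staging-quota after the increase; values mirror the patch above.
# Keep this in version control so the next Helm deploy or GitOps sync
# does not revert the change.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: staging-quota
  namespace: staging
spec:
  hard:
    requests.cpu: "6"
    requests.memory: 12Gi
```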

Option 2: Reduce the new deployment's resource requests:

resources:
  requests:
    cpu: 100m      # reduced from 500m
    memory: 128Mi  # reduced from 512Mi
  limits:
    cpu: 500m
    memory: 512Mi
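The same change can be made imperatively; this triggers a rollout with the reduced requests (values are the ones from the snippet above, same caveat about updating IaC afterwards):

```shell
# Imperative equivalent of editing the manifest; starts a new rollout.
kubectl set resources deployment notification-service -n staging \
  --requests=cpu=100m,memory=128Mi \
  --limits=cpu=500m,memory=512Mi
```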

Option 3: Free up quota by scaling down idle workloads:

kubectl scale deployment old-test-app -n staging --replicas=0

Option 4: Clean up completed/evicted pods (these count against object-count quotas such as count/pods):

kubectl delete pods -n staging --field-selector status.phase=Succeeded
kubectl delete pods -n staging --field-selector status.phase=Failed

After any fix, verify pods are being created:

kubectl get pods -n staging -l app=notification-service -w

Rollback / Safety

  • Increasing quota is safe but ensure the cluster has physical capacity to back it.
  • Reducing resource requests can lead to OOM kills or CPU starvation if the application needs more than requested.
  • Scaling down other workloads in staging should be coordinated with their owners.

Common Traps

  • Looking at the Deployment events instead of the ReplicaSet. Quota rejections surface as FailedCreate events on the ReplicaSet, not the Deployment, so kubectl describe deployment often looks clean.
  • Forgetting that ResourceQuota requires resource specs. If a quota covers requests.cpu, every container in the namespace must specify requests.cpu (or have it defaulted by a LimitRange); pods without it are rejected at admission.
  • Not checking for object count quotas. Quotas can also limit count/pods, count/services, etc. The error message will tell you which resource is exceeded.
  • Assuming all quotas treat terminal pods the same. Compute quotas (requests.cpu, requests.memory) only count pods in a non-terminal state, but object-count quotas such as count/pods charge Succeeded and Failed pods until they are deleted.
  • Editing the live quota without updating IaC. The next Helm deploy or GitOps sync will revert your change.
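For the first trap above, the rejection events can be pulled directly instead of hunting through describe output. A sketch using standard event field selectors:

```shell
# List only events whose involved object is a ReplicaSet, oldest first.
kubectl get events -n staging \
  --field-selector involvedObject.kind=ReplicaSet \
  --sort-by=.lastTimestamp
```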