Cost Optimization & FinOps - Street-Level Ops¶
Practical cost reduction patterns from production clusters.
Quick Cost Audit¶
```shell
# Node count and sizes
kubectl get nodes -o custom-columns='NAME:.metadata.name,TYPE:.metadata.labels.node\.kubernetes\.io/instance-type,ZONE:.metadata.labels.topology\.kubernetes\.io/zone'

# Cluster-wide resource allocation
kubectl describe nodes | grep -E "(Name:|Allocated|requests)"

# Top resource consumers
kubectl top pods -A --sort-by=cpu | head -20
kubectl top pods -A --sort-by=memory | head -20

# Count pods per namespace
kubectl get pods -A --no-headers | awk '{print $1}' | sort | uniq -c | sort -rn

# Find PVCs and their sizes
kubectl get pvc -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,SIZE:.spec.resources.requests.storage,STATUS:.status.phase'
```
One-liner: find pods that are barely using any CPU:

```shell
kubectl top pods -A --no-headers | awk '$3+0 < 10 {print $1, $2, "cpu="$3}'
```

These are right-sizing candidates.

Debug clue: `kubectl describe nodes | grep -A5 "Allocated"` shows the gap between requested and allocatable. If requests total 90% but actual usage is 30%, you are massively over-provisioned.
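The gap can be condensed into a single over-provisioning ratio. A minimal sketch; the percentages below are illustrative stand-ins for the numbers you read out of `kubectl describe nodes` and `kubectl top nodes`:

```shell
# Illustrative inputs: total requested CPU vs. actual CPU usage, as % of allocatable
requested_pct=90
usage_pct=30

# Ratio > 2x is a strong signal to shrink requests or node count
ratio=$(awk -v r="$requested_pct" -v u="$usage_pct" 'BEGIN { printf "%.1f", r / u }')
echo "requests are ${ratio}x actual usage"
```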
Pattern: The Monthly Cost Review¶
Run this checklist monthly:
- Right-sizing: Compare VPA recommendations to current requests
- Orphaned resources: PVCs, Services (LoadBalancer), unused ConfigMaps
- Node utilization: Target 50-70% average CPU
- Spot coverage: What percentage of workloads are on spot?
- Log volume: Check Loki/CloudWatch ingestion rates
- Reserved capacity: Are reserved instances/savings plans still right-sized?
Pattern: Resource Request Guidelines¶
| Workload type | CPU request | Memory request | CPU limit |
|---|---|---|---|
| Web API | p95 usage | p99 usage + 20% | None or 4x request |
| Background worker | p95 usage | p99 usage + 20% | None |
| Database | Dedicated | Dedicated + buffer | Equal to request |
| Batch job | Average usage | Peak usage | None |
Why no CPU limits for most workloads: CPU limits cause throttling even when the node has idle CPU. This increases latency without saving money. Memory limits are essential (OOMKill is better than node instability).
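The "Web API" row of the table translates to a resource stanza like this. A sketch only: the numbers are illustrative, and you should derive them from your own p95/p99 usage data:

```yaml
# Web API: CPU request at ~p95 usage, memory request at ~p99 + 20%,
# no CPU limit (avoids CFS throttling), memory limit as the hard ceiling.
resources:
  requests:
    cpu: 250m        # ~p95 observed CPU (illustrative)
    memory: 300Mi    # ~p99 observed memory + 20% (illustrative)
  limits:
    memory: 300Mi    # memory limit only; OOMKill beats node instability
```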
Under the hood: CPU throttling happens via CFS (Completely Fair Scheduler) bandwidth control. Even if the node has 50% idle CPU, a pod at its limit gets throttled. Check `container_cpu_cfs_throttled_seconds_total` in Prometheus to find victims.

Default trap: Kubernetes defaults to no resource requests or limits. Without requests, the scheduler cannot bin-pack efficiently, and pods compete freely for CPU during contention: your latency-sensitive API gets starved by a batch job.
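One way to close the default trap is a namespace `LimitRange`, which injects defaults into any container that omits them. A sketch; the namespace name and values are hypothetical:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-requests
  namespace: team-a            # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:          # applied when a container omits requests
        cpu: 100m
        memory: 128Mi
      default:                 # applied when a container omits limits
        memory: 256Mi          # memory limit only, per the guidance above
```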
Pattern: Namespace Budget Alerts¶
```yaml
# Prometheus alert: namespace cost exceeding budget
groups:
  - name: cost-alerts
    rules:
      - alert: NamespaceCPUBudgetExceeded
        expr: |
          sum by (namespace) (kube_pod_container_resource_requests{resource="cpu"}) > 8
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Namespace {{ $labels.namespace }} requesting >8 CPU cores"
      - alert: OrphanedPVCs
        expr: |
          kube_persistentvolumeclaim_status_phase{phase="Bound"} == 1
          unless on (persistentvolumeclaim, namespace)
            kube_pod_spec_volumes_persistentvolumeclaims_info
        for: 24h
        labels:
          severity: info
        annotations:
          summary: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} not mounted for 24h"
```
War story: A team ran three `m5.2xlarge` nodes in dev "to match prod." Monthly cost: $2,100. Average CPU usage: 8%. Switching to a single `t3.large` spot instance with scale-to-zero after 7 PM saved $1,900/month, enough to fund their observability tooling.
Anti-Pattern: Oversized Dev Environments¶
Dev clusters that mirror production sizing waste money:
```shell
# Dev should have:
# - Fewer replicas (1 instead of 3)
# - Smaller resource requests (50% of prod)
# - Smaller PVCs
# - No multi-AZ
# - Spot instances only
# - Scale to 0 after hours
```
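The replica and request reductions can be enforced declaratively. A minimal sketch assuming a Kustomize layout with a hypothetical `api` Deployment in the base:

```yaml
# overlays/dev/kustomization.yaml (hypothetical paths and names)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - patch: |-
      - op: replace
        path: /spec/replicas
        value: 1               # 1 replica instead of 3
      - op: replace
        path: /spec/template/spec/containers/0/resources/requests/cpu
        value: 100m            # ~50% of the prod request
    target:
      kind: Deployment
      name: api
```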
Anti-Pattern: LoadBalancer per Service¶
Each `type: LoadBalancer` Service creates its own cloud load balancer at $15-25/month.
Fix: Use an Ingress controller (one LB for all services):
```shell
# Instead of 10 LoadBalancer Services ($200/month),
# use 1 Ingress controller + 10 Ingress rules ($20/month)
```
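The consolidation looks like this: one controller Service of `type: LoadBalancer`, with plain `ClusterIP` Services behind Ingress rules. A sketch; the host, names, and ingress class are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: apps                   # hypothetical
spec:
  ingressClassName: nginx      # assumes an ingress-nginx controller
  rules:
    - host: api.example.com    # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api      # now a ClusterIP Service, no cloud LB
                port:
                  number: 80
```

Each additional service becomes one more rule on the same Ingress (or a separate Ingress object sharing the controller), not one more load balancer.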
Scale note: In AWS, each LoadBalancer Service also creates an ENI per subnet and a security group. At scale you hit VPC quotas (security groups default to 2,500 per region). NLB is cheaper than ALB for pure TCP, but ALB supports path-based routing, which further reduces LB count.
Remember: FinOps cost-driver mnemonic: CDRN — Compute, Data transfer, Reserved capacity gaps, NAT gateways. Review all four monthly. Data transfer and NAT are the "invisible" costs that blindside teams who only watch compute.
Gotcha: Daemonsets and Sidecars¶
Daemonsets run on every node. Each sidecar (mesh proxy, log collector) adds overhead to every pod.
```shell
# Calculate daemonset overhead
kubectl get ds -A -o custom-columns='NAME:.metadata.name,CPU:.spec.template.spec.containers[*].resources.requests.cpu,MEM:.spec.template.spec.containers[*].resources.requests.memory'
# Multiply by node count for total overhead
```
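The multiplication can be sketched with awk. The per-daemonset request values below are made-up stand-ins for the output of the command above:

```shell
# Sum per-node daemonset requests, then multiply by node count.
# Columns: CPU (millicores)  memory (Mi); values are illustrative.
NODES=10
overhead=$(awk -v nodes="$NODES" '
  { cpu += $1; mem += $2 }
  END { printf "%dm CPU, %dMi memory", cpu * nodes, mem * nodes }' <<'EOF'
100 128
50 64
200 256
EOF
)
echo "cluster-wide daemonset overhead: $overhead"
```

With three daemonsets reserving 350m/448Mi per node, ten nodes carry 3.5 cores and ~4.4Gi of pure agent overhead, before any application pod runs.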
Gotcha: NAT Gateway data processing charges are the silent killer in AWS. At $0.045/GB, a cluster pulling 100GB/day of container images through NAT costs $135/month just in data charges. Use VPC endpoints for ECR/S3 to eliminate this.
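The arithmetic behind that figure, as a one-line sanity check:

```shell
# NAT data-processing cost: GB/day through NAT x 30 days x $0.045/GB
nat_cost=$(awk 'BEGIN { printf "%.0f", 100 * 30 * 0.045 }')
echo "\$${nat_cost}/month"
```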
Gotcha: Orphaned EBS volumes persist after you delete the EC2 instance or PV. At $0.10/GB/month, a forgotten 500GB volume costs $600/year. Find them monthly:

```shell
aws ec2 describe-volumes --filters Name=status,Values=available \
  --query 'Volumes[].{ID:VolumeId,Size:Size,Created:CreateTime}'
```
Quick Savings Calculator¶
```
Monthly node cost: $X
Nodes in cluster: N
Average utilization: U%

Right-sizing:
  U < 40%    -> can likely reduce nodes by 30-40%
  U = 40-60% -> well-optimized
  U > 70%    -> may need more nodes for reliability

Spot savings (for eligible workloads):
  current on-demand cost * eligible_fraction * 0.7

Dev/staging off-hours savings:
  node cost * (14 off-hours / 24 hours) * (5 weekdays / 7 days) = ~42% savings
```
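The calculator runs fine as shell arithmetic. A sketch with illustrative inputs; plug in your own numbers:

```shell
# Hypothetical inputs
node_cost=150      # $/node/month
nodes=10
eligible=0.6       # fraction of workloads safe on spot

# Spot: ~70% discount on the eligible fraction
spot=$(awk -v c="$node_cost" -v n="$nodes" -v e="$eligible" \
  'BEGIN { printf "%.0f", c * n * e * 0.7 }')

# Off-hours: 14h/night on the 5 weekdays, as a share of the full 168h week
offhours=$(awk -v c="$node_cost" -v n="$nodes" \
  'BEGIN { printf "%.0f", c * n * (14 * 5) / (24 * 7) }')

echo "spot savings:      \$${spot}/month"
echo "off-hours savings: \$${offhours}/month"
```

Shutting dev down entirely on weekends adds another 2/7 (~29%) on top of the weekday-nights figure.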
Quick Reference¶
- Cheatsheet: FinOps