

FinOps & Cost Optimization Drills

Remember: the FinOps cycle is Inform (visibility — who spends what) -> Optimize (right-size, reserved instances, spot) -> Operate (governance, budgets, alerts). You cannot optimize what you cannot measure, so start with tagging and cost allocation before buying reserved instances.

Gotcha: Kubernetes resource requests determine scheduling and cost, not limits. A pod requesting 4 CPU but using 0.5 CPU wastes 3.5 cores of cluster capacity. Right-sizing requests (not limits) is where the real savings are. Use VPA recommendations or metrics like container_cpu_usage_seconds_total vs kube_pod_container_resource_requests to find waste.
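The requests-vs-usage gap above can be sketched in pure shell: normalize Kubernetes CPU quantities ("500m" vs "4") to millicores, then waste = requested - used. The sample values are illustrative stand-ins for live metrics, not output from a real cluster.

```shell
# Normalize a K8s CPU quantity to millicores (integer quantities only)
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;             # already millicores, e.g. "500m"
    *)  echo "$(( $1 * 1000 ))" ;;   # whole cores, e.g. "4"
  esac
}

requested=$(to_millicores "4")       # the pod's request
used=$(to_millicores "500m")         # actual usage (e.g. from kubectl top)
waste=$(( requested - used ))
echo "wasted: ${waste}m"             # the 3.5 idle cores from the example
```

This is exactly the arithmetic the PromQL waste queries in Drill 6 perform cluster-wide.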

Drill 1: Identify Over-Provisioned Pods

Difficulty: Easy

Q: Write a kubectl command to find pods requesting more than 1 CPU or 2Gi memory in the production namespace.

Answer
# CPU request > 1 core (normalize "1500m" and "2" to millicores first)
kubectl get pods -n production -o json | jq -r '
  .items[] | .metadata.name as $pod | .spec.containers[] |
  select(.resources.requests.cpu != null) |
  select((.resources.requests.cpu |
          if endswith("m") then rtrimstr("m") | tonumber
          else tonumber * 1000 end) > 1000) |
  "\($pod)/\(.name): cpu=\(.resources.requests.cpu)"'

# Simpler: use kubectl-resource-capacity plugin
kubectl resource-capacity -n production --sort cpu.request --pods
For broader analysis, use VPA in recommendation mode:
kubectl describe vpa -n production | grep -A4 "Target"
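The memory half of the question needs the same normalization trick: convert quantities to a common unit (Mi here) so a 2Gi threshold is comparable against "512Mi" or "3Gi" requests. A pure-shell sketch of that conversion, assuming integer quantities:

```shell
# Normalize a K8s memory quantity to Mi
to_mi() {
  case "$1" in
    *Gi) echo "$(( ${1%Gi} * 1024 ))" ;;
    *Mi) echo "${1%Mi}" ;;
    *Ki) echo "$(( ${1%Ki} / 1024 ))" ;;
  esac
}

threshold=$(to_mi "2Gi")   # 2048
[ "$(to_mi "3Gi")" -gt "$threshold" ] && echo "3Gi exceeds the 2Gi threshold"
```

Drop the same `if endswith(...)` logic into the jq filter above to flag memory alongside CPU.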

Drill 2: Set Up VPA Recommendations

Difficulty: Easy

Q: Create a VPA in recommendation-only mode for the api-server Deployment.

Answer
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 4
        memory: 8Gi
# Check recommendations after a few hours of data
kubectl describe vpa api-server-vpa -n production
# Target:     cpu=350m, memory=512Mi  ← use these as new requests
`updateMode: "Off"` = recommend only, never modify pods.
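The recommendations are also available as structured JSON under `.status.recommendation`, which is easier to feed into automation than grepping `describe` output. A sketch with jq; the inline document stands in for `kubectl get vpa api-server-vpa -n production -o json`:

```shell
# Fake VPA status standing in for real kubectl output
vpa_json='{"status":{"recommendation":{"containerRecommendations":[{"containerName":"api-server","target":{"cpu":"350m","memory":"512Mi"}}]}}}'

# Pull out the per-container targets as "name: cpu=... memory=..."
targets=$(echo "$vpa_json" | jq -r '
  .status.recommendation.containerRecommendations[] |
  "\(.containerName): cpu=\(.target.cpu) memory=\(.target.memory)"')
echo "$targets"
```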

Drill 3: ResourceQuota

Difficulty: Easy

Q: Create a ResourceQuota for team-alpha namespace limiting total requests to 10 CPU / 20Gi memory and max 30 pods.

Answer
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-alpha-quota
  namespace: team-alpha
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
    pods: "30"
    persistentvolumeclaims: "15"
    services.loadbalancers: "2"
# Check usage
kubectl describe resourcequota team-alpha-quota -n team-alpha
# Shows: Used / Hard for each resource
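Conceptually, the quota is a running ledger: at admission time the API server sums the requests of everything already in the namespace and rejects any pod that would push Used past Hard. A minimal sketch of that accounting (illustrative numbers in millicores, not the real admission code):

```shell
hard=10000        # requests.cpu: "10" -> 10000m
used=9500         # sum of requests already admitted
incoming=1000     # the new pod's request

if [ $(( used + incoming )) -gt "$hard" ]; then
  verdict="Forbidden: exceeded quota"
else
  verdict="admitted"
fi
echo "$verdict"
```

Note that once `requests.cpu` appears in a quota, every pod in the namespace must specify CPU requests — pair the quota with the LimitRange from Drill 4 so unspecified pods get defaults instead of rejections.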

Drill 4: LimitRange Defaults

Difficulty: Easy

Q: Create a LimitRange that sets default requests (100m CPU, 128Mi) and limits (500m CPU, 512Mi) for any container that doesn't specify them.

Answer
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-alpha
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    max:
      cpu: "2"
      memory: "4Gi"
    min:
      cpu: "50m"
      memory: "64Mi"
- `default` = applied as limits if none specified
- `defaultRequest` = applied as requests if none specified
- `max`/`min` = hard bounds even if the user specifies values
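The precedence boils down to "user value wins, default fills the gap" (clamping to max/min happens separately at admission). A one-line sketch of that rule:

```shell
# User-specified value takes precedence; default only fills a missing value
apply_default() { echo "${1:-$2}"; }

apply_default ""     "100m"   # container set nothing -> default 100m
apply_default "250m" "100m"   # container set 250m    -> kept as-is
```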

Drill 5: Spot Instance Workloads

Difficulty: Medium

Q: Configure a Deployment to prefer spot instances but tolerate being scheduled on on-demand if spot is unavailable. Ensure pods spread across zones.

Answer
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 10
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker   # must match the topologySpreadConstraints selector
    spec:
      tolerations:
      - key: karpenter.sh/capacity-type
        operator: Equal
        value: spot
        effect: NoSchedule
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            preference:
              matchExpressions:
              - key: karpenter.sh/capacity-type
                operator: In
                values: ["spot"]
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: worker
      terminationGracePeriodSeconds: 30
      containers:
      - name: worker
        image: worker:latest   # placeholder image
        # Handle SIGTERM gracefully for spot interruptions
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 5"]
Key: `preferredDuringScheduling` (not required) + zone spread = resilient spot usage.

Drill 6: Cost PromQL Queries

Difficulty: Medium

Q: Write PromQL queries to find: (a) total CPU waste, (b) most over-provisioned namespaces, (c) idle pods, and (d) memory waste by namespace.

Answer
# (a) Total CPU waste: requested minus actually used
sum(kube_pod_container_resource_requests{resource="cpu", unit="core"})
-
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m]))

# (b) Namespace CPU overprovisioning ratio
sort_desc(
  sum by(namespace)(kube_pod_container_resource_requests{resource="cpu"})
  /
  sum by(namespace)(rate(container_cpu_usage_seconds_total{container!=""}[5m]))
)
# Ratio > 3 means requesting 3x what's actually used

# (c) Pods with near-zero CPU usage (< 1m) over the last hour
sum by(namespace, pod)(rate(container_cpu_usage_seconds_total{container!=""}[1h])) < 0.001

# (d) Memory waste by namespace
sum by(namespace)(kube_pod_container_resource_requests{resource="memory"})
-
sum by(namespace)(container_memory_working_set_bytes{container!=""})
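The "ratio > 3" rule of thumb from (b) is just requested divided by used. Computed locally on sample numbers (12 requested cores against 3 used), to make the threshold concrete:

```shell
# Overprovisioning ratio: requested cores / actually used cores
ratio=$(awk 'BEGIN { printf "%.1f", 12 / 3 }')
verdict=$(awk -v r="$ratio" 'BEGIN { print ((r > 3) ? "OVERPROVISIONED" : "ok") }')
echo "ratio=$ratio $verdict"
```

A namespace at 4x is requesting four cores for every core it burns — a prime right-sizing target.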

Drill 7: Karpenter Consolidation

Difficulty: Medium

Q: Configure Karpenter to consolidate underutilized nodes and use multiple instance types for cost optimization.

Answer
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        name: default   # must reference an existing EC2NodeClass
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      - key: node.kubernetes.io/instance-type
        operator: In
        values:
        - m5.large
        - m5.xlarge
        - m5a.large
        - m5a.xlarge
        - m6i.large
        - m6i.xlarge
        - c5.large
        - c5.xlarge
      - key: topology.kubernetes.io/zone
        operator: In
        values: ["us-east-1a", "us-east-1b", "us-east-1c"]
  disruption:
    # In v1beta1, consolidateAfter is only valid with WhenEmpty;
    # WhenUnderutilized consolidates as soon as a cheaper layout exists.
    consolidationPolicy: WhenUnderutilized
  limits:
    cpu: "100"
    memory: "400Gi"
`WhenUnderutilized` automatically:
- Terminates empty nodes
- Replaces nodes with cheaper alternatives
- Consolidates pods onto fewer nodes

Multiple instance types = Karpenter picks the cheapest available.
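"Cheapest available" is literally a sort over the candidate instance types' hourly rates. A sketch with illustrative placeholder prices (not real AWS rates):

```shell
# Pick the lowest hourly rate from a candidate list (instance, $/hour)
cheapest=$(printf '%s\n' \
  'm5.large 0.096' \
  'm5a.large 0.086' \
  'c5.large 0.085' |
  sort -k2 -n | head -n1)
echo "$cheapest"
```

The wider the requirements list, the better the odds a cheap spot pool has capacity when Karpenter provisions.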

Drill 8: Scheduled Scaling

Difficulty: Medium

Q: Scale down dev/staging environments outside business hours (6pm-8am and weekends) to save costs.

Answer
# Using a CronJob to scale down
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-dev
  namespace: dev
spec:
  schedule: "0 18 * * 1-5"  # 6pm Mon-Fri
  timeZone: "America/New_York"  # K8s 1.27+; without it, schedules run in UTC
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler
          containers:
          - name: kubectl
            image: bitnami/kubectl
            command:
            - /bin/sh
            - -c
            - |
              for deploy in $(kubectl get deploy -n dev -o name); do
                kubectl scale $deploy -n dev --replicas=0
              done
          restartPolicy: OnFailure
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-dev
  namespace: dev
spec:
  schedule: "0 8 * * 1-5"  # 8am Mon-Fri
  timeZone: "America/New_York"  # K8s 1.27+; without it, schedules run in UTC
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler
          containers:
          - name: kubectl
            image: bitnami/kubectl
            command:
            - /bin/sh
            - -c
            - |
              for deploy in $(kubectl get deploy -n dev -o name); do
                kubectl scale $deploy -n dev --replicas=1
              done
          restartPolicy: OnFailure
Caveat: the scale-up job restores every Deployment to 1 replica, discarding the original counts — record them in an annotation before scaling down if they matter. Better approach: use KEDA or kube-downscaler for annotation-based scheduling.
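Why this drill pays off: 6pm-8am on weekdays plus full weekends covers most of the week. The back-of-envelope arithmetic:

```shell
# Hours per week the environment is scaled to zero under this schedule
weeknight_hours=$(( 14 * 5 ))   # 6pm-8am = 14h, Mon-Fri
weekend_hours=$(( 24 * 2 ))     # Sat + Sun
off_hours=$(( weeknight_hours + weekend_hours ))
pct=$(( off_hours * 100 / 168 ))
echo "scaled down ${off_hours}h/week (~${pct}% of the week)"
```

Roughly 70% of dev/staging compute hours eliminated with two CronJobs.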

Drill 9: PVC Cost Audit

Difficulty: Easy

Q: Find unbound or unused PVCs that are costing money.

Answer
# Unbound PVCs (provisioned but not attached); --no-headers drops the header row
kubectl get pvc -A --no-headers | grep -v Bound

# PVCs not mounted by any pod
kubectl get pvc -A -o json | jq -r '
  .items[] |
  select(.status.phase == "Bound") |
  "\(.metadata.namespace)/\(.metadata.name) - \(.spec.resources.requests.storage)"' | \
while read pvc; do
  ns=$(echo $pvc | cut -d/ -f1)
  name=$(echo $pvc | cut -d/ -f2 | cut -d' ' -f1)
  used=$(kubectl get pods -n $ns -o json | jq -r \
    ".items[].spec.volumes[]? | select(.persistentVolumeClaim.claimName == \"$name\")" 2>/dev/null)
  if [ -z "$used" ]; then
    echo "UNUSED: $pvc"
  fi
done

# Check total PV storage provisioned in Gi (assumes every PV is sized in Gi)
kubectl get pv -o json | jq '[.items[].spec.capacity.storage | rtrimstr("Gi") | tonumber] | add'

Drill 10: Cost Allocation Tags

Difficulty: Easy

Q: Write a Kyverno policy that requires all Deployments to have a cost-center label for showback/chargeback reporting.

Answer
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cost-center
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-cost-center
    match:
      any:
      - resources:
          kinds: ["Deployment", "StatefulSet", "DaemonSet"]
    exclude:
      any:
      - resources:
          namespaces: ["kube-system", "kube-public", "monitoring"]
    validate:
      message: "A 'cost-center' label is required for cost allocation. Example: cost-center=engineering-platform"
      pattern:
        metadata:
          labels:
            cost-center: "?*"
Combine with Kubecost or OpenCost to generate reports grouped by the `cost-center` label.
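The `"?*"` pattern means "present and non-empty". The same check can be run locally against a manifest with jq before it ever reaches the admission webhook — a sketch, not the Kyverno engine itself:

```shell
# Mimic the policy's "?*" check: cost-center must exist and be non-empty
check_label() {
  echo "$1" | jq -e '(.metadata.labels["cost-center"] // "") | length > 0' \
    >/dev/null && echo allowed || echo denied
}

check_label '{"metadata":{"labels":{"app":"api"}}}'
check_label '{"metadata":{"labels":{"cost-center":"engineering-platform"}}}'
```

Useful as a CI pre-check so developers see the denial before `kubectl apply` does.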
