
Portal | Level: L2: Operations | Topics: FinOps | Domain: DevOps & Tooling

FinOps / Cost Optimization - Skill Check

Mental model (bottom-up)

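You pay for what you request, not what you use: requests reserve node capacity, and reserved capacity determines how many nodes you run. Over-requesting means paying for idle resources, so right-sizing requests is the single biggest cost lever. Spot instances and autoscaling multiply those savings.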

Visual stack

[Actually Used    ]  what your app needs (100m CPU)
|
[Requested        ]  what you reserved (500m CPU) <-- you pay for this
|
[Limited          ]  what you're allowed to burst to (1000m CPU)
|
[Node Capacity    ]  total available on the node
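The gap between the "Actually Used" and "Requested" layers can be turned into a rough number. The following is a minimal sketch, not a billing model: the function names, the replica count, and the price of 0.04 per core-hour are hypothetical values chosen for illustration, and 730 is an approximate number of hours in a month.

```python
def idle_share(used_millicores: float, requested_millicores: float) -> float:
    """Fraction of the reserved CPU that is paid for but sits idle."""
    if requested_millicores <= 0:
        raise ValueError("requested_millicores must be positive")
    idle = max(requested_millicores - used_millicores, 0)
    return idle / requested_millicores


def monthly_idle_cost(
    used_millicores: float,
    requested_millicores: float,
    replicas: int,
    price_per_core_hour: float,  # hypothetical price, not a real quote
    hours_per_month: float = 730,  # approximate hours in a month
) -> float:
    """Rough monthly cost of the reserved-but-unused CPU across all replicas."""
    idle_cores = max(requested_millicores - used_millicores, 0) / 1000
    return idle_cores * replicas * price_per_core_hour * hours_per_month


# Numbers from the visual stack: 100m used, 500m requested.
share = idle_share(100, 500)
cost = monthly_idle_cost(100, 500, replicas=10, price_per_core_hour=0.04)
print(f"idle share: {share:.0%}")           # idle share: 80%
print(f"idle cost per month: {cost:.2f}")   # idle cost per month: 116.80
```

With the diagram's figures, four fifths of the reserved CPU is idle, which is why the request, not the limit, is the number to right-size first.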

Glossary

  • requests - guaranteed resources; scheduler uses for placement
  • limits - maximum allowed; OOMKill if memory exceeded, throttle if CPU exceeded
  • VPA - Vertical Pod Autoscaler; recommends/sets resource requests
  • HPA - Horizontal Pod Autoscaler; scales replicas based on metrics
  • spot/preemptible - discounted instances that can be reclaimed
  • Karpenter - node provisioner that originated at AWS; typically provisions nodes faster than Cluster Autoscaler
  • error budget - the amount of unreliability an SLO allows; a related concept for deciding how much to invest in reliability
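To make "scales replicas based on metrics" concrete, the core HPA scaling rule documented by Kubernetes is desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), with no change when the ratio is close to 1. The sketch below is a simplified illustration of that rule only: the function name is hypothetical, the 0.1 tolerance is assumed to match the controller's default, and it ignores min/max replicas, stabilization windows, and pod readiness.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    tolerance: float = 0.1,  # assumed default tolerance around a ratio of 1.0
) -> int:
    """Simplified HPA rule: scale replicas by the metric-to-target ratio."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within tolerance the controller leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


print(desired_replicas(4, 90, 60))  # 6: usage is 1.5x the target, scale up
print(desired_replicas(4, 63, 60))  # 4: within tolerance, no change
print(desired_replicas(8, 30, 60))  # 4: usage is half the target, scale down
```

Scaling down when load drops is where the cost saving comes from; HPA scales on the metric relative to its target, so an inflated request also distorts utilization-based targets.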

Core questions (easy -> hard)

  • Where does most K8s cost waste come from?
    Over-provisioned requests. Teams often request several times their actual usage (as much as 5x).
  • Requests vs limits: what are the cost implications?
    Requests reserve node capacity, so you pay for them. Limits only cap burst. Right-size the requests.
  • What does VPA do?
    Watches usage and recommends optimal requests. Update modes: Off (recommend only), Initial (apply at pod creation), Recreate/Auto (apply by evicting pods).
  • When to use spot instances?
    For stateless, multi-replica, fault-tolerant workloads. Not for single-replica databases.
  • How to prevent over-provisioning?
    ResourceQuotas, LimitRanges, VPA, cost dashboards, and chargeback.
  • Should you set CPU limits?
    Usually no. A CPU limit throttles the container once its quota for the scheduling period is spent, even when the node has idle CPU. Always set memory limits.
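The throttling behaviour behind the CPU-limit answer can be sketched with a little arithmetic. On Linux, a CPU limit is enforced as a CFS quota of CPU time per scheduling period (100 ms by default), so a 1000m limit allows 100 ms of CPU time per period. The model below is deliberately simplified and the function name is hypothetical: it assumes every thread is runnable for the whole period and that the node has enough idle cores to run them all in parallel.

```python
def throttled_ms_per_period(
    limit_millicores: float,
    busy_threads: int,
    period_ms: float = 100.0,  # default CFS period
) -> float:
    """Wall-clock ms per period the container spends throttled (simplified)."""
    if busy_threads <= 0:
        raise ValueError("busy_threads must be positive")
    quota_ms = period_ms * limit_millicores / 1000  # CPU time allowed per period
    demand_ms = busy_threads * period_ms            # CPU time the threads want
    if demand_ms <= quota_ms:
        return 0.0
    # Threads running in parallel burn the quota busy_threads times faster.
    wall_ms_until_quota_spent = quota_ms / busy_threads
    return period_ms - wall_ms_until_quota_spent


print(throttled_ms_per_period(1000, 4))  # 75.0: quota gone after 25 ms
print(throttled_ms_per_period(1000, 1))  # 0.0: one thread fits the quota
print(throttled_ms_per_period(500, 1))   # 50.0: half of each period stalled
```

In the first case the container stalls for 75 ms of every 100 ms regardless of how many cores on the node are idle, which is the latency cost that makes CPU limits a poor default.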
