Level: L2: Operations | Topics: FinOps | Domain: DevOps & Tooling
FinOps / Cost Optimization - Skill Check
Mental model (bottom-up)
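In Kubernetes you effectively pay for what pods request, not what they use: requests reserve node capacity, and node capacity is what gets billed. Over-requesting means paying for idle resources. Right-sizing requests is usually the single biggest cost lever; spot instances and autoscaling then multiply the savings.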
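As a rough illustration of right-sizing, the sketch below derives a CPU request from observed usage samples. The function name, the 95th-percentile choice, and the 15% headroom are all hypothetical values picked for the example; this is not how VPA or any particular tool computes its recommendation.

```python
import math

def suggest_cpu_request_m(usage_samples_m, percentile=0.95, headroom=1.15):
    """Suggest a CPU request in millicores from observed usage samples.

    Hypothetical heuristic: nearest-rank percentile of usage times a safety margin.
    """
    if not usage_samples_m:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_samples_m)
    # Nearest-rank percentile: smallest sample with at least
    # `percentile` of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    p_value = ordered[rank - 1]
    # Round up so the suggestion never falls below percentile * headroom.
    return math.ceil(p_value * headroom)

# Illustrative samples (millicores); real input would come from your metrics store.
samples = [80, 90, 95, 100, 100, 105, 110, 120, 130, 150]
print(suggest_cpu_request_m(samples))
```

A percentile plus headroom is used here instead of the peak so that one outlier does not inflate the request; how aggressive to be is a judgment call per workload.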
Visual stack
[Actually Used ] what your app needs (100m CPU)
|
[Requested ] what you reserved (500m CPU) <-- you pay for this
|
[Limited ] what you're allowed to burst to (1000m CPU)
|
[Node Capacity ] total available on the node
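The figures in the stack (100m used, 500m requested) can be turned into a quick idle-cost estimate. This is a minimal sketch: the price per vCPU-hour and the 730-hour month are made-up assumptions for illustration, not real cloud pricing.

```python
def idle_share(used_m, requested_m):
    """Fraction of the requested CPU that is reserved but not used."""
    if requested_m <= 0:
        raise ValueError("requested_m must be positive")
    # Usage above the request means nothing is idle, so clamp at zero.
    return max(0.0, 1 - used_m / requested_m)

used_m, requested_m = 100, 500   # millicores, taken from the stack above
price_per_vcpu_hour = 0.04       # hypothetical price, not a real quote
hours_per_month = 730            # assumed average month length

# Cost follows the request (the reservation), not the usage.
monthly_cost = requested_m / 1000 * price_per_vcpu_hour * hours_per_month
wasted = monthly_cost * idle_share(used_m, requested_m)

print(f"idle share: {idle_share(used_m, requested_m):.0%}")
print(f"monthly cost: {monthly_cost:.2f}, of which idle: {wasted:.2f}")
```

With these inputs, four fifths of the reservation is idle, whatever the actual unit price turns out to be.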
Glossary
- requests - guaranteed resources; the scheduler uses them for placement
- limits - maximum allowed; the container is OOMKilled if it exceeds its memory limit and throttled if it exceeds its CPU limit
- VPA - Vertical Pod Autoscaler; recommends/sets resource requests
- HPA - Horizontal Pod Autoscaler; scales replicas based on metrics
- spot/preemptible - discounted instances that can be reclaimed
- Karpenter - flexible node provisioner (originated at AWS); typically provisions nodes faster than Cluster Autoscaler
- error budget - the amount of unreliability an SLO allows; a related concept for deciding how much to invest in reliability
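To make the HPA entry concrete, the sketch below applies the replica formula given in the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`. It is deliberately simplified: the tolerance band, min/max replica bounds, and stabilization behavior are ignored, and the function name is hypothetical.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric):
    """Core HPA scaling ratio, without tolerance, bounds, or stabilization."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 replicas averaging 90% CPU utilization against a 60% target: scale out.
print(desired_replicas(4, 90, 60))
# 4 replicas averaging 30% against the same target: scale in.
print(desired_replicas(4, 30, 60))
```

Because HPA utilization targets are expressed relative to requests, changing a workload's requests also changes when it scales.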
Core questions (easy -> hard)
- Where does most K8s cost waste come from?
  - Over-provisioned requests. Teams commonly request far more than they use, on the order of 5x.
- Requests vs limits cost implications?
  - Requests reserve node capacity, so they drive what you pay. Limits only cap burst and reserve nothing. Right-size requests first.
- What does VPA do?
  - Watches usage and recommends optimal requests. Modes: Off (recommend only), Initial (apply at pod creation), Auto (also update running pods).
- When to use spot instances?
  - Stateless, multi-replica, fault-tolerant workloads. Not for single-replica databases.
- How to prevent over-provisioning?
  - ResourceQuotas, LimitRanges, VPA, cost dashboards, chargeback.
- Should you set CPU limits?
  - Usually no. CPU limits can throttle a container even when the node has idle CPU. Always set memory limits.
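The throttling point in the last answer follows from how CPU limits are enforced: the limit becomes a CPU-time quota per CFS period (100 ms by default), and once the quota is spent the container waits for the next period regardless of how idle the node is. The sketch below is a simplified model that assumes every thread is fully CPU-bound for the whole period; the function name is hypothetical.

```python
CFS_PERIOD_MS = 100.0  # default CFS bandwidth period (cpu.cfs_period_us = 100000)

def throttled_ms_per_period(limit_m, busy_threads, period_ms=CFS_PERIOD_MS):
    """Wall-clock ms per period a container spends throttled.

    Simplified model: `busy_threads` threads all want CPU for the entire period.
    """
    if limit_m <= 0 or busy_threads <= 0:
        raise ValueError("limit_m and busy_threads must be positive")
    quota_ms = limit_m / 1000 * period_ms   # CPU time allowed per period
    demand_ms = busy_threads * period_ms    # CPU time wanted per period
    if demand_ms <= quota_ms:
        return 0.0
    # The threads burn the quota in parallel, then sit idle until the next period.
    runnable_wall_ms = quota_ms / busy_threads
    return period_ms - runnable_wall_ms

# A 1000m limit with 4 busy threads: quota is spent after 25 ms of wall time.
print(throttled_ms_per_period(1000, 4))
# The same limit with a single busy thread never exceeds the quota.
print(throttled_ms_per_period(1000, 1))
```

In this model the four-thread container is stalled for three quarters of every period, which shows up as latency even though its average CPU use matches the limit.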