Lab 17: Performance Tuning

| Field | Value |
|---|---|
| Tier | 4 (Advanced) |
| Estimated Time | 90 minutes |
| Prerequisites | k3s cluster |
| Auto-Grade | Yes |

Scenario

The product team reports that page load times have increased from 200ms to over 3 seconds in the past month. Users are churning and the CEO has escalated this to a P1. The application hasn't changed — the same code runs on the same infrastructure. The problem is resource configuration, not code.

Your investigation needs to find and fix five performance bottlenecks hidden in the Kubernetes deployment configuration: CPU requests set too low and CPU limits set too tight, causing throttling; memory limits too close to actual usage, causing OOM pressure; no HorizontalPodAutoscaler to absorb peak traffic; no PodDisruptionBudget, so voluntary disruptions such as node drains during upgrades can evict every replica at once; and a liveness probe so aggressive that it causes unnecessary restarts.

Objectives

  • Fix CPU requests/limits that cause throttling (requests too low, limits too tight)
  • Fix memory limits that cause OOM pressure (increase headroom to 2x average usage)
  • Add HPA to handle traffic spikes (min 2, max 10, CPU target 60%)
  • Add a PodDisruptionBudget (minAvailable: 1) so that voluntary disruptions, such as node drains during upgrades, cannot evict every pod at once
  • Fix the liveness probe timing (increase period and failure threshold)
  • Verify no pods are in CrashLoopBackOff or being throttled
  • Document findings in /tmp/lab-perf/tuning-report.txt
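The CrashLoopBackOff check in the objectives can be scripted. The following is a minimal sketch, assuming the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...); the helper name `crashloop_pods` is hypothetical and not part of the lab's tooling.

```shell
# Print the names of pods whose STATUS column is CrashLoopBackOff.
# Reads `kubectl get pods --no-headers` output on stdin, where the
# columns are: NAME READY STATUS RESTARTS AGE
crashloop_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Against a live cluster (not executed by this snippet):
#   kubectl get pods -n lab-perf --no-headers | crashloop_pods
```

Empty output means no pod is currently crash-looping. Throttling does not show up in this listing and has to be checked separately.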

Setup

./setup.sh

The script deploys a deliberately misconfigured application into the `lab-perf` namespace.

Hints

Hint 1: Detecting CPU throttling
Check `kubectl top pods -n lab-perf`. If CPU usage sits at or near the limit, the container is probably being throttled. Raise the CPU limit, and set the request close to typical usage so the scheduler reserves enough CPU for the pod.
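`kubectl top` reports usage, not throttling itself. The kernel records throttling in the container's cgroup `cpu.stat` file as `nr_periods` and `nr_throttled`. The sketch below turns those two counters into a percentage. The helper name `throttle_pct` and the `POD` placeholder are hypothetical, and the file path in the comment assumes cgroup v2.

```shell
# Percentage of CFS scheduling periods in which the container was throttled.
# Reads the contents of a cgroup cpu.stat file on stdin.
throttle_pct() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", 100 * throttled / periods
      else             print "0.0"
    }'
}

# Against a live pod (not executed by this snippet; POD is a placeholder,
# and cgroup v1 nodes use /sys/fs/cgroup/cpu/cpu.stat instead):
#   kubectl exec -n lab-perf "$POD" -- cat /sys/fs/cgroup/cpu.stat | throttle_pct
```

A value that keeps climbing under load suggests the CPU limit is too tight for the workload.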
Hint 2: Memory headroom
Rule of thumb: set memory limits to 1.5-2x the average usage. Check current usage with `kubectl top pods` and compare it to the limits in the deployment spec.
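The headroom rule is simple arithmetic, sketched here as a helper. The function name `mem_limit_mi` and the 180 Mi usage figure are illustrative assumptions. Substitute the value that `kubectl top pods` reports for your pods.

```shell
# Suggested memory limit in Mi: average usage times a headroom factor
# (default 2), rounded up to a whole Mi.
mem_limit_mi() {
  local usage_mi="$1" factor="${2:-2}"
  awk -v u="$usage_mi" -v f="$factor" 'BEGIN {
    v = u * f
    r = int(v)
    if (r < v) r++   # round up any fractional Mi
    print r
  }'
}

# Applying the result (not executed by this snippet; 180 is a made-up usage figure):
#   kubectl set resources deployment app -n lab-perf \
#     --limits=memory="$(mem_limit_mi 180)Mi"
```

With 180 Mi of average usage, the default factor gives a 360 Mi limit and a factor of 1.5 gives 270 Mi.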
Hint 3: HPA configuration
kubectl autoscale deployment app -n lab-perf --min=2 --max=10 --cpu-percent=60
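To reason about what this autoscaler will do, it helps to know the scaling rule documented for the HPA: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the min/max bounds. CPU utilization is measured relative to the pods' CPU requests, so the requests must be set for the HPA to work. The helper below is a simplified sketch of that rule. The name `hpa_desired_replicas` is hypothetical, and it ignores the controller's tolerance and stabilization behavior.

```shell
# Simplified HPA scaling rule:
#   desired = ceil(current_replicas * current_cpu_pct / target_cpu_pct)
# clamped to [min, max]. Defaults mirror the lab's HPA: target 60, min 2, max 10.
hpa_desired_replicas() {
  local current="$1" cpu_pct="$2" target="${3:-60}" min="${4:-2}" max="${5:-10}"
  awk -v c="$current" -v u="$cpu_pct" -v t="$target" -v lo="$min" -v hi="$max" 'BEGIN {
    v = c * u / t
    d = int(v)
    if (d < v) d++        # ceiling
    if (d < lo) d = lo    # respect minReplicas
    if (d > hi) d = hi    # respect maxReplicas
    print d
  }'
}
```

For example, 2 replicas averaging 90% of their CPU request scale to 3, while 4 replicas at 200% would need 14 and are capped at 10.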
Hint 4: PodDisruptionBudget
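apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
  namespace: lab-perf   # a PDB only covers pods in its own namespace
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: webapp       # must match the labels on the deployment's pods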
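A PDB with `minAvailable` allows an eviction only while more pods are healthy than the minimum requires. The sketch below shows that arithmetic. The helper name `pdb_allowed_disruptions` is hypothetical. The live value appears in the ALLOWED DISRUPTIONS column of `kubectl get pdb app-pdb -n lab-perf`.

```shell
# Evictions a minAvailable PDB permits right now:
#   allowed = healthy_pods - min_available, never below zero.
pdb_allowed_disruptions() {
  local healthy="$1" min_available="${2:-1}"
  local allowed=$(( healthy - min_available ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}
```

With a single healthy replica and `minAvailable: 1`, the result is 0 and a node drain would block. This is one reason to pair the PDB with at least two replicas.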
Hint 5: Probe tuning
A liveness probe with `periodSeconds: 3` and `failureThreshold: 1` restarts the container at the slightest hiccup. Use `periodSeconds: 10` and `failureThreshold: 3` (the Kubernetes defaults) for production workloads.
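A rough way to compare probe settings is the time an unresponsive container gets before the kubelet restarts it, approximately `periodSeconds × failureThreshold`. The helper below is a sketch of that estimate. The name `probe_restart_window` is hypothetical, and the calculation ignores `timeoutSeconds` and `initialDelaySeconds`.

```shell
# Approximate seconds of consecutive probe failures tolerated before a restart.
probe_restart_window() {
  local period_seconds="$1" failure_threshold="$2"
  echo $(( period_seconds * failure_threshold ))
}
```

The aggressive settings tolerate about 3 seconds of unresponsiveness, while the recommended ones tolerate about 30.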

Grading

./grade.sh

Solution

See the solution/ directory for optimized deployment manifests.