Grading Rubric

| Criterion | Strong (3) | Adequate (2) | Weak (1) |
|---|---|---|---|
| Identified misleading symptom | Noticed inconsistent kubectl top output; checked Metrics Server logs for errors within 10 min | Investigated HPA config and scaling policies first, then noticed metric inconsistency | Spent extended time tuning HPA parameters or investigating the application |
| Found root cause in observability domain | Identified clock skew corrupting CPU utilization calculations in Metrics Server | Found the Metrics Server errors but not the clock skew root cause | Assumed the Metrics Server was buggy or needed a restart |
| Remediated in linux_ops domain | Fixed NTP on affected nodes; updated bootstrap script to prevent recurrence | Fixed NTP on the immediate nodes but did not update the AMI/bootstrap | Restarted the Metrics Server or adjusted HPA settings (workaround) |
| Cross-domain thinking | Explained the full chain: NTP failure -> clock drift -> metrics corruption -> HPA flapping | Acknowledged clock skew and HPA interaction but missed the ntpd/chronyd conflict | Treated it as a Kubernetes HPA tuning problem |
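To make the "clock skew corrupting CPU utilization" row concrete: CPU utilization is computed as a rate over cumulative usage counters, so the wall-clock interval between samples sits in the denominator. A minimal sketch of that arithmetic (function and variable names are ours, not the Metrics Server's internals) shows how a drifting clock inflates the reading:

```python
def cpu_utilization(usage_prev, usage_now, t_prev, t_now, cores):
    """Rate-based utilization: core-seconds consumed / wall-seconds elapsed / cores.

    Illustrative only -- the real Metrics Server pipeline differs in detail,
    but any rate-over-time metric has this shape.
    """
    elapsed = t_now - t_prev
    if elapsed <= 0:
        raise ValueError("non-monotonic timestamps: clock went backwards?")
    return (usage_now - usage_prev) / elapsed / cores

# Healthy node: 30 core-seconds consumed over a real 60 s on 1 core -> 50%.
healthy = cpu_utilization(0.0, 30.0, 1000.0, 1060.0, cores=1)  # -> 0.5

# Skewed node: identical real consumption, but drift makes it look like
# only 20 s elapsed -> the same workload reads as 150% utilization.
skewed = cpu_utilization(0.0, 30.0, 1000.0, 1020.0, cores=1)   # -> 1.5
```

A reading that swings between 50% and 150% with no change in actual load is exactly the inconsistent `kubectl top` output the rubric's first row rewards noticing.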

Prerequisite Topic Packs

  • k8s-ops (HPA) — needed for Domain A investigation (HPA behavior, scaling policies, stabilization windows)
  • monitoring-fundamentals — needed for Domain B root cause (rate-based metrics, Metrics Server, time-series accuracy)
  • linux-ops — needed for Domain C remediation (NTP, chronyd, systemd services)
  • cron-scheduling — useful for understanding periodic metric collection
  • k8s-debugging-playbook — useful for systematic k8s troubleshooting
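For the k8s-ops (HPA) prerequisite, the documented core of the HPA algorithm is `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`. Feeding it the skew-corrupted readings from above reproduces the flapping in the rubric's causal chain (the numbers here are illustrative, not from the scenario):

```python
import math

def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Core HPA scaling rule from the Kubernetes docs:
    desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric).
    (Stabilization windows and scaling policies, which damp this, are omitted.)"""
    return math.ceil(current_replicas * current_metric / target_metric)

TARGET = 0.50  # target 50% average CPU utilization

# True load of 50% on 4 replicas is steady-state: no scaling.
steady = hpa_desired_replicas(4, 0.50, TARGET)   # -> 4

# A skew-inflated 150% reading triples the replica count...
spike = hpa_desired_replicas(4, 1.50, TARGET)    # -> 12

# ...and the next accurate sample (load now spread over 12 replicas)
# scales most of it back down again: flapping.
settle = hpa_desired_replicas(12, 0.167, TARGET) # -> 5
```

This is why tuning stabilization windows (the Adequate/Weak column behavior) only masks the oscillation: the input metric, not the scaling policy, is what is broken.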