| Criterion | Strong response | Partial response | Weak response |
| --- | --- | --- | --- |
| Identified misleading symptom | Noticed inconsistent `kubectl top` output and checked the Metrics Server logs for errors within 10 minutes | Investigated the HPA config and scaling policies first, then noticed the metric inconsistency | Spent extended time tuning HPA parameters or investigating the application |
| Found root cause in observability domain | Identified clock skew corrupting CPU utilization calculations in the Metrics Server | Found the Metrics Server errors but not the clock skew root cause | Assumed the Metrics Server was buggy or needed a restart |
| Remediated in linux_ops domain | Fixed NTP on the affected nodes and updated the bootstrap script to prevent recurrence | Fixed NTP on the immediate nodes but did not update the AMI/bootstrap script | Restarted the Metrics Server or adjusted HPA settings (a workaround, not a fix) |
| Cross-domain thinking | Explained the full causal chain: NTP failure -> clock drift -> metrics corruption -> HPA flapping | Acknowledged the clock skew and HPA interaction but missed the ntpd/chronyd conflict | Treated it as a Kubernetes HPA tuning problem |
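The causal chain in the last criterion can be made concrete with a small sketch. The HPA formula below, `desired = ceil(current * currentMetric / targetMetric)`, is the documented Kubernetes HPA algorithm; the utilization calculation, the function names, and all the numbers are illustrative assumptions, not the Metrics Server's actual implementation. The point is that utilization is a rate (CPU-seconds used divided by wall-clock elapsed), so a skewed clock shrinks the apparent sampling window and inflates the reported value, which the HPA then acts on:

```python
import math

def cpu_utilization(usage_start_s, usage_end_s, t_start_s, t_end_s, requested_cores):
    """Average CPU utilization as a fraction of the pod's CPU request,
    computed from cumulative usage counters over a sampling window.
    (Illustrative; not the Metrics Server's actual code.)"""
    window = t_end_s - t_start_s
    cores_used = (usage_end_s - usage_start_s) / window
    return cores_used / requested_cores

def hpa_desired_replicas(current_replicas, current_util, target_util):
    """The Kubernetes HPA core formula:
    desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_util / target_util)

# Healthy node: a pod requesting 2 cores burns 60 CPU-seconds in a true
# 60 s window -> 0.5 cores/s per requested core = 50% utilization.
healthy = cpu_utilization(0.0, 60.0, 0.0, 60.0, requested_cores=2.0)

# Skewed node: the same workload, but clock drift makes the window appear
# to be 30 s, doubling the reported utilization to 100%.
skewed = cpu_utilization(0.0, 60.0, 0.0, 30.0, requested_cores=2.0)

# Against a 50% target, the healthy reading holds the replica count steady,
# while the skewed reading tells the HPA to double it. Samples alternating
# between healthy and skewed nodes produce exactly the flapping described.
print(hpa_desired_replicas(4, healthy, 0.5))  # 4
print(hpa_desired_replicas(4, skewed, 0.5))   # 8
```

This is why restarting the Metrics Server or tuning HPA parameters only masks the problem: the inputs to the formula are corrupted upstream, and only fixing time synchronization on the nodes restores correct utilization readings.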