Grading Rubric

| Criterion | Strong (3) | Adequate (2) | Weak (1) |
| --- | --- | --- | --- |
| Identified misleading symptom | Recognized low kubectl top CPU with high process CPU usage as throttling; checked cpu.stat | Noticed the CPU numbers did not add up but took time to find the throttle metrics | Investigated SMTP, database, or application code for the bottleneck |
| Found root cause in kubernetes domain | Identified CFS throttling from CPU limits on multi-threaded workload | Found the CPU limit was too low but not why it worked before the kernel upgrade | Assumed the workers needed more replicas or the queue had a consumer bug |
| Remediated in linux_ops domain | Adjusted CFS bandwidth slice sysctl on all nodes; updated CPU limits; applied via Ansible | Increased CPU limits but did not fix the kernel-level sysctl | Only increased replicas (scaling around the problem, not fixing it) |
| Cross-domain thinking | Explained the full chain: kernel upgrade -> CFS behavior change -> throttling -> worker slowdown -> queue backlog | Acknowledged throttling but missed the kernel upgrade connection | Treated it as an application performance or capacity planning issue |
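
As a rough illustration of the "checked cpu.stat" step in the first criterion, the sketch below computes how often a cgroup was throttled from the `nr_periods` and `nr_throttled` counters that the kernel exposes in `cpu.stat`. The helper names and the sample values are hypothetical, and the location of the file depends on the cgroup version and container runtime, so the example parses an inline sample instead of assuming a path.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_ratio(stats):
    """Fraction of CFS enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no periods recorded, so nothing to report
    return stats.get("nr_throttled", 0) / periods


# Hypothetical sample values, not data from the incident.
SAMPLE = """nr_periods 1000
nr_throttled 250
throttled_time 50000000000
"""

if __name__ == "__main__":
    stats = parse_cpu_stat(SAMPLE)
    print(f"throttled in {throttle_ratio(stats):.0%} of periods")
```

A ratio well above zero alongside low average CPU in kubectl top is the mismatch the Strong column describes: a multi-threaded worker can burn through its quota early in each period and then sit throttled for the remainder.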

Prerequisite Topic Packs