Grading Rubric

Criterion	Strong (3)	Adequate (2)	Weak (1)
Identified misleading symptom	Quickly found disk consumers with du; recognized Loki + verbose logging as the cause, not a generic disk issue	Found the large directories but took time to connect Loki retention to the problem	Focused on logrotate, tmp files, or core dumps; missed the Loki storage angle
Found root cause in observability domain	Identified both Loki retention disabled and event-processor DEBUG logging	Found one of the two issues but not both	Assumed it was purely a Linux disk management problem
Remediated in devops_tooling domain	Updated Helm values for both Loki retention and log level; cleaned up stale data	Fixed one component via Helm but cleaned up the other manually	Manually deleted files without fixing the underlying configuration
Cross-domain thinking	Explained how observability infrastructure competes for node resources and how Helm config drives Loki behavior	Acknowledged multiple systems were involved	Treated it as a single-domain disk space issue
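
As an illustration of the first criterion, the kind of per-directory ranking that du provides can be approximated in Python. This is a minimal sketch under stated assumptions, not part of the scenario itself: the function names `dir_sizes` and `top_consumers` are hypothetical, and the code sums apparent file sizes rather than allocated blocks, so its totals may differ somewhat from what du reports.

```python
import os
import sys


def dir_sizes(root):
    """Return {subdirectory name: total bytes} for each immediate subdirectory of root.

    Sums apparent file sizes, so results may differ from du's block-based totals.
    """
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    # lstat avoids following symlinks out of the tree
                    total += os.lstat(path).st_size
                except OSError:
                    # Files can vanish mid-scan on a busy node; skip them
                    pass
        sizes[entry.name] = total
    return sizes


def top_consumers(root, n=5):
    """Return the n largest immediate subdirectories as (name, bytes), largest first."""
    ranked = sorted(dir_sizes(root).items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # The directory to scan is supplied by the caller; no path is assumed here.
    for name, size in top_consumers(sys.argv[1]):
        print(f"{size:>15,d}  {name}")
```

Ranking the directories is only the first step. The Strong column also expects the responder to recognize that the largest consumers belong to Loki and to verbose application logging, rather than treating the result as a generic disk cleanup task.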

Prerequisite Topic Packs

disk-and-storage-ops — needed for Domain A investigation (df, du, disk usage analysis)
linux-logging — needed for Domain A investigation (container logs, log rotation)
log-pipelines — needed for Domain B root cause (Loki architecture, retention, ingester storage)
helm — needed for Domain C remediation (Helm values, upgrades)
k8s-node-lifecycle — needed for understanding DiskPressure taint and pod eviction