Grading Rubric

| Criterion | Strong (3) | Adequate (2) | Weak (1) |
| --- | --- | --- | --- |
| Identified misleading symptom | Checked disk I/O after ruling out database-level blockers; recognized I/O saturation within 10 min | Investigated PostgreSQL settings and queries first, then checked I/O | Spent extended time on recovery parameters, vacuum, or query tuning |
| Found root cause in linux_ops domain | Identified RAID degradation via `/proc/mdstat` and SMART failures | Found the disk I/O issue but not the RAID degradation | Assumed the disk was just slow (aging hardware) without checking RAID |
| Remediated in datacenter domain | Replaced the failed disk, rebuilt RAID, verified SMART on new disk | Identified the need for disk replacement but did not guide the rebuild | Tried to tune PostgreSQL or increase replica resources instead |
| Cross-domain thinking | Explained the full chain: disk failure -> RAID degradation -> I/O bottleneck -> WAL replay stall -> replication lag | Acknowledged the hardware/database connection but missed the RAID detail | Treated it as a single-domain database or Kubernetes issue |
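
The "Strong" root-cause answer depends on reading `/proc/mdstat` correctly. As a rough illustration, the sketch below flags software-RAID arrays that are running with a missing member. It assumes the usual Linux md status format, where a line such as `[2/1] [_U]` means two members are expected, one is active, and `_` marks the missing slot. The function name `degraded_arrays` and the sample text are hypothetical, not part of the scenario, and SMART checks are out of scope here.

```python
import re

# Hypothetical sample of /proc/mdstat for a two-disk RAID1 with one failed member.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0](F)
      976630464 blocks super 1.2 [2/1] [_U]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status line shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line like "md0 : active raid1 ...".
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[expected/active] [U_...]".
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    # On a real host, read the live file instead of the sample.
    with open("/proc/mdstat") as handle:
        print(degraded_arrays(handle.read()))
```

A non-empty result only shows that an array is degraded; confirming which physical disk failed still requires the SMART check and the datacenter-side replacement described in the rubric.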

Prerequisite Topic Packs

database-ops — needed for Domain A investigation (PostgreSQL replication, WAL, pg_stat_replication)
k8s-storage — needed for Domain A (PVC, local storage, node storage)
disk-and-storage-ops — needed for Domain B root cause (iostat, RAID, SMART)
disk-and-storage-ops — needed for Domain C remediation (RAID rebuild, disk replacement)
linux-performance — needed for Domain B (I/O analysis, disk utilization)