Grading Checklist
- Identified the overloaded phase and quantified the imbalance across all three phases
- Reviewed per-outlet power readings to identify the top power consumers on L2
- Connected the new GPU server installation to the overload event (timing correlation)
- Proposed an immediate load-shedding or redistribution plan to bring L2 below 80% of its rated capacity
- Considered the redundant PDU (PDU-B) loading to avoid creating a new problem
- Addressed the risk of a cascading failure if PDU-A trips (all load shifts to PDU-B)
- Suggested rebalancing server power connections across phases
- Recommended updating capacity planning documentation to reflect actual power draw
- Mentioned checking actual vs. nameplate power draw for the new GPU servers
- Proposed a longer-term fix: relocating high-draw servers or adding circuit capacity
- Considered power capping (BIOS/BMC power limits) as a temporary mitigation
- Noted that the branch circuit breaker being warm indicates sustained high current
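
Several of the items above come down to simple arithmetic on per-phase current readings: quantifying the imbalance, checking the 80% threshold, and testing what happens to PDU-B if PDU-A trips. The following is a minimal sketch of those checks, not a reference implementation. The 30 A breaker rating, the sample readings, and all function names are hypothetical and chosen for illustration only. The failover check also assumes every server is dual-corded and that its load shifts to the same phase on the surviving PDU.

```python
# Hypothetical values for illustration; substitute real PDU readings and ratings.
BREAKER_RATING_A = 30.0   # assumed per-phase breaker rating, in amps
LOAD_LIMIT = 0.80         # the 80% threshold referred to in the checklist


def phase_utilization(phase_amps, rating=BREAKER_RATING_A):
    """Return each phase's current as a fraction of the breaker rating."""
    return {phase: amps / rating for phase, amps in phase_amps.items()}


def phase_imbalance(phase_amps):
    """Largest deviation from the mean phase current, as a fraction of the mean."""
    mean = sum(phase_amps.values()) / len(phase_amps)
    return max(abs(amps - mean) for amps in phase_amps.values()) / mean


def overloaded_phases(phase_amps, rating=BREAKER_RATING_A, limit=LOAD_LIMIT):
    """Return the sorted names of phases whose utilization exceeds the limit."""
    utilization = phase_utilization(phase_amps, rating)
    return sorted(phase for phase, used in utilization.items() if used > limit)


def failover_load(tripped_pdu, surviving_pdu):
    """Per-phase current on the surviving PDU if the other one trips.

    Assumes the full load of the tripped PDU moves phase-for-phase.
    """
    return {phase: tripped_pdu[phase] + surviving_pdu[phase] for phase in tripped_pdu}


if __name__ == "__main__":
    # Made-up sample readings in amps, not taken from the scenario.
    pdu_a = {"L1": 14.0, "L2": 27.0, "L3": 13.0}
    pdu_b = {"L1": 12.0, "L2": 15.0, "L3": 10.0}

    print("PDU-A utilization:", phase_utilization(pdu_a))
    print("PDU-A imbalance:", phase_imbalance(pdu_a))
    print("PDU-A phases over limit:", overloaded_phases(pdu_a))

    after_trip = failover_load(pdu_a, pdu_b)
    print("PDU-B load if PDU-A trips:", after_trip)
    print("PDU-B phases over limit after failover:", overloaded_phases(after_trip))
```

Running the failover check before and after any proposed redistribution helps confirm that moving load off L2 on PDU-A does not leave PDU-B unable to carry the combined load.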