Skip to content

Symptoms: Power Supply Redundancy Lost

  • Server k8s-worker-17 (HPE ProLiant DL380 Gen10) triggered a BMC alert: "Power Supply Redundancy Lost."
  • The server is still running on a single PSU (PSU 1). PSU 2 shows "Failed" status.
  • The server is part of a Kubernetes cluster and runs production workloads.
  • No service impact yet, but the server has lost its power redundancy safety margin.
  • Each PSU is rated at 800W; the server's current draw is approximately 520W.
  • PSU 1 is on Power Distribution Unit (PDU) A; PSU 2 was on PDU B (dual-feed for redundancy).
  • The datacenter facility team confirmed PDU B is healthy -- other servers on PDU B are fine.
  • The server has been in production for 18 months.
  • The amber PSU LED on the rear of the chassis is lit for PSU 2.