Symptoms: Thermal Throttle - Fan Failure

  • Server compute-batch-09 (Dell PowerEdge R740) is showing degraded performance for batch processing jobs.
  • Jobs that normally complete in 2 hours are now taking 4-5 hours.
  • CPU utilization appears normal (80-90%), but throughput has dropped to roughly half of baseline.
  • A BMC alert, "Fan 3 RPM below minimum threshold", was logged 2 days ago but not actioned.
  • A second alert appeared today: "CPU 1 temperature above warning threshold (85C)."
  • The server is in a well-cooled datacenter (ambient 21 °C at the cold aisle).
  • No workload changes or software updates have been deployed.
  • The server has been running continuously for 14 months since last hardware maintenance.
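
The job-duration symptom can be turned into a rough slowdown figure. The sketch below is a minimal illustration, assuming the job size is unchanged so that throughput is inversely proportional to run time. The function name `slowdown` and the constant `BASELINE_HOURS` are hypothetical and chosen for this example; the only inputs are the 2-hour baseline and the 4-5 hour observed range reported above.

```python
def slowdown(baseline_hours: float, observed_hours: float) -> tuple[float, float]:
    """Return (slowdown factor, relative throughput) for a fixed-size job.

    Assumption: the amount of work per job has not changed, so throughput
    scales as 1 / run time.
    """
    if baseline_hours <= 0 or observed_hours <= 0:
        raise ValueError("durations must be positive")
    factor = observed_hours / baseline_hours
    return factor, 1.0 / factor


BASELINE_HOURS = 2.0  # normal completion time from the symptom list

# Evaluate both ends of the observed 4-5 hour range.
for observed in (4.0, 5.0):
    factor, throughput = slowdown(BASELINE_HOURS, observed)
    print(f"{observed:.0f} h: {factor:.1f}x slower, "
          f"{throughput:.0%} of baseline throughput")
```

Under that assumption the server is delivering about 40-50% of its usual throughput, which matches the reported halving even though CPU utilization still reads 80-90%.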