Solution: BMC Clock Skew - Certificate Failure
Triage
- Check the BMC's current date and time.
- Check the certificate's validity dates.
- Compare the two: if the BMC clock is set to a date before the certificate's "Not Before" date, TLS clients will reject it as "not yet valid."
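The comparison in the last step can be sketched in Python. This is a minimal illustration, not a definitive tool: it assumes the validity dates were captured in the two-line form printed by `openssl x509 -noout -dates`, and the sample values mirror the dates given under Root Cause, with the time of day assumed to be midnight UTC. The function names are hypothetical.

```python
import ssl
from datetime import datetime, timezone


def parse_openssl_dates(text):
    """Parse the notBefore=/notAfter= lines printed by `openssl x509 -noout -dates`."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        # ssl.cert_time_to_seconds accepts the "Jun 15 00:00:00 2024 GMT" form.
        seconds = ssl.cert_time_to_seconds(value.strip())
        fields[key.strip()] = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return fields["notBefore"], fields["notAfter"]


def classify(clock, not_before, not_after):
    """Describe how the certificate looks from the point of view of `clock`."""
    if clock < not_before:
        return "not yet valid"
    if clock > not_after:
        return "expired"
    return "valid"


# Sample input: dates from this incident, time of day assumed.
sample = "notBefore=Jun 15 00:00:00 2024 GMT\nnotAfter=Jun 15 00:00:00 2029 GMT\n"
not_before, not_after = parse_openssl_dates(sample)

# A BMC clock that fell back to a factory default date.
bmc_clock = datetime(2000, 1, 1, tzinfo=timezone.utc)
print(classify(bmc_clock, not_before, not_after))
```

Running the same check with the correct current time distinguishes a pure clock problem from a certificate that has genuinely expired.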
Root Cause
The server was powered off for 3 weeks during the datacenter power upgrade. The BMC's real-time clock (RTC) is backed by a small battery (CR2032 or similar). After extended power loss, the BMC clock either:
- drifted significantly due to a weak battery, or
- reset to a factory default date (e.g., January 1, 2000).
When the server was powered back on, the BMC booted with the wrong date. The self-signed SSL certificate (valid from 2024-06-15 to 2029-06-15) appears "not yet valid" because the BMC thinks it is January 2000. The host OS corrected its clock via NTP at boot, but the BMC has its own independent clock that was not corrected.
Fix
- Set the BMC clock to the correct time.
- Configure NTP on the BMC so the clock stays synchronized.
- Verify the fix.
- If the certificate has genuinely expired (not just a clock issue), regenerate it.
- Re-enable monitoring:
  - Verify that Redfish API calls work from the monitoring server.
  - Check that the gap in historical data is documented.
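For the first two steps, the request bodies might look like the sketch below. It assumes the BMC implements the standard DMTF Redfish properties (`DateTime` and `DateTimeLocalOffset` on the Manager resource, and `NTP` on the ManagerNetworkProtocol resource). iLO firmware may expose time settings through a vendor-specific resource instead, so treat the property names as a starting point and check the BMC's API reference. The NTP host names are placeholders, and the functions only build the JSON bodies; they do not contact the BMC.

```python
import json
from datetime import datetime, timezone


def clock_payload(now):
    """Body for a PATCH to the Manager resource, expressed in UTC."""
    utc = now.astimezone(timezone.utc)
    return {
        "DateTime": utc.isoformat(timespec="seconds"),
        "DateTimeLocalOffset": "+00:00",
    }


def ntp_payload(servers):
    """Body for a PATCH to the ManagerNetworkProtocol resource."""
    return {"NTP": {"ProtocolEnabled": True, "NTPServers": list(servers)}}


if __name__ == "__main__":
    # Placeholder NTP servers; substitute the site's own.
    print(json.dumps(clock_payload(datetime.now(timezone.utc)), indent=2))
    print(json.dumps(ntp_payload(["ntp1.example.com", "ntp2.example.com"]), indent=2))
```

Sending the clock body before the NTP body matches the order of the steps above: the clock is corrected first, then NTP keeps it corrected.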
Rollback / Safety
- Setting the BMC clock and configuring NTP are non-disruptive operations. They do not affect the host OS.
- If you regenerate the SSL certificate, the iLO will reset (1-2 minute interruption to management access only; host OS unaffected).
- Any scripts or tools that pin the old certificate thumbprint will need updating after certificate regeneration.
Common Traps
- Trap: Regenerating the certificate without fixing the clock first. If the BMC clock is still wrong, the new certificate will also be generated with wrong dates.
- Trap: Using curl -k (skip validation) as a permanent workaround. This disables certificate validation and masks the real issue.
- Trap: Not configuring NTP on the BMC. Without NTP, the BMC clock will drift again over time, especially if the RTC battery is weak.
- Trap: Forgetting to check other servers that were offline during the same maintenance. If one BMC has clock issues, others likely do too.
- Trap: Not testing that monitoring scripts actually resume after the fix. The scripts may have error-handling that requires a manual restart.
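To avoid the trap of missing other affected servers, the clock readings from every BMC that was offline can be compared against a trusted reference. The sketch below assumes the readings have already been collected as ISO 8601 strings (the form used by the Redfish `DateTime` property) and that a reading without an offset is in UTC. The host names, the 5-minute tolerance, and the function names are illustrative, not taken from the incident.

```python
from datetime import datetime, timedelta, timezone


def parse_bmc_time(value):
    """Parse an ISO 8601 timestamp reported by a BMC."""
    # Older Python versions reject a trailing "Z" in fromisoformat.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a reading without an offset is UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def find_skewed(reported, reference, tolerance=timedelta(minutes=5)):
    """Return {host: skew} for BMCs whose clock is off by more than `tolerance`."""
    skewed = {}
    for host, value in reported.items():
        skew = parse_bmc_time(value) - reference
        if abs(skew) > tolerance:
            skewed[host] = skew
    return skewed


if __name__ == "__main__":
    reference = datetime.now(timezone.utc)
    readings = {
        "bmc-a.example.com": "2000-01-01T00:04:10Z",  # reset to a factory default
        "bmc-b.example.com": reference.isoformat(),   # in sync
    }
    for host, skew in find_skewed(readings, reference).items():
        print(f"{host}: clock is off by {skew}")
```

A negative skew means the BMC clock is behind the reference, which is the case that makes a certificate look "not yet valid."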