Pattern: Clock Skew Ordering
ID: FP-017 | Family: Split Brain | Frequency: Common | Blast Radius: Multi-Service | Detection Difficulty: Actively Misleading
The Shape
Distributed systems that order events by wall-clock timestamps taken on different servers assume that all clocks agree. They don't. NTP drift, leap seconds, manual clock adjustments, or VM pauses can put one node's clock seconds or minutes ahead of another's. A timestamp from a fast clock then looks later than a timestamp from a correct clock even when its event happened first, so any sequence reconstructed from timestamps is wrong. Results: wrong event ordering, incorrect "last write wins" resolution, and valid TLS certificates rejected as expired or not yet valid.
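A minimal sketch of the "last write wins" failure, assuming two hypothetical nodes whose wall clocks differ by a fixed offset. The node names, the offsets, and the `stamp` helper are illustrative only; `true_time` stands in for real-world time, which no node can actually observe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    node: str
    value: str
    true_time: float  # real-world time of the write (unknowable in practice)
    timestamp: float  # what the node's wall clock reported


def stamp(node, value, true_time, clock_offset):
    """Record a write using a wall clock that is clock_offset seconds off."""
    return Write(node, value, true_time, true_time + clock_offset)


def last_write_wins(writes):
    """Resolve a conflict the naive way: highest wall-clock timestamp wins."""
    return max(writes, key=lambda w: w.timestamp)


# Node A's clock runs 90 s fast; node B's clock is correct.
first = stamp("A", "old", true_time=100.0, clock_offset=90.0)
second = stamp("B", "new", true_time=130.0, clock_offset=0.0)

winner = last_write_wins([first, second])
actual_latest = max([first, second], key=lambda w: w.true_time)

print(winner.value)         # old: chosen by timestamp comparison
print(actual_latest.value)  # new: the value that was really written last
```

The write from node A happened 30 seconds earlier in real time, but its timestamp is 60 seconds later, so the stale value survives.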
How You'll See It
In Kubernetes
The HPA (Horizontal Pod Autoscaler) makes scaling decisions from timestamped metric
samples. If one node has a skewed clock, the HPA may see samples in the wrong order and
make the wrong scaling decision. More commonly: certificate validity checks fail because
the node's clock is behind the certificate's notBefore time or ahead of its notAfter time.
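The validity check can be sketched as a plain window comparison. This is an illustration under assumed values, not how any particular TLS library implements it: the certificate dates, the skew values, and the `validity_status` helper are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def validity_status(now, not_before, not_after):
    """Classify a certificate's validity window against a given clock reading."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical certificate issued one minute ago, valid for 90 days.
true_now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
not_before = true_now - timedelta(minutes=1)
not_after = not_before + timedelta(days=90)

# Evaluate the same certificate from nodes with different clock errors.
for skew in (timedelta(0), timedelta(minutes=-5), timedelta(days=120)):
    node_now = true_now + skew
    print(skew, validity_status(node_now, not_before, not_after))
```

A node whose clock is only five minutes slow rejects a freshly issued certificate, which is why this failure often shows up right after certificate rotation.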
In Linux/Infrastructure
Two application servers write to a shared log aggregation system with timestamps. Server A's clock is 90 seconds ahead. Log entries from Server A appear to come after Server B's entries even when they preceded them. Post-incident timeline reconstruction is wrong; the "first" event (Server A) appears to come after the "second" (Server B).
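If the offset of the skewed host is known (for example, measured after the incident), the timeline can be rebuilt by subtracting it before sorting. The sketch below assumes the offset was constant for the whole window, which may not hold if the clock was drifting or was stepped mid-incident; the host names, messages, and the `merge_timeline` helper are hypothetical.

```python
def merge_timeline(entries, clock_offsets):
    """Order log entries by timestamp after removing each host's known offset.

    entries: iterable of (host, timestamp_seconds, message)
    clock_offsets: host -> seconds the host's clock was ahead (+) or behind (-)
    """
    corrected = [
        (ts - clock_offsets.get(host, 0.0), host, msg)
        for host, ts, msg in entries
    ]
    return sorted(corrected)


entries = [
    ("server-b", 1000.0, "upstream timeout"),
    # Logged at 1075 by a clock that is 90 s fast, so it really happened at 985.
    ("server-a", 1075.0, "connection pool exhausted"),
]

naive = sorted((ts, host, msg) for host, ts, msg in entries)
fixed = merge_timeline(entries, {"server-a": 90.0})

print([host for _, host, _ in naive])  # ['server-b', 'server-a']
print([host for _, host, _ in fixed])  # ['server-a', 'server-b']
```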
In Networking
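BGP hold timers are measured locally on each peer, so a constant offset between two peers does not affect them by itself. The risk is a clock step: if a BGP implementation derives the hold timer from system wall-clock time rather than a monotonic clock, a sudden forward jump can make the timer appear to expire early, causing spurious BGP session drops. TLS certificate validation ("is the current time within the cert's validity window?") fails on hosts with significant clock skew.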
In Datacenter
BMC (IPMI) logs have wrong timestamps after a CMOS battery failure. Reconstructing the timeline of a hardware failure from IPMI SEL logs gives incorrect event ordering.
The Tell
Events from one system appear in the wrong chronological order relative to another.
`timedatectl status` or `chronyc tracking` shows the clock is not synchronized. Certificate errors on a host with a clock far in the future or far in the past. `ntpstat` returns "unsynchronised."
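The check can be automated by parsing the text that `chronyc tracking` prints. The sketch below assumes the field layout shown in chrony's documentation (a `System time` line ending in "fast of NTP time" or "slow of NTP time", and a `Leap status` line); the exact layout may differ between versions, so treat the regular expressions and the `parse_tracking` helper as assumptions to verify against your own output. The sample text is made up.

```python
import re

_SYSTEM_TIME = re.compile(
    r"^System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time",
    re.MULTILINE,
)
_LEAP_STATUS = re.compile(r"^Leap status\s*:\s*(.+)$", re.MULTILINE)


def parse_tracking(text):
    """Return (offset_seconds, synchronized) from `chronyc tracking` text.

    offset_seconds is positive when the system clock is ahead of NTP time.
    """
    match = _SYSTEM_TIME.search(text)
    if match is None:
        raise ValueError("no 'System time' line found")
    offset = float(match.group(1))
    if match.group(2) == "slow":
        offset = -offset
    leap = _LEAP_STATUS.search(text)
    synchronized = leap is not None and leap.group(1).strip() != "Not synchronised"
    return offset, synchronized


# Made-up sample in the assumed format: a clock running 47 s fast.
sample = """\
Reference ID    : CB00710F (foo.example.net)
Stratum         : 3
System time     : 47.000000000 seconds fast of NTP time
Leap status     : Normal
"""

offset, synchronized = parse_tracking(sample)
print(offset, synchronized, abs(offset) > 0.1)
```

In practice the input would come from running the command on each node, for example via `subprocess.run(["chronyc", "tracking"], capture_output=True, text=True).stdout`.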
Common Misdiagnosis
| Looks Like | But Actually | How to Tell the Difference |
|---|---|---|
| Application race condition | Clock skew | Timestamps from the two systems disagree on sequence (an effect is logged before its cause); the clock on one system is measurably wrong |
| Intermittent TLS failure | Clock outside cert validity window | Run `openssl x509 -noout -dates -in cert.pem` and compare to the host's current time |
| Spurious BGP drops | Clock skew affecting hold-timer | `ntpstat` shows unsynchronized; no packet loss on the link |
The Fix (Generic)
- Immediate: Sync the clock: `chronyc makestep` (step immediately to correct time) or `ntpdate -u pool.ntp.org`.
- Short-term: Ensure NTP/chrony is running and has at least 3 time sources; check that the firewall allows UDP 123.
- Long-term: Use logical clocks (Lamport timestamps, vector clocks) for event ordering instead of wall-clock time; monitor clock skew with `node_timex_offset_seconds` in Prometheus and alert at >100ms.
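For the long-term fix, a Lamport clock can be sketched in a few lines. This is a minimal illustration of the technique rather than a production implementation; the `LamportClock` class and the two-node exchange are hypothetical.

```python
class LamportClock:
    """Logical clock: orders events by causality, not wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance for a local event, including sending a message."""
        self.time += 1
        return self.time

    def receive(self, sender_time):
        """On receipt, jump past both the local and the sender's counter."""
        self.time = max(self.time, sender_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()

t_send = a.tick()             # A sends a request, stamped with its counter
t_recv = b.receive(t_send)    # B receives it
t_reply = b.tick()            # B sends a reply
t_done = a.receive(t_reply)   # A receives the reply

print(t_send, t_recv, t_reply, t_done)  # 1 2 3 4
```

Every effect gets a larger counter than its cause regardless of what either node's wall clock says. Lamport timestamps do not order concurrent events meaningfully, so ties are usually broken by a node ID; vector clocks are needed when concurrency itself must be detected.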
Real-World Examples
- Example 1: The CMOS battery on a physical server failed overnight. After reboot, the system clock was set to 2000-01-01. Every JWT validated by expiry time was accepted, including tokens that had really expired, because each expiry (in 2025) was still far in the "future" from the host's point of view. Security bypass.
- Example 2: VM paused for live migration. On resume, clock was 90 seconds behind real time. During those 90 seconds, the VM's Kerberos tickets expired (clock check). Services using Kerberos auth returned 401 until NTP resynced.
War Story
HPA was flapping, scaling up and down every minute. We stared at CPU metrics for an hour. Finally checked `timedatectl` on each node: one node was 47 seconds ahead. The HPA was receiving CPU samples with timestamps that appeared out of order; its moving average calculation was using the wrong window. `chronyc makestep` on the drifted node fixed the HPA flapping immediately. We added a Prometheus alert: `abs(node_timex_offset_seconds) > 0.1` pages the on-call.
Cross-References
- Topic Packs: distributed-systems, datacenter
- Case Studies: cross-domain/hpa-flapping-clock-skew-ntp/, datacenter_ops/bmc-clock-skew-cert-failure/
- Related Patterns: FP-015 (stale leader — clock skew can prevent correct leader detection), FP-016 (dual-write divergence — wrong clock causes wrong "last write wins" resolution)