Anti-Primer: DNSSEC
Everything that can go wrong, will — and in this story, it does.
The Setup
A security team mandates DNSSEC for all external domains. The DNS admin has never configured DNSSEC before and follows a blog post from 2019. The signing must be complete before the audit next week.
The Timeline
Hour 0: Key Rollover Not Planned
The admin signs the zone with a KSK but has no plan for key rollover or re-signing. The deadline was looming, and this seemed like the fastest path forward. The result: once the signatures made with that key expire, nothing is in place to replace them, and the entire zone becomes unresolvable by DNSSEC-validating resolvers.
Footgun #1: Key Rollover Not Planned. The zone is signed with a KSK but there is no rollover plan, so when the key's signatures expire the entire zone becomes unresolvable by DNSSEC-validating resolvers.
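One way to keep signature expiry from going unnoticed is to check how much validity remains on the zone's RRSIGs. The sketch below is a minimal illustration, not the primer's tooling: it parses the `YYYYMMDDHHMMSS` UTC timestamps that RRSIG records use in presentation format and flags a signature that is within a chosen margin of expiring. The function names and the 7-day default margin are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# RRSIG inception/expiration presentation format (UTC), e.g. 20240105000000
RRSIG_TIME_FORMAT = "%Y%m%d%H%M%S"


def parse_rrsig_time(value: str) -> datetime:
    """Parse an RRSIG timestamp string into a timezone-aware UTC datetime."""
    return datetime.strptime(value, RRSIG_TIME_FORMAT).replace(tzinfo=timezone.utc)


def time_until_expiry(expiration: str, now: datetime) -> timedelta:
    """Return how much validity is left (negative if already expired)."""
    return parse_rrsig_time(expiration) - now


def needs_resign(expiration: str, now: datetime,
                 margin: timedelta = timedelta(days=7)) -> bool:
    """True when the signature expires within `margin` (hypothetical default)."""
    return time_until_expiry(expiration, now) <= margin
```

Run on a schedule against the expiration field of the zone's RRSIGs, a check like this turns a silent expiry into an alert with days of lead time.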
Nobody notices yet. The engineer moves on to the next task.
Hour 1: DS Record Not Published
The admin signs the zone but forgets to publish the DS record at the parent zone (through the registrar). Under time pressure, the team chose speed over caution. The result: the chain of trust is broken. With no DS at the parent, validating resolvers cannot authenticate the zone and treat it as unsigned, so the signing work provides no protection.
Footgun #2: DS Record Not Published. The zone is signed but the DS record is never published at the parent (through the registrar), so the chain of trust is broken and validating resolvers cannot authenticate the zone.
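The DS record that has to reach the parent is derived from the zone's KSK: a key tag, the algorithm number, a digest type, and a digest over the owner name plus the DNSKEY RDATA. The following sketch computes those fields for digest type 2 (SHA-256), following the layout in RFC 4034. It is an illustration under assumptions: the function names are hypothetical, and in practice a tool such as `dnssec-dsfromkey` would normally generate the record.

```python
import base64
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Encode a domain name in canonical (lowercase) DNS wire format."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    encoded = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return encoded + b"\x00"  # root label terminates the name


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key_b64: str) -> bytes:
    """Build DNSKEY RDATA: flags (2 bytes), protocol, algorithm, public key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + base64.b64decode(public_key_b64)


def key_tag(rdata: bytes) -> int:
    """Key tag per RFC 4034 Appendix B (not valid for the obsolete algorithm 1)."""
    acc = 0
    for index, byte in enumerate(rdata):
        acc += byte if index & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_record(owner: str, flags: int, protocol: int, algorithm: int,
              public_key_b64: str) -> str:
    """Return 'keytag algorithm digesttype digest' using SHA-256 (digest type 2)."""
    rdata = dnskey_rdata(flags, protocol, algorithm, public_key_b64)
    digest = hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()
    return f"{key_tag(rdata)} {algorithm} 2 {digest}"
```

Comparing this output with what the parent actually serves for the zone is a quick way to confirm that the DS was published and matches the live key.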
The first mistake is still invisible, making the next shortcut feel justified.
Hour 2: Clock Skew Invalidates Signatures
The signing server has NTP disabled, and its clock drifts by 10 minutes. Nobody pushed back because the shortcut looked harmless in the moment. The result: RRSIG validity windows are offset from real time, so resolvers reject signatures as expired or not-yet-valid.
Footgun #3: Clock Skew Invalidates Signatures. The signing server has NTP disabled and its clock drifts by 10 minutes, so RRSIG validity windows are offset from real time and resolvers reject signatures as expired or not-yet-valid.
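The effect of a skewed signer clock can be shown with a small model of the validity check a resolver performs: a signature is accepted only when the resolver's current time falls between inception and expiration. The sketch below is a simplified illustration; the function names, the 14-day lifetime, and the one-hour backdating default are assumptions chosen for the example, not values from the primer.

```python
from datetime import datetime, timedelta, timezone


def sign_window(signer_now: datetime,
                backdate: timedelta = timedelta(hours=1),
                lifetime: timedelta = timedelta(days=14)):
    """Return (inception, expiration) as the signer would stamp them.

    `signer_now` is the signer's own (possibly wrong) clock reading.
    """
    return signer_now - backdate, signer_now + lifetime


def signature_state(inception: datetime, expiration: datetime,
                    resolver_now: datetime) -> str:
    """Classify a signature the way a validating resolver would."""
    if resolver_now < inception:
        return "not-yet-valid"
    if resolver_now > expiration:
        return "expired"
    return "valid"


# A signer running 10 minutes fast, with no backdating of the inception time
true_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fast_clock = true_time + timedelta(minutes=10)
inception, expiration = sign_window(fast_clock, backdate=timedelta(0))
state_without_backdate = signature_state(inception, expiration, true_time)
```

With no backdating, the fresh signatures are not yet valid from the resolver's point of view. Backdating the inception time absorbs small drift, but it is a safety margin and not a substitute for running NTP.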
Pressure is mounting. The team is behind schedule and cutting more corners.
Hour 3: Algorithm Mismatch
The admin uses a signing algorithm that the registrar's DS record submission form does not support. The team had gotten away with similar shortcuts before, so nobody raised a flag. The result: the DS record cannot be published, leaving DNSSEC partially configured, which is worse than unsigned because it carries the operational risk of signing with none of the protection.
Footgun #4: Algorithm Mismatch. The zone uses an algorithm not supported by the registrar's DS record submission form, so the DS record cannot be published and DNSSEC is left partially configured, which is worse than unsigned.
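A pre-flight check before key generation can catch this mismatch. The sketch below compares the algorithm numbers the team plans to use with the set a registrar accepts. The algorithm numbers and names come from the IANA DNSSEC algorithm registry; the registrar's supported set is a hypothetical input that would have to be taken from the registrar's own documentation, and the function names are invented for this example.

```python
# Subset of the IANA DNSSEC algorithm number registry
ALGORITHM_NAMES = {
    8: "RSASHA256",
    10: "RSASHA512",
    13: "ECDSAP256SHA256",
    14: "ECDSAP384SHA384",
    15: "ED25519",
    16: "ED448",
}


def describe(algorithm: int) -> str:
    """Human-readable label for an algorithm number."""
    return f"{algorithm} ({ALGORITHM_NAMES.get(algorithm, 'unknown')})"


def unsupported_algorithms(planned, registrar_supported):
    """Return the planned algorithm numbers the registrar does not accept."""
    return sorted(set(planned) - set(registrar_supported))


# Hypothetical registrar that accepts only algorithms 8 and 13
blocked = unsupported_algorithms(planned=[15, 13], registrar_supported={8, 13})
report = [describe(algorithm) for algorithm in blocked]
```

An empty result means every planned algorithm can be submitted as a DS record; anything else should stop key generation until the algorithm choice changes.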
By hour 3, the compounding failures have reached critical mass. Pages fire. The war room fills up. The team scrambles to understand what went wrong while the system burns.
The Postmortem
Root Cause Chain
| # | Mistake | Consequence | Could Have Been Prevented By |
|---|---|---|---|
| 1 | Key Rollover Not Planned | Signatures expire with no rollover in place; the entire zone becomes unresolvable by DNSSEC-validating resolvers | Primer: Plan a key rollover schedule and automate it with tools like OpenDNSSEC |
| 2 | DS Record Not Published | Chain of trust is broken; validating resolvers cannot authenticate the zone and treat it as unsigned | Primer: Publish the DS record at the parent zone and verify the chain with `dig +dnssec` |
| 3 | Clock Skew Invalidates Signatures | RRSIG validity windows are offset from real time; resolvers reject signatures as expired or not-yet-valid | Primer: Run NTP on all DNS servers; monitor clock drift as a critical metric |
| 4 | Algorithm Mismatch | Cannot publish the DS record; DNSSEC is partially configured and worse than unsigned | Primer: Verify algorithm support with the registrar before generating keys |
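
The `dig +dnssec` check from row 2 can be scripted by inspecting the header flags in dig's output: a validating resolver sets the `ad` (Authenticated Data) flag when it has validated the answer. The sketch below only parses text that has already been captured. The function names are hypothetical, and the sample is written in dig's header format for illustration, not taken from a real query.

```python
import re

# Illustrative header lines in dig's output format (not from a real query)
SAMPLE_HEADER = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345\n"
    ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1\n"
)


def header_flags(dig_output: str) -> set:
    """Extract the flag names from the ';; flags:' header line."""
    match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
    return set(match.group(1).split()) if match else set()


def is_authenticated(dig_output: str) -> bool:
    """True when the resolver reported the answer as DNSSEC-validated."""
    return "ad" in header_flags(dig_output)
```

The flag is only meaningful when the query goes to a validating resolver; an authoritative server answering directly does not set it.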
Damage Report
- Downtime: 1-4 hours of failed name resolution for clients behind DNSSEC-validating resolvers
- Data loss: None directly, but dependent services may lose in-flight data
- Customer impact: Timeouts, connection failures, or the affected domains being completely unreachable
- Engineering time to remediate: 8-16 engineer-hours, including verification of the full chain of trust
- Reputation cost: DNS team credibility damaged; possible SLA credits to internal customers
What the Primer Teaches
- Footgun #1: The primer's section on key rollover would have taught the engineer to plan a key rollover schedule and automate it with tools like OpenDNSSEC.
- Footgun #2: The primer's section on DS records would have taught the engineer to publish the DS record at the parent zone and verify the chain with `dig +dnssec`.
- Footgun #3: The primer's section on clock skew would have taught the engineer to run NTP on all DNS servers and monitor clock drift as a critical metric.
- Footgun #4: The primer's section on algorithm mismatch would have taught the engineer to verify algorithm support with the registrar before generating keys.
Cross-References
- Primer — The right way
- Footguns — The mistakes catalogued
- Street Ops — How to do it in practice