# What Happens When Your Certificate Expires at 3am
- Topics: TLS lifecycle, cert-manager, ACME/Let's Encrypt, HSTS, clock skew, monitoring
- Level: L2 (Operations)
- Time: 45–60 minutes
- Prerequisites: Basic TLS awareness helpful (see "What Happens When You Click a Link")

## The Mission
3:07 AM. PagerDuty fires: "SSL certificate expired for app.example.com." Every user sees a full-page browser warning. There's no "proceed anyway" for HSTS-enabled sites. Your app is effectively down — not because anything is broken, but because a file expired.
Certificate expiry is uniquely frustrating: the fix is simple (renew the cert), the impact is total (100% of HTTPS traffic), and the cause is almost always "nobody was watching the expiration date."
## Why Certificates Expire
Certificates have expiration dates because revocation doesn't work well:
- CRLs (Certificate Revocation Lists) require clients to download and check a list, and most don't. The lists are huge and slow.
- OCSP (Online Certificate Status Protocol) requires clients to check with the CA in real time, which adds latency and makes the CA a single point of failure.
Since revocation is unreliable, short lifetimes limit the damage window. If a private key is compromised and the cert expires in 90 days, the attacker has 90 days of use — not 10 years.
The industry has progressively shortened maximum certificate validity:

- 2012: 5 years (60 months)
- 2015: 3 years (39 months)
- 2018: 2 years (825 days)
- 2020: 1 year (398 days)
- 2025: the CA/Browser Forum adopts Apple's proposal to phase the maximum down to 200 days (March 2026), 100 days (March 2027), and 47 days (March 2029)
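To get a feel for what shorter lifetimes mean operationally, here is a minimal sketch that estimates how many renewals per year a single certificate needs. The function name `renewals_per_year` and the renewal margins are assumptions chosen for illustration, not values from any CA or tool.

```python
def renewals_per_year(lifetime_days: float, renew_before_days: float) -> float:
    """Approximate renewals per year when each certificate is replaced
    renew_before_days ahead of its expiry."""
    interval = lifetime_days - renew_before_days
    if interval <= 0:
        raise ValueError("renew_before_days must be shorter than the lifetime")
    return 365 / interval


# Hypothetical renewal margins, picked only to make the comparison concrete.
for lifetime, margin in [(398, 30), (90, 30), (47, 15)]:
    rate = renewals_per_year(lifetime, margin)
    print(f"{lifetime}-day cert, renewed {margin} days early: {rate:.1f} renewals/year")
```

Under these assumptions a 398-day certificate needs about one renewal a year, a 90-day certificate about six, and a 47-day certificate more than eleven, per certificate.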
Trivia: Let's Encrypt certificates have always been 90-day, pushing the industry toward automation. Before Let's Encrypt (2015), certificates cost $10-$300/year and were manually installed. Let's Encrypt has issued over 4 billion certificates, making HTTPS the default rather than the exception.
## The HSTS Trap
If your site has ever sent a header like `Strict-Transport-Security: max-age=63072000`, then browsers will refuse to connect over HTTP for 2 years (63072000 seconds) after they last saw it. If the cert expires, browsers don't fall back to HTTP. They show an error with no bypass option:

```text
Your connection is not private
NET::ERR_CERT_DATE_INVALID
[There is no "Proceed" button; HSTS prevents it]
```
With HSTS, an expired certificate = complete outage. Without HSTS, users can click through the warning (bad practice, but at least the site works).
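To see exactly what a browser remembers, it can help to break an HSTS header value into its directives. The sketch below is a simplified parser, not a full RFC 6797 implementation; `parse_hsts` is a hypothetical helper name and the sample value is an assumed example.

```python
def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into its directives."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for part in header_value.split(";"):
        directive = part.strip()
        if not directive:
            continue
        name, _, value = directive.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


# Assumed example value; check what your own site actually sends.
policy = parse_hsts("max-age=63072000; includeSubDomains; preload")
print(policy["max_age"] / 86400, "days")
```

For this value the `max_age` works out to 730 days, which is the two-year window during which the browser will refuse plain HTTP.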
Gotcha: HSTS with `includeSubDomains` and `preload` is nearly irreversible. Once your domain is in the browser HSTS preload list, removing it takes months (browser release cycles). If you lose the ability to serve HTTPS on any subdomain, that subdomain is completely unreachable.
## Automated Certificate Management
### cert-manager (Kubernetes)
cert-manager watches for Certificate resources and automatically obtains and renews them from Let's Encrypt (or other ACME CAs):
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app-tls
spec:
  secretName: app-tls-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - app.example.com
    - api.example.com
  renewBefore: 360h  # Renew 15 days before expiry
```
cert-manager handles the entire lifecycle: create, renew, store in K8s Secret. Your Ingress just references the Secret. Renewal is automatic.
```bash
# Check certificate status
kubectl get certificate
# → NAME      READY   SECRET           AGE
# → app-tls   True    app-tls-secret   45d

# Check when it expires
kubectl describe certificate app-tls | grep "Not After"
# → Not After: 2026-06-20T14:23:00Z

# Force renewal
kubectl delete secret app-tls-secret
# cert-manager detects the missing secret and re-issues
# (if cmctl is installed, `cmctl renew app-tls` re-issues without deleting the secret first)
```
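The `renewBefore` arithmetic is easy to check by hand. The sketch below models only the date calculation (expiry minus the renewal margin), assuming the `360h` setting and the "Not After" timestamp from the examples above. It is not cert-manager's actual code, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_after: datetime, renew_before: timedelta) -> datetime:
    """Earliest moment at which renewal should start."""
    return not_after - renew_before


def renewal_due(not_after: datetime, renew_before: timedelta, now: datetime) -> bool:
    """True once the current time has reached the renewal point."""
    return now >= renewal_time(not_after, renew_before)


not_after = datetime(2026, 6, 20, 14, 23, tzinfo=timezone.utc)
renew_before = timedelta(hours=360)  # 360h, i.e. 15 days

print(renewal_time(not_after, renew_before).isoformat())
```

With these inputs the renewal point is 2026-06-05T14:23:00+00:00, which leaves 15 days to notice and fix a failed renewal before users see an error.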
### Certbot (non-Kubernetes)

```bash
# Initial certificate
certbot --nginx -d app.example.com

# Auto-renewal (certbot installs a systemd timer)
systemctl list-timers | grep certbot
# → certbot.timer ... 2x daily

# Test renewal
certbot renew --dry-run
```
Gotcha: certbot's HTTP-01 challenge requires port 80 to be reachable from the internet. If your firewall blocks port 80, renewal fails, and unless someone watches the renewal logs nobody notices until the cert expires. Use the DNS-01 challenge for internal services or wildcard certificates.
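A quick way to catch a blocked port before renewal day is a plain TCP connect test. The sketch below uses only the standard library, and `port_reachable` is a hypothetical helper name. It shows only whether a TCP connection succeeds from wherever you run it, so run it from outside your network to approximate what the ACME server sees.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `port_reachable("app.example.com", 80)` returning `False` from an external machine is a strong hint that HTTP-01 validation will fail too.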
## Clock Skew: The Invisible Certificate Killer
Certificates have "Not Before" and "Not After" timestamps, and whichever machine validates the certificate compares them against its own clock. If that clock is wrong, a perfectly valid certificate appears expired (or not yet valid).
```bash
# Check system time
timedatectl
# → System clock synchronized: yes
# → NTP service: active

# Step the clock to the correct time immediately (chrony)
chronyc makestep

# Make sure NTP synchronization is enabled
timedatectl set-ntp true
```
Container clocks inherit from the host. If the host's NTP is broken, every container on that host sees wrong time → certificate validation fails for all outbound HTTPS connections.
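The effect of skew is simple to model: validation compares the local clock against the certificate's validity window. The sketch below is a simplified illustration of that comparison, not what any TLS library does internally. The timestamps and the name `validity_status` are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone


def validity_status(not_before: datetime, not_after: datetime, now: datetime) -> str:
    """Classify a certificate's validity window against a given clock reading."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


not_before = datetime(2026, 3, 22, 14, 23, tzinfo=timezone.utc)
not_after = datetime(2026, 6, 20, 14, 23, tzinfo=timezone.utc)
# Assume the real time is three minutes before the certificate expires.
true_now = datetime(2026, 6, 20, 14, 20, tzinfo=timezone.utc)

for skew in (timedelta(0), timedelta(minutes=5)):
    print(skew, validity_status(not_before, not_after, true_now + skew))
```

With a correct clock the certificate is still valid. With the clock five minutes ahead the same certificate is reported as expired. A clock that runs behind causes the mirror-image failure right after issuance, when a fresh certificate looks "not yet valid".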
War Story: A datacenter's NTP server drifted 5 minutes ahead after a firmware update disabled NTP synchronization. Gradually, BMC management certificates "expired" (they hadn't — the clock was wrong). Over 72 hours, more and more management interfaces became unreachable. The monitoring system detected the drift but the alert was buried in a Slack channel with 800 alerts per day. By the time someone investigated, 30% of server management was inaccessible.
## Monitoring Certificate Expiry

```bash
# Check a remote certificate's expiration
echo | openssl s_client -connect app.example.com:443 2>/dev/null | \
  openssl x509 -noout -dates
# → notBefore=Mar 22 14:23:00 2026 GMT
# → notAfter=Jun 20 14:23:00 2026 GMT

# Check whether the cert expires within 30 days (2592000 seconds)
echo | openssl s_client -connect app.example.com:443 2>/dev/null | \
  openssl x509 -noout -checkend 2592000
# Exit code 0 = valid for at least 30 more days
# Exit code 1 = expires within 30 days
```
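If you would rather get an actual number of days than an exit code, the same check can be sketched in Python with the standard `ssl` module. The helper names `days_until_expiry` and `fetch_not_after` are hypothetical. One caveat: `fetch_not_after` uses a verifying context, so a certificate that has already expired raises `ssl.SSLCertVerificationError` instead of returning a date, which a monitor should treat as "already broken".

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` (timezone-aware) and a notAfter string such as
    'Jun 20 14:23:00 2026 GMT'. Negative means already expired."""
    expiry_ts = ssl.cert_time_to_seconds(not_after)
    return (expiry_ts - now.timestamp()) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port and return the leaf certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Offline example using the notAfter value from the openssl output.
sample_now = datetime(2026, 5, 21, 14, 23, tzinfo=timezone.utc)
print(days_until_expiry("Jun 20 14:23:00 2026 GMT", sample_now))
```

In a real check you would pass `fetch_not_after("app.example.com")` and `datetime.now(timezone.utc)` instead of the sample values.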
### Prometheus monitoring

```yaml
# Alert when cert expires in < 14 days
- alert: CertificateExpiringSoon
  expr: probe_ssl_earliest_cert_expiry - time() < 14 * 86400
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: "Certificate for {{ $labels.instance }} expires in {{ $value | humanizeDuration }}"
```
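The alert expression boils down to "seconds remaining below a threshold". The sketch below restates that logic in Python with two tiers: the 14-day warning from the rule, and a 3-day critical tier matching the thresholds recommended in the takeaways. The constant and function names are hypothetical, and this is an illustration of the comparison rather than a replacement for the Prometheus rule.

```python
from typing import Optional

WARNING_SECONDS = 14 * 86400   # same threshold as the alert rule
CRITICAL_SECONDS = 3 * 86400   # assumed second tier for "fix now"


def expiry_severity(expiry_ts: float, now_ts: float) -> Optional[str]:
    """Map seconds remaining until expiry to an alert severity, or None."""
    remaining = expiry_ts - now_ts
    if remaining < CRITICAL_SECONDS:
        return "critical"
    if remaining < WARNING_SECONDS:
        return "warning"
    return None
```

Both arguments are Unix timestamps, the same units as `probe_ssl_earliest_cert_expiry` and `time()` in the expression.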
## The Diagnostic Ladder

```text
"SSL certificate expired" / ERR_CERT_DATE_INVALID
│
├── Is the cert actually expired?
│   echo | openssl s_client -connect host:443 | openssl x509 -dates
│
├── Is the clock wrong?
│   timedatectl → NTP synchronized?
│   date → is system time correct?
│
├── Is it the right cert?
│   openssl s_client shows cert chain → correct hostname in SAN?
│   └── Wrong cert loaded? Check Ingress/Nginx config
│
├── Is auto-renewal failing?
│   kubectl describe certificate → check events
│   certbot renew --dry-run → test renewal
│   └── ACME challenge failing? Check port 80 / DNS
│
└── HSTS preventing access?
    Browsers cache HSTS → can't bypass warning
    Fix: renew cert (only option)
```
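For the "is it the right cert?" rung, the question is whether the hostname appears in the certificate's SAN list. The sketch below is a deliberately simplified matcher, assuming a wildcard may only stand in for the whole leftmost label. Real clients follow stricter rules, and `hostname_matches_san` is a hypothetical helper name.

```python
from typing import Iterable


def hostname_matches_san(hostname: str, san_dns_names: Iterable[str]) -> bool:
    """Return True if hostname matches any DNS name in the SAN list."""
    host_labels = hostname.lower().rstrip(".").split(".")
    for pattern in san_dns_names:
        pattern_labels = pattern.lower().rstrip(".").split(".")
        if len(pattern_labels) != len(host_labels):
            continue  # a wildcard covers exactly one label, never several
        if pattern_labels[0] == "*":
            if pattern_labels[1:] == host_labels[1:]:
                return True
        elif pattern_labels == host_labels:
            return True
    return False


print(hostname_matches_san("app.example.com", ["app.example.com", "api.example.com"]))
print(hostname_matches_san("admin.example.com", ["app.example.com", "api.example.com"]))
```

The first call prints `True` and the second prints `False`, which is the "wrong cert loaded" case: the certificate is valid, just not for this hostname.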
## Flashcard Check
Q1: Why do certificates expire instead of lasting forever?
Because revocation doesn't work reliably (CRLs too large, OCSP adds latency). Short lifetimes limit the damage window of a compromised private key.
Q2: HSTS is set. Certificate expires. What happens?
Browser shows error with NO bypass option. Complete outage for HTTPS traffic. Must renew the certificate — there is no workaround.
Q3: cert-manager renews certificates automatically. What triggers renewal?
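The `renewBefore` setting. If it is unset, cert-manager renews once two-thirds of the certificate's lifetime has passed, which is 30 days before expiry for a 90-day certificate. cert-manager watches Certificate resources and re-issues when renewal time arrives.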
Q4: Server clock is 5 minutes ahead. What breaks?
Certificate validation. A valid cert appears "expired" if the server clock is past the "Not After" timestamp. Fix: enable NTP (`timedatectl set-ntp true`).
## Cheat Sheet
| Task | Command |
|---|---|
| Check cert expiry | `echo \| openssl s_client -connect host:443 2>/dev/null \| openssl x509 -noout -dates` |
| Check cert SAN | `echo \| openssl s_client -connect host:443 2>/dev/null \| openssl x509 -noout -ext subjectAltName` |
| Check cert chain | `openssl s_client -connect host:443 -showcerts` |
| Test renewal | `certbot renew --dry-run` |
| K8s cert status | `kubectl get certificate` |
| Force K8s renewal | `kubectl delete secret CERT_SECRET_NAME` |
| Check NTP | `timedatectl` |
| Force NTP sync | `chronyc makestep` |
## Takeaways
- Automate certificate renewal. cert-manager for Kubernetes, certbot for everything else. Manual renewal is a guaranteed 3am page.
- HSTS makes expiry catastrophic. No bypass, no fallback. If you enable HSTS, your renewal automation MUST be bulletproof.
- Monitor expiry with alerts. 14-day warning (time to fix) and 3-day critical (fix NOW). Don't discover expiry from user complaints.
- Check the clock. NTP drift causes certificate validation failures that look exactly like expiry. `timedatectl` is the first check.
- The industry is moving to 47-day certificates. Manual renewal is becoming impossible. Automation is not optional; it's the only path.
## Related Lessons
- What Happens When You Click a Link — TLS handshake and certificate chain verification
- Why DNS Is Always the Problem — DNS-01 challenge for cert renewal
- How Incident Response Actually Works — when the cert page fires at 3am