TLS & PKI Drills
Remember: The TLS handshake in 4 steps: ClientHello (supported ciphers + SNI hostname) -> ServerHello (chosen cipher + certificate) -> Key Exchange (agree on session key) -> Encrypted Data. Most TLS failures happen at step 2: wrong cert, expired cert, or CA not trusted. Mnemonic: "CSKE" — Client, Server, Key, Encrypted.
Debug clue: `openssl s_client -connect host:443 -servername host` is the Swiss Army knife for TLS debugging. It shows the full certificate chain, expiry dates, the negotiated cipher, and any verification errors. Add `-showcerts` to see intermediate certificates; a missing intermediate is the #1 cause of "works in Chrome, fails in curl."
Gotcha: cert-manager's `Certificate` resource creates a `Secret` containing `tls.crt` and `tls.key`. If you delete the Secret manually, cert-manager recreates it, but if you delete the `Certificate` resource, the Secret is orphaned and stops being renewed. Always manage the `Certificate` resource, not the Secret directly.
Drill 1: Check Certificate Expiry
Difficulty: Easy
Q: Check when the TLS certificate stored in the Kubernetes Secret myapp-tls expires.
Answer
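One way to answer this, as a minimal sketch: it assumes `myapp-tls` is a standard TLS Secret in the current namespace with the certificate under the `tls.crt` key. The helper name `cert_dates` is hypothetical and exists only to keep the decoding step reusable.

```shell
# Hypothetical helper: read a base64-encoded PEM certificate on stdin
# and print its validity window.
cert_dates() {
  base64 -d | openssl x509 -noout -dates
}

# Against a cluster (assumes kubectl access; no -n flag, so the current
# namespace is used):
#   kubectl get secret myapp-tls -o jsonpath='{.data.tls\.crt}' | cert_dates
```
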
Output shows `notBefore` and `notAfter` dates.
Drill 2: Check Live Server Certificate
Difficulty: Easy
Q: Check the TLS certificate of a live server at api.example.com:443 from the command line.
Answer
# View cert details
openssl s_client -connect api.example.com:443 -servername api.example.com </dev/null 2>/dev/null | \
openssl x509 -noout -text
# Just expiry dates
openssl s_client -connect api.example.com:443 -servername api.example.com </dev/null 2>/dev/null | \
openssl x509 -noout -dates
# Check the full chain
openssl s_client -connect api.example.com:443 -servername api.example.com -showcerts </dev/null
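The expiry dates above can also be turned into a pass/fail check, which is handy in scripts and cron jobs. The sketch below relies on `openssl x509 -checkend`, which exits 0 when the certificate is still valid the given number of seconds from now. The helper name `expires_within_days` and the `/tmp/api.pem` path are assumptions for illustration.

```shell
# Hypothetical helper: succeed (exit 0) if the PEM certificate in file $1
# expires within $2 days, fail otherwise.
expires_within_days() {
  # -checkend exits 0 when the cert will NOT expire in the window,
  # so the result is inverted here.
  ! openssl x509 -checkend "$(( $2 * 86400 ))" -noout -in "$1" >/dev/null
}

# Usage against a live server (requires network access):
#   openssl s_client -connect api.example.com:443 -servername api.example.com \
#     </dev/null 2>/dev/null | openssl x509 -out /tmp/api.pem
#   expires_within_days /tmp/api.pem 14 && echo "renew soon"
```
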
Drill 3: Verify Key Matches Certificate
Difficulty: Easy
Q: How do you verify that a private key file matches a certificate file?
Answer
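A minimal sketch of one approach: derive the public key from both files and compare them. Comparing public keys works for RSA and ECDSA alike, whereas the common modulus-hash comparison is RSA-only. The helper name `key_matches_cert` and the file names in the usage line are placeholders.

```shell
# Hypothetical helper: succeed if the private key in $2 belongs to the
# certificate in $1. Returns 2 if either file cannot be parsed.
key_matches_cert() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# Usage:
#   key_matches_cert tls.crt tls.key && echo "match" || echo "MISMATCH"
```
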
This is essential when debugging "certificate/key mismatch" errors in Ingress or load balancers.
Drill 4: Create a cert-manager Certificate
Difficulty: Medium
Q: Write a cert-manager Certificate resource for app.example.com and www.example.com using a Let's Encrypt ClusterIssuer, with auto-renewal 15 days before expiry.
Answer
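A sketch of a manifest that fits the question. The resource name `app-tls`, the `production` namespace, the Secret name `app-tls-secret`, and the `letsencrypt-prod` ClusterIssuer are taken from the other drills on this page. The output file name and the explicit `duration` are assumptions; `renewBefore: 360h` is the 15-day renewal window.

```shell
# Write the manifest to a file (file name is a placeholder).
cat > app-tls-certificate.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app-tls
  namespace: production
spec:
  secretName: app-tls-secret
  duration: 2160h      # 90 days, matching the usual Let's Encrypt lifetime (assumption)
  renewBefore: 360h    # renew 15 days before expiry
  dnsNames:
    - app.example.com
    - www.example.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
EOF

# Apply it (requires cluster access):
#   kubectl apply -f app-tls-certificate.yaml
```
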
cert-manager creates a CertificateRequest → Order → Challenge, then stores the signed cert in `app-tls-secret`.
Drill 5: Debug cert-manager Renewal Failure
Difficulty: Hard
Q: A Certificate shows READY: False and hasn't renewed. Walk through the debugging chain.
Answer
# 1. Certificate status
kubectl describe certificate app-tls -n production
# Look for: conditions, lastTransitionTime, message
# 2. CertificateRequest
kubectl get certificaterequest -n production
kubectl describe certificaterequest <name> -n production
# Look for: conditions, approval status
# 3. Order (ACME)
kubectl get orders -n production
kubectl describe order <name> -n production
# 4. Challenge (where it usually fails)
kubectl get challenges -n production
kubectl describe challenge <name> -n production
# Look for: state, reason, presented
# 5. cert-manager controller logs
kubectl logs -n cert-manager deploy/cert-manager --tail=200 | grep -i error
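# 6. Force renewal (needs the cert-manager kubectl plugin; with the standalone CLI: cmctl renew app-tls -n production)
kubectl cert-manager renew app-tls -n production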
Drill 6: HTTP-01 vs DNS-01 Challenge
Difficulty: Easy
Q: When would you use DNS-01 instead of HTTP-01 for ACME challenges?
Answer
Use DNS-01 when:
- You need **wildcard certificates** (`*.example.com`); only DNS-01 supports this
- The cluster is **not publicly accessible** (private/internal clusters)
- Port 80 is **blocked** by firewall or security policy
- You're behind a **CDN** that caches the challenge path
Use HTTP-01 when:
- Simple setup, cluster is publicly accessible on port 80
- You don't have DNS API access
- Quick setup without DNS provider integration
Drill 7: Internal CA with cert-manager
Difficulty: Medium
Q: Set up cert-manager to issue certificates from an internal CA for service-to-service TLS within the cluster.
Answer
# 1. Create a self-signed issuer to bootstrap
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-bootstrap
spec:
  selfSigned: {}
---
# 2. Create the CA certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca
  namespace: cert-manager
spec:
  isCA: true
  secretName: internal-ca-secret
  commonName: Internal CA
  duration: 87600h # 10 years
  issuerRef:
    name: selfsigned-bootstrap
    kind: ClusterIssuer
---
# 3. Create a CA issuer using the CA cert
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca-issuer
spec:
  ca:
    secretName: internal-ca-secret
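Once `internal-ca-issuer` exists, a workload requests a certificate from it like any other issuer. The sketch below shows what that could look like for a hypothetical `orders-api` Service in the `production` namespace; the service name, Secret name, and file name are all illustrative assumptions.

```shell
# Write a Certificate that is signed by the internal CA issuer.
cat > orders-api-certificate.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: orders-api-tls              # hypothetical service certificate
  namespace: production
spec:
  secretName: orders-api-tls-secret
  dnsNames:
    - orders-api.production.svc
    - orders-api.production.svc.cluster.local
  issuerRef:
    name: internal-ca-issuer
    kind: ClusterIssuer
EOF

# Apply it (requires cluster access):
#   kubectl apply -f orders-api-certificate.yaml
```

Clients of the service then need to trust the internal CA. With a CA issuer, cert-manager also writes the CA certificate into the resulting Secret under `ca.crt`, which is one place to source it from.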
Drill 8: Ingress TLS Termination
Difficulty: Easy
Q: Configure an Ingress to terminate TLS using cert-manager auto-provisioning.
Answer
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls-secret # cert-manager creates this
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
Drill 9: Cert Expiry Alerting
Difficulty: Medium
Q: Write a Prometheus alert rule that fires when any cert-manager certificate will expire within 14 days.
Answer
groups:
  - name: tls-alerts
    rules:
      - alert: CertificateExpiringSoon
        expr: |
          certmanager_certificate_expiration_timestamp_seconds - time() < 14 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate {{ $labels.name }} in {{ $labels.namespace }} expires in < 14 days"
          # $value is the number of seconds remaining, so format it as a duration
          description: "Expires in {{ $value | humanizeDuration }}"
      - alert: CertificateExpiryCritical
        expr: |
          certmanager_certificate_expiration_timestamp_seconds - time() < 3 * 24 * 3600
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Certificate {{ $labels.name }} expires in < 3 days!"
      - alert: CertificateNotReady
        expr: |
          certmanager_certificate_ready_status{condition="False"} == 1
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Certificate {{ $labels.name }} is not ready"
Drill 10: Debug "Certificate Not Valid For" Error
Difficulty: Medium
Q: curl returns SSL: certificate subject name 'old.example.com' does not match target host name 'app.example.com'. How do you fix this?
Answer
# 1. Check the current cert's SANs (Subject Alternative Names)
kubectl get secret app-tls-secret -n production -o jsonpath='{.data.tls\.crt}' | \
base64 -d | openssl x509 -noout -text | grep -A1 "Subject Alternative Name"
# 2. Check the Certificate resource
kubectl get certificate app-tls -n production -o yaml | grep -A5 dnsNames
# 3. Fix: update the Certificate to include the correct hostname
kubectl edit certificate app-tls -n production
# Add app.example.com to dnsNames
# 4. Delete the old secret to force re-issuance
kubectl delete secret app-tls-secret -n production
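# 5. Or force renewal (needs the cert-manager kubectl plugin; with the standalone CLI: cmctl renew app-tls -n production)
kubectl cert-manager renew app-tls -n production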
# 6. Verify new cert
kubectl get secret app-tls-secret -n production -o jsonpath='{.data.tls\.crt}' | \
base64 -d | openssl x509 -noout -text | grep -A1 "Subject Alternative Name"
Wiki Navigation
Prerequisites
- TLS & Certificates Ops (Topic Pack, L1)
Related Content
- Case Study: BMC Clock Skew Cert Failure (Case Study, L2) — TLS & PKI
- Case Study: DNS Looks Broken — TLS Expired, Fix Is Cert-Manager (Case Study, L2) — TLS & PKI
- Case Study: Deployment Stuck — ImagePull Auth Failure, Vault Secret Rotation (Case Study, L2) — TLS & PKI
- Case Study: SSL Cert Chain Incomplete (Case Study, L1) — TLS & PKI
- Case Study: User Auth Failing — OIDC Cert Expired, Cloud KMS Rotation (Case Study, L2) — TLS & PKI
- Deep Dive: TLS Handshake (deep_dive, L2) — TLS & PKI
- HTTP Protocol (Topic Pack, L0) — TLS & PKI
- Interview: Certificate Expired (Scenario, L2) — TLS & PKI
- Networking Deep Dive (Topic Pack, L1) — TLS & PKI
- Nginx & Web Servers (Topic Pack, L1) — TLS & PKI