TLS & Certificates Ops - Primer¶
Why This Matters¶
TLS is the encryption layer that protects every HTTPS connection, every API call, every service-to-service communication in modern infrastructure. When TLS works, nobody thinks about it. When a certificate expires at 3 AM on a Saturday, everything breaks — browsers show scary warnings, API clients refuse connections, webhooks stop firing, and your monitoring system (which also uses TLS) might fail to alert you about it.
Certificate management is one of the most common causes of production outages. It is entirely preventable with the right automation and monitoring, but it requires understanding the full chain: how certificates are issued, how they are validated, how they expire, and how to debug failures when the chain breaks.
The TLS Handshake¶
When a client connects to a TLS-enabled server, they negotiate a secure channel before any application data is exchanged.
Client                                        Server
  |                                           |
  |--- ClientHello --------------------------→|
  |    (TLS version, supported ciphers,       |
  |     SNI: app.example.com)                 |
  |                                           |
  |←--- ServerHello --------------------------|
  |    (chosen cipher, session ID)            |
  |                                           |
  |←--- Certificate --------------------------|
  |    (server cert + intermediate chain)     |
  |                                           |
  |    [Client verifies certificate chain     |
  |     against its trusted CA store]         |
  |                                           |
  |--- Key Exchange (ECDHE) -----------------→|
  |                                           |
  |    [Both sides derive session keys        |
  |     from the key exchange material]       |
  |                                           |
  |←=========== Encrypted Traffic ===========→|
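TLS 1.3 simplified this to a single round trip (1-RTT) for a full handshake, and resumed connections can send data with zero round trips (0-RTT), although that early data is not protected against replay. A full TLS 1.2 handshake requires two round trips. TLS 1.0 and 1.1 are deprecated (RFC 8996) and should be disabled.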
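To make the round-trip counts concrete, here is a small arithmetic sketch. It assumes a fixed network round-trip time, adds one round trip for the TCP handshake, and ignores server processing time. The function name and the 50 ms figure are illustrative, not measurements.

```python
# Round trips spent in the TLS handshake before the client can send its
# first byte of application data, using the counts given above.
TLS_HANDSHAKE_ROUND_TRIPS = {
    "TLS 1.2 (full handshake)": 2,
    "TLS 1.3 (full handshake)": 1,
    "TLS 1.3 (0-RTT resumption)": 0,
}
TCP_HANDSHAKE_ROUND_TRIPS = 1  # assumption: a fresh TCP connection each time


def setup_delay_ms(mode: str, rtt_ms: float) -> float:
    """Return the connection setup delay in milliseconds for a given RTT."""
    round_trips = TCP_HANDSHAKE_ROUND_TRIPS + TLS_HANDSHAKE_ROUND_TRIPS[mode]
    return round_trips * rtt_ms


if __name__ == "__main__":
    for mode in TLS_HANDSHAKE_ROUND_TRIPS:
        print(f"{mode}: {setup_delay_ms(mode, rtt_ms=50):.0f} ms")
```

On a 50 ms link this model gives 150 ms for TLS 1.2, 100 ms for TLS 1.3, and 50 ms for 0-RTT resumption, which is why the difference is most visible on high-latency connections.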
Timeline: TLS 1.3 was published as RFC 8446 in August 2018 after 28 drafts over 4 years — one of the most reviewed RFCs in IETF history. It removed all legacy algorithms (RSA key exchange, CBC ciphers, RC4, SHA-1) in a single version bump, which is why it was such a long process.
SNI (Server Name Indication)¶
SNI is a TLS extension that lets the client tell the server which hostname it is connecting to during the handshake. This is essential for hosting multiple TLS-enabled sites on a single IP address. Without SNI, the server would only learn the hostname from the HTTP Host header, which arrives after the handshake has finished and a certificate has already been presented.
# The -servername flag sends the SNI extension
openssl s_client -connect 203.0.113.50:443 -servername app.example.com
# Without SNI, you may get the wrong certificate or a default cert
openssl s_client -connect 203.0.113.50:443
Some legacy clients (old Java versions, old cURL builds, certain IoT devices) do not send SNI. If you serve multiple domains on one IP and a client does not send SNI, the server returns its default certificate, which may not match the requested domain.
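The effect of SNI can be observed without any network access. The sketch below uses Python's standard `ssl` module with in-memory BIOs to produce a ClientHello and then looks for the hostname in the raw bytes. The helper name `client_hello` is hypothetical, and the check relies on the hostname being sent in cleartext, which is how Python's `ssl` module behaves.

```python
import ssl
from typing import Optional


def client_hello(server_hostname: Optional[str]) -> bytes:
    """Build a ClientHello in memory (no network) and return its raw bytes."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Nothing is verified here: the handshake never gets past the first message.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the ClientHello is written, then we wait for the server
    return outgoing.read()


if __name__ == "__main__":
    with_sni = client_hello("app.example.com")
    without_sni = client_hello(None)
    print(b"app.example.com" in with_sni)     # hostname is present in the hello
    print(b"app.example.com" in without_sni)  # no SNI extension, no hostname
```

Passing `server_hostname` is the Python equivalent of the `-servername` flag: a client that omits it behaves like the legacy clients described above.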
The Certificate Chain¶
Certificates form a trust chain from the server's leaf certificate up to a root CA that the client trusts.
Root CA (self-signed, pre-installed in OS/browser trust stores)
  |
  +-- Intermediate CA (signed by Root CA)
        |
        +-- Leaf Certificate (signed by Intermediate CA)
              Subject: app.example.com
              SAN: app.example.com, api.example.com
              Valid: 2026-01-01 to 2026-04-01
              Issuer: Let's Encrypt R3
The server must send the leaf certificate and all intermediate certificates. The root CA certificate is NOT sent — clients already have it in their trust store. If you send only the leaf without the intermediate, some clients will fail to build the chain and reject the connection.
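The chain-building rule can be sketched as a walk over issuer links. This is a deliberately simplified model: certificates are matched by name only, whereas real validation also checks signatures, validity dates, and constraints. The `Cert` type and the names used in the example are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set


class Cert(NamedTuple):
    subject: str
    issuer: str


def chain_reaches_root(sent: List[Cert], trusted_roots: Set[str]) -> bool:
    """Walk issuer links from the leaf (sent[0]) using only what the server sent."""
    if not sent:
        return False
    by_subject: Dict[str, Cert] = {c.subject: c for c in sent}
    current = sent[0]
    seen = set()
    while current.subject not in seen:
        seen.add(current.subject)
        if current.issuer in trusted_roots:
            return True   # the next link is a root the client already has
        nxt = by_subject.get(current.issuer)
        if nxt is None:
            return False  # issuer was not sent and is not a trusted root
        current = nxt
    return False          # the issuer links loop back on themselves


if __name__ == "__main__":
    leaf = Cert("app.example.com", "Example Intermediate")
    intermediate = Cert("Example Intermediate", "Example Root")
    roots = {"Example Root"}
    print(chain_reaches_root([leaf, intermediate], roots))  # full chain sent
    print(chain_reaches_root([leaf], roots))                # intermediate missing
```

A strict client behaves like the second call: with only the leaf available, it cannot connect the certificate to anything in its trust store.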
Why Browsers Work But API Clients Fail¶
Browsers are forgiving. Chrome and Firefox cache intermediate certificates and can often complete a chain even if the server does not send it. cURL, Python requests, Java HttpClient, and most API clients are strict — if the server does not send the full chain, they fail with "unable to verify certificate."
This is the most common TLS debugging trap: "it works in the browser" does not mean the chain is correct.
Debug clue: The fastest way to check whether a server sends the full certificate chain is to count the certificates listed under "Certificate chain" in the s_client output:
openssl s_client -connect host:443 -servername host </dev/null 2>/dev/null | grep -cE '^ *[0-9]+ s:'
A count of 1 means the server sent only the leaf cert and no intermediate, so the chain is incomplete. Browsers will work; cURL and API clients will fail.
X.509 Certificate Fields¶
| Field | Purpose | Example |
|---|---|---|
| Subject (CN) | Primary identity (legacy) | CN=app.example.com |
| Subject Alternative Name (SAN) | All valid identities (current standard) | DNS:app.example.com, DNS:api.example.com, IP:10.0.2.100 |
| Issuer | Who signed this certificate | CN=R3, O=Let's Encrypt |
| Not Before / Not After | Validity period | 2026-01-01 00:00:00 / 2026-04-01 00:00:00 |
| Serial Number | Unique identifier from the CA | 03:a1:b2:c3... |
| Key Usage | Permitted cryptographic operations | Digital Signature, Key Encipherment |
| Extended Key Usage | Permitted application purposes | TLS Web Server Authentication, TLS Web Client Authentication |
| Basic Constraints | Whether this is a CA certificate | CA:FALSE (leaf) or CA:TRUE (CA cert) |
SAN is the standard. Modern TLS libraries check SAN first. Older stacks fall back to the CN field only when SAN is absent, and some clients ignore CN entirely (Chrome since version 58, Go since 1.15). Always include all hostnames and IPs in the SAN field.
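A strict SAN-only identity check might look like the sketch below. It assumes the certificate is given as the dictionary shape returned by Python's `ssl.SSLSocket.getpeercert()`, does exact matching only (wildcards are left out on purpose), and never consults the CN. The function name and the sample certificate are hypothetical.

```python
import ipaddress
from typing import Any, Dict


def identity_in_san(cert: Dict[str, Any], host: str) -> bool:
    """Exact-match `host` against SAN entries only; the subject CN is ignored."""
    san = cert.get("subjectAltName", ())
    try:
        wanted_ip = ipaddress.ip_address(host)
    except ValueError:
        wanted_ip = None  # not an IP literal, so treat it as a DNS name
    for kind, value in san:
        if wanted_ip is not None:
            if kind == "IP Address" and ipaddress.ip_address(value) == wanted_ip:
                return True
        elif kind == "DNS" and value.lower() == host.lower().rstrip("."):
            return True
    return False


if __name__ == "__main__":
    cert = {
        "subject": ((("commonName", "legacy.example.com"),),),
        "subjectAltName": (
            ("DNS", "app.example.com"),
            ("DNS", "api.example.com"),
            ("IP Address", "10.0.2.100"),
        ),
    }
    print(identity_in_san(cert, "api.example.com"))     # listed in SAN
    print(identity_in_san(cert, "legacy.example.com"))  # only in CN, so rejected
```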
Certificate Types¶
DV (Domain Validation): Proves you control the domain. Issued automatically by Let's Encrypt, Cloudflare, etc. Sufficient for most operational purposes. Validation methods: HTTP-01, DNS-01, TLS-ALPN-01.
OV (Organization Validation): Proves domain control plus organizational identity. Requires manual verification by the CA. Used by enterprises for public-facing services.
EV (Extended Validation): Stricter organizational verification. Used to show a green bar in browsers (most browsers no longer display this differently from DV). Rarely worth the cost for ops purposes.
Key Types for TLS¶
| Algorithm | Common Sizes | Performance | Compatibility | Recommendation |
|---|---|---|---|---|
| RSA | 2048, 4096 | Slower (especially 4096) | Universal | 2048 minimum, use for maximum compatibility |
| ECDSA | P-256, P-384 | Faster handshakes, smaller keys | Modern clients (post-2015) | Preferred for new deployments |
| Ed25519 | Fixed 256-bit | Fastest | Limited TLS support (not in all stacks yet) | Use for SSH, not yet universal for TLS |
# Generate RSA 2048 key + CSR
openssl req -new -newkey rsa:2048 -nodes -keyout server.key -out server.csr \
-subj "/CN=app.example.com"
# Generate ECDSA P-256 key + CSR
openssl ecparam -genkey -name prime256v1 -out server.key
openssl req -new -key server.key -out server.csr \
-subj "/CN=app.example.com"
# Generate CSR with SAN (required for multi-domain certs)
openssl req -new -newkey rsa:2048 -nodes -keyout server.key -out server.csr \
-subj "/CN=app.example.com" \
-addext "subjectAltName=DNS:app.example.com,DNS:api.example.com,DNS:www.example.com"
Let's Encrypt and the ACME Protocol¶
Let's Encrypt provides free, automated DV certificates using the ACME (Automated Certificate Management Environment) protocol.
Challenge Types¶
HTTP-01: The CA gives you a token. You serve it at http://<domain>/.well-known/acme-challenge/<token>. The CA fetches it to prove you control the domain. Requires port 80 to be reachable from the internet.
DNS-01: The CA gives you a TXT record value. You create _acme-challenge.<domain> TXT <value>. The CA queries DNS to verify. Works for wildcard certificates. Does not require inbound internet access to your server — only DNS API access.
TLS-ALPN-01: Proves control by presenting a self-signed certificate with a specific ACME extension on port 443. Less commonly used.
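Under RFC 8555, the value you serve or publish is not the bare token: it is derived from the token and a thumbprint of your ACME account key. The sketch below shows that derivation for HTTP-01 and DNS-01. It assumes the thumbprint is already available as a base64url string (ACME clients compute it for you); the function names and sample values are hypothetical.

```python
import base64
import hashlib


def key_authorization(token: str, account_thumbprint: str) -> str:
    """HTTP-01 response body: the token joined to the account key thumbprint."""
    return f"{token}.{account_thumbprint}"


def dns01_txt_value(token: str, account_thumbprint: str) -> str:
    """DNS-01 TXT value: base64url(SHA-256(key authorization)), without padding."""
    key_auth = key_authorization(token, account_thumbprint).encode("ascii")
    digest = hashlib.sha256(key_auth).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    token = "example-token"            # made-up value; the CA supplies the real one
    thumbprint = "example-thumbprint"  # made-up value; derived from the account key
    print("HTTP-01 body:", key_authorization(token, thumbprint))
    print("DNS-01 TXT:  ", dns01_txt_value(token, thumbprint))
```

Because both values bind the token to the account key, a token intercepted by another party cannot be used to pass validation for a different ACME account.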
Rate Limits¶
Let's Encrypt enforces rate limits to prevent abuse:
- 50 certificates per registered domain per week
- 5 duplicate certificates per week
- 300 new orders per account per 3 hours
- Failed validation limit: 5 failures per hostname per hour
Always test with the staging environment first: https://acme-staging-v02.api.letsencrypt.org/directory. Staging has much higher rate limits and issues fake (untrusted) certificates.
Cipher Suites¶
A cipher suite defines the algorithms used for each phase of the TLS connection:
| Component | Purpose | Examples |
|---|---|---|
| Key Exchange | Establish shared secret | ECDHE, DHE, RSA (static — avoid) |
| Authentication | Verify server (and client) identity | RSA, ECDSA |
| Encryption | Encrypt application data | AES-128-GCM, AES-256-GCM, ChaCha20-Poly1305 |
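| MAC / PRF hash | Message integrity for non-AEAD ciphers; AEAD ciphers (GCM, ChaCha20-Poly1305) authenticate data themselves and use the hash only for key derivation | SHA256, SHA384 |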
Example cipher suite name: TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
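The naming scheme can be read mechanically. The sketch below splits an IANA-style TLS 1.2 suite name into the four components from the table. It is a string-level illustration only, with a hypothetical function name, and it does not handle TLS 1.3 names, which list just the cipher and hash.

```python
from typing import Dict


def parse_tls12_suite(name: str) -> Dict[str, str]:
    """Split a TLS 1.2 suite name of the form TLS_<kx>_<auth>_WITH_<cipher>_<hash>."""
    if not name.startswith("TLS_") or "_WITH_" not in name:
        raise ValueError(f"not a TLS 1.2 style suite name: {name}")
    handshake, bulk = name[len("TLS_"):].split("_WITH_", 1)
    key_exchange, _, authentication = handshake.partition("_")
    encryption, _, hash_name = bulk.rpartition("_")
    return {
        "key_exchange": key_exchange,
        # Static RSA suites (TLS_RSA_WITH_...) use RSA for both roles.
        "authentication": authentication or key_exchange,
        "encryption": encryption,
        "hash": hash_name,
    }


if __name__ == "__main__":
    print(parse_tls12_suite("TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"))
```

For the example name this yields ECDHE for key exchange, RSA for authentication, AES_256_GCM for encryption, and SHA384 as the hash.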
Recommended Configuration¶
# TLS 1.3 ciphers (all are AEAD suites with forward secrecy; enabled by default and rarely need tuning)
TLS_AES_256_GCM_SHA384
TLS_CHACHA20_POLY1305_SHA256
TLS_AES_128_GCM_SHA256
# TLS 1.2 ciphers (must be configured explicitly)
ECDHE-RSA-AES256-GCM-SHA384
ECDHE-RSA-AES128-GCM-SHA256
ECDHE-ECDSA-AES256-GCM-SHA384
ECDHE-ECDSA-AES128-GCM-SHA256
Avoid: RC4, 3DES, CBC mode ciphers, static RSA key exchange, any cipher without forward secrecy. Use Mozilla's SSL Configuration Generator (ssl-config.mozilla.org) for per-server recommended settings.
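For services that terminate TLS in application code, the same policy can be expressed with Python's standard `ssl` module. This is a minimal sketch, not a complete server: the function name is hypothetical, and a real deployment would also load a certificate and key with `load_cert_chain`.

```python
import ssl

# The TLS 1.2 suites recommended above, in OpenSSL cipher-string syntax.
TLS12_CIPHERS = ":".join([
    "ECDHE-RSA-AES256-GCM-SHA384",
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-ECDSA-AES256-GCM-SHA384",
    "ECDHE-ECDSA-AES128-GCM-SHA256",
])


def hardened_server_context() -> ssl.SSLContext:
    """Return a server-side context limited to TLS 1.2+ and the suites above."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuses TLS 1.0 and 1.1
    ctx.set_ciphers(TLS12_CIPHERS)                # applies to TLS 1.2 suites
    return ctx


if __name__ == "__main__":
    for cipher in hardened_server_context().get_ciphers():
        print(cipher["name"])
```

The printed list normally contains the four TLS 1.2 suites plus whatever TLS 1.3 suites the linked OpenSSL build enables, since `set_ciphers` does not change the TLS 1.3 selection.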
Common TLS Errors Quick Reference¶
| Error | Meaning | Fix |
|---|---|---|
| x509: certificate has expired | Cert not renewed | Renew cert, check cert-manager logs, fix Issuer |
| x509: certificate signed by unknown authority | CA not in trust store | Add CA cert to system trust store or pass --cacert |
| x509: cannot validate certificate for X because it doesn't contain any IP SANs | Missing SAN | Add IP address to certificate SAN field |
| tls: handshake failure | Protocol or cipher mismatch | Check TLS version compatibility, cipher overlap |
| NET::ERR_CERT_COMMON_NAME_INVALID | CN/SAN does not match hostname | Fix certificate DNS names to match the requested hostname |
| unable to verify the first certificate | Missing intermediate cert | Concatenate leaf + intermediate into fullchain |
Certificate Formats¶
| Format | Extension | Description | Common Use |
|---|---|---|---|
| PEM | .pem, .crt, .key | Base64-encoded, ASCII text, -----BEGIN CERTIFICATE----- | Linux, Nginx, Apache, most tools |
| DER | .der, .cer | Binary encoding | Windows, Java (sometimes) |
| PKCS#12 / PFX | .p12, .pfx | Binary bundle (cert + key + chain) | Windows, Java, import/export |
| JKS | .jks | Java KeyStore (legacy) | Java applications |
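PEM and DER carry the same bytes: PEM is the DER encoding, Base64-wrapped between header and footer lines. The sketch below shows the round trip using helpers from Python's standard `ssl` module. The function names are hypothetical, the input is assumed to hold a single certificate (not a full chain file), and the sample bytes are a placeholder rather than a real certificate.

```python
import ssl

PEM_MARKER = b"-----BEGIN CERTIFICATE-----"


def to_der(data: bytes) -> bytes:
    """Accept one certificate in PEM or DER form and return DER bytes."""
    if data.lstrip().startswith(PEM_MARKER):
        return ssl.PEM_cert_to_DER_cert(data.decode("ascii").strip())
    return data  # no PEM header found, so assume it is DER already


def to_pem(data: bytes) -> str:
    """Accept one certificate in PEM or DER form and return PEM text."""
    return ssl.DER_cert_to_PEM_cert(to_der(data))


if __name__ == "__main__":
    placeholder_der = bytes([0x30, 0x03, 0x02, 0x01, 0x05])  # not a real cert
    pem_text = to_pem(placeholder_der)
    print(pem_text)
    print(to_der(pem_text.encode("ascii")) == placeholder_der)
```

These helpers only re-encode the container; they do not parse or validate the certificate inside it.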
Converting Between Formats¶
# PEM to DER
openssl x509 -in cert.pem -outform DER -out cert.der
# DER to PEM
openssl x509 -in cert.der -inform DER -out cert.pem
# PEM cert + key to PKCS#12
openssl pkcs12 -export -in cert.pem -inkey key.pem -out cert.p12 \
-certfile chain.pem -name "app.example.com"
# PKCS#12 to PEM
openssl pkcs12 -in cert.p12 -out all.pem -nodes
# Extract just the leaf cert (drop -clcerts to include the CA chain as well):
openssl pkcs12 -in cert.p12 -clcerts -nokeys -out cert.pem
# Extract just the key:
openssl pkcs12 -in cert.p12 -nocerts -nodes -out key.pem
# JKS to PKCS#12 (Java keytool)
keytool -importkeystore -srckeystore keystore.jks -destkeystore keystore.p12 \
-deststoretype PKCS12
OpenSSL Command Reference¶
# View certificate details from a file
openssl x509 -in cert.pem -noout -text
# View specific fields
openssl x509 -in cert.pem -noout -subject -issuer -dates -serial
# View SAN (Subject Alternative Names)
openssl x509 -in cert.pem -noout -ext subjectAltName
# Verify a certificate chain
openssl verify -CAfile ca-bundle.pem -untrusted intermediate.pem leaf.pem
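# Check if key matches certificate (RSA keys only: compares the modulus)
openssl x509 -noout -modulus -in cert.pem | md5sum
openssl rsa -noout -modulus -in key.pem | md5sum
# If md5sums match, the key matches the cert
# For any key type (including ECDSA), compare the public keys instead
openssl x509 -in cert.pem -noout -pubkey | openssl sha256
openssl pkey -in key.pem -pubout | openssl sha256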
# Check a remote server's certificate
openssl s_client -connect app.example.com:443 -servername app.example.com </dev/null 2>/dev/null | \
openssl x509 -noout -text
# Generate a self-signed certificate (for testing only)
openssl req -x509 -newkey rsa:2048 -nodes -keyout test.key -out test.crt \
-days 365 -subj "/CN=test.local" \
-addext "subjectAltName=DNS:test.local,DNS:localhost,IP:127.0.0.1"
cert-manager in Kubernetes¶
cert-manager automates certificate lifecycle in Kubernetes: requesting, issuing, storing, and renewing certificates.
Architecture¶
ClusterIssuer/Issuer → defines how to obtain certificates (ACME, internal CA, Vault, etc.)
|
Certificate CR → declares what certificate you want (DNS names, duration, issuer)
|
cert-manager controller → watches Certificate CRs, interacts with CA, creates Secrets
|
kubernetes.io/tls Secret → stores the issued cert (tls.crt) and private key (tls.key)
|
Ingress / Pod → references the Secret for TLS termination
Issuers¶
# Let's Encrypt production (HTTP-01 challenge)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
    - http01:
        ingress:
          class: nginx
---
# Let's Encrypt with DNS-01 (for wildcards)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-dns-key
    solvers:
    - dns01:
        route53:
          region: us-east-1
          hostedZoneID: Z123456
Requesting a Certificate¶
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app-tls
  namespace: production
spec:
  secretName: app-tls-secret
  duration: 2160h    # 90 days
  renewBefore: 720h  # Renew 30 days before expiry
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - app.example.com
  - api.example.com
  privateKey:
    algorithm: ECDSA
    size: 256
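The interaction between `duration` and `renewBefore` is simple date arithmetic: renewal is due at the certificate's expiry minus `renewBefore`. The sketch below works this through for the values in the manifest. It assumes the issuer honors the requested duration; an ACME CA decides the actual lifetime, so treat the dates as illustrative. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before: datetime, duration_h: int, renew_before_h: int) -> datetime:
    """Return when renewal is due: notAfter minus renewBefore."""
    not_after = not_before + timedelta(hours=duration_h)
    return not_after - timedelta(hours=renew_before_h)


if __name__ == "__main__":
    issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
    # duration: 2160h (90 days), renewBefore: 720h (30 days)
    print(renewal_time(issued, duration_h=2160, renew_before_h=720))
```

For a certificate issued on 2026-01-01, expiry falls on 2026-04-01 and renewal is due on 2026-03-02, which leaves a 30-day window to notice and fix a failed renewal.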
Ingress Annotation Shortcut¶
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app
            port:
              number: 8000
cert-manager detects the annotation, creates a Certificate resource automatically, issues the cert, stores it in the Secret, and renews it before expiry.
Certificate Transparency Logs¶
Certificate Transparency (CT) is a public audit log of all certificates issued by participating CAs. When a CA issues a certificate, it submits it to multiple CT logs. This allows domain owners to detect misissued certificates.
# Check CT logs for your domain
# Use crt.sh (web interface or API):
curl -s "https://crt.sh/?q=%25.example.com&output=json" | \
jq '.[] | {id, common_name, not_before, not_after, issuer_name}'
# Monitor for unexpected certificates
# Set up alerts via crt.sh, Certspotter, or Facebook's CT monitoring tool
HSTS (HTTP Strict Transport Security)¶
HSTS tells browsers to only connect via HTTPS. Once a browser receives the HSTS header, it will refuse HTTP connections to that domain for the specified max-age.
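A typical header with all three directives: Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
- max-age=31536000: Browser remembers for 1 year
- includeSubDomains: Applies to all subdomains
- preload: Domain can be submitted to browser preload lists (hardcoded HTTPS-only)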
Gotcha: HSTS preloading is a one-way door. Once your domain is in Chrome's preload list (hardcoded into the browser binary, shared by all major browsers), removal takes months. Test with short max-age values first.
Warning: HSTS with preload is very difficult to undo. If you later need to serve HTTP (during a cert outage, for example), preloaded HSTS prevents it. Start with a short max-age (300 seconds) and increase gradually.
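When auditing what a site actually sends, it helps to turn the header value into structured data. The sketch below parses a Strict-Transport-Security value into its three directives. The function name and the returned dictionary keys are hypothetical, and unknown directives are silently ignored.

```python
from typing import Dict, Union


def parse_hsts(header_value: str) -> Dict[str, Union[int, bool]]:
    """Parse a Strict-Transport-Security header value into its directives."""
    policy: Dict[str, Union[int, bool]] = {
        "max_age": 0,
        "include_subdomains": False,
        "preload": False,
    }
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


if __name__ == "__main__":
    print(parse_hsts("max-age=300"))
    print(parse_hsts("max-age=31536000; includeSubDomains; preload"))
```

The first call corresponds to the cautious starting point recommended above: a five-minute policy that expires quickly if something goes wrong.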
OCSP Stapling¶
OCSP (Online Certificate Status Protocol) lets clients check whether a certificate has been revoked. Without stapling, the client must contact the CA's OCSP responder itself whenever it has no cached response, adding latency and creating a privacy concern (the CA sees which sites you visit).
With OCSP stapling, the server periodically fetches the OCSP response from the CA and includes it (staples it) in the TLS handshake. The client gets revocation status without contacting the CA.
# Nginx OCSP stapling configuration
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/nginx/chain.pem; # Intermediate + root
resolver 8.8.8.8 1.1.1.1 valid=300s;
resolver_timeout 5s;
# Verify OCSP stapling is working
openssl s_client -connect app.example.com:443 -servername app.example.com -status </dev/null 2>/dev/null | \
grep -A 5 "OCSP Response"
# Should show "OCSP Response Status: successful"
mTLS (Mutual TLS)¶
Standard TLS is one-way: the client verifies the server's certificate. mTLS is two-way: both sides present and verify certificates. This is used for service-to-service authentication.
Standard TLS:
Client verifies server cert → Server is who it claims to be
Server does NOT verify client → Any client can connect
mTLS:
Client verifies server cert → Server is who it claims to be
Server verifies client cert → Client is who it claims to be
Both sides authenticated
Use Cases¶
- Service mesh (Istio, Linkerd) — automatic mTLS between pods
- API authentication — client certificates instead of API keys
- Zero-trust networking — every connection is authenticated
- Database connections — PostgreSQL, MySQL support client certs
# Test mTLS connection
openssl s_client -connect api.example.com:443 \
-cert client.crt -key client.key -CAfile ca.crt
# curl with client cert
curl --cert client.crt --key client.key --cacert ca.crt \
https://api.example.com/resource
Certificate Pinning¶
Certificate pinning hardcodes the expected certificate (or public key) in the client. If the server presents a different certificate — even one signed by a trusted CA — the client rejects it. This prevents MITM attacks using fraudulently issued certificates.
Warning: Certificate pinning is operationally dangerous. If you need to rotate the pinned certificate and forget to update the pinned value in all clients, those clients permanently lose connectivity. Mobile apps with pinning require an app store update to fix.
Modern recommendation: use Certificate Transparency monitoring instead of pinning. If you must pin, pin the public key (not the certificate) and include backup pins.
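If you do pin, the pin is usually the Base64-encoded SHA-256 digest of the certificate's SubjectPublicKeyInfo (SPKI), which stays the same when a certificate is reissued with the same key. The sketch below computes and checks such pins. It assumes the SPKI is already available as DER bytes; the function names are hypothetical and the sample inputs are placeholders, not real keys.

```python
import base64
import hashlib
from typing import Iterable


def spki_pin(spki_der: bytes) -> str:
    """Return base64(SHA-256(SubjectPublicKeyInfo DER))."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def pin_matches(spki_der: bytes, pinned: Iterable[str]) -> bool:
    """Accept the key if its pin equals the primary pin or any backup pin."""
    return spki_pin(spki_der) in set(pinned)


if __name__ == "__main__":
    primary_key = b"placeholder-primary-spki"  # stand-in bytes, not a real key
    backup_key = b"placeholder-backup-spki"    # stand-in bytes, not a real key
    pins = [spki_pin(primary_key), spki_pin(backup_key)]
    print(pin_matches(backup_key, pins))               # rotation to the backup key works
    print(pin_matches(b"placeholder-unknown", pins))   # any other key is rejected
```

Shipping the backup pin before it is needed is what makes key rotation possible without an emergency client update.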
Wildcard Certificates¶
A wildcard certificate (*.example.com) covers any single-level subdomain: app.example.com, api.example.com, www.example.com.
It does NOT cover:
- The apex domain (example.com — requires a separate SAN entry)
- Multi-level subdomains (staging.app.example.com — needs *.app.example.com)
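The wildcard rules above can be sketched as a label-by-label comparison. This is a simplified matcher for illustration (a leading `*` label only, no partial-label wildcards, no internationalized names), and the function name is hypothetical.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Match a hostname against an exact name or a leading-label wildcard."""
    p_labels = pattern.lower().rstrip(".").split(".")
    h_labels = hostname.lower().rstrip(".").split(".")
    if len(p_labels) != len(h_labels):
        return False  # "*" stands for exactly one label, never zero or several
    if p_labels[0] == "*":
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


if __name__ == "__main__":
    for host in ("app.example.com", "example.com", "staging.app.example.com"):
        print(host, wildcard_matches("*.example.com", host))
```

Only the first hostname matches `*.example.com`; the apex and the two-level subdomain both fail the label-count check.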
# Request a wildcard cert with Let's Encrypt (DNS-01 required)
certbot certonly --dns-route53 \
-d "*.example.com" \
-d "example.com"
Including both *.example.com and example.com in the same certificate is best practice for wildcard certs.