HTTP Protocol Footguns

Misconfigurations and misunderstandings that cause outages, security holes, and debugging nightmares at the HTTP layer.


1. Confusing no-cache with no-store

You set Cache-Control: no-cache on your API responses, thinking this prevents caching. It does not. no-cache means "cache the response but revalidate with the server before using it." Proxies, CDNs, and browsers still store the response. Sensitive data (user profiles, authentication tokens) sits in cache storage on shared infrastructure.

Fix: Use Cache-Control: no-store to prevent caching entirely. Use no-cache when you want caching but with mandatory revalidation. For sensitive data: Cache-Control: no-store, private (once no-store is present, the commonly added no-cache and must-revalidate directives are redundant, though harmless for legacy caches). For immutable assets with hashed URLs: Cache-Control: public, max-age=31536000, immutable.
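As a sketch, the decision can be encoded in a small framework-agnostic helper (the classification names here are invented for illustration; the header values are the ones above):

```python
def cache_headers(kind):
    """Pick a Cache-Control value per response class (illustrative sketch)."""
    if kind == "sensitive":          # user data, tokens: never store anywhere
        return {"Cache-Control": "no-store"}
    if kind == "revalidate":         # cacheable, but must check freshness first
        return {"Cache-Control": "no-cache"}
    if kind == "immutable-asset":    # content-hashed URLs: cache for a year
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Default to the safe-but-cacheable option
    return {"Cache-Control": "no-cache"}
```

Centralizing the choice in one helper makes it auditable, instead of scattering ad-hoc header strings across handlers.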


2. Automatically Retrying Non-Idempotent Requests

Your load balancer or service mesh is configured to retry failed requests. A POST request to /api/orders times out. The LB retries it. The first request actually succeeded (slowly) and the retry also succeeds. The customer is charged twice, two orders are created. POST is not idempotent — repeating it produces a different result.

Fix: Only auto-retry idempotent methods (GET, PUT, DELETE, HEAD, OPTIONS). Never auto-retry POST or PATCH. nginx gets this right by default: since 1.9.13, proxy_next_upstream does not pass requests with non-idempotent methods (POST, LOCK, PATCH) to the next server unless you explicitly add the non_idempotent flag, so avoid that flag. In Envoy/Istio, set retry policies to exclude non-idempotent methods. For POST endpoints that must be retryable, implement idempotency keys on the server side.
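The client-side method check can be sketched as follows (the `send` callable stands in for the real HTTP call and is not a specific library API; the idempotency-key helper is illustrative):

```python
import uuid

# Methods that are safe to retry automatically (idempotent per RFC 9110).
IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}

def send_with_retry(method, send, max_attempts=3):
    """Call `send()` and retry on connection failure only if the
    method is idempotent; non-idempotent methods get one attempt."""
    attempts = max_attempts if method.upper() in IDEMPOTENT else 1
    last_err = None
    for _ in range(attempts):
        try:
            return send()
        except ConnectionError as err:
            last_err = err
    raise last_err

def make_idempotency_key():
    # For POSTs that must be retryable, attach a client-generated key
    # (e.g. in an Idempotency-Key header) so the server can deduplicate.
    return str(uuid.uuid4())
```

With a key, a retried POST that already succeeded is recognized by the server and the original result is returned instead of a duplicate order.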


3. Trusting X-Forwarded-For from Untrusted Sources

Your rate limiter reads the client IP from the X-Forwarded-For header. An attacker sets X-Forwarded-For: 1.2.3.4 in their request, bypassing rate limits because the rate limiter throttles 1.2.3.4 instead of the attacker's real IP. Or worse, they set it to an admin IP to bypass IP-based access controls.

Fix: Only trust X-Forwarded-For from known proxy IPs. In nginx, use set_real_ip_from to whitelist trusted proxy ranges and real_ip_header X-Forwarded-For to extract the correct IP. In application code, walk the header from the right, skip your own trusted proxies, and take the first address you do not control; never use the leftmost entry, which the client can set to anything. AWS ALB appends the real client IP reliably, but only if your security groups prevent direct access to the backend.
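The rightmost-untrusted rule can be sketched as a small helper (hypothetical function; adapt to your framework's request object):

```python
def client_ip(remote_addr, xff_header, trusted_proxies):
    """Return the real client IP: walk the forwarding chain right to
    left, skipping our own trusted proxies, and take the first
    (rightmost) address we do not control."""
    hops = [h.strip() for h in xff_header.split(",") if h.strip()] if xff_header else []
    chain = hops + [remote_addr]   # rightmost entry = the peer that connected to us
    for ip in reversed(chain):
        if ip not in trusted_proxies:
            return ip
    return remote_addr             # every hop was one of our proxies
```

An attacker-supplied leftmost entry is simply ignored: it never becomes the rightmost untrusted hop unless the attacker also controls one of your proxies.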


4. Keepalive Timeout Mismatch Causing Sporadic 502s

Your nginx proxy has keepalive_timeout 60s for upstream connections. Your backend (gunicorn, uvicorn, Node.js) has keep-alive: 30s. After 30 seconds of idle, the backend closes the connection. Nginx does not know this and sends the next request down the now-closed socket. The connection is reset. Nginx returns 502 to the client. This happens intermittently under moderate load — just often enough to be infuriating.

Fix: The backend's keepalive timeout must be longer than the proxy's. If nginx keeps connections for 60s, set the backend to 65s or more. In gunicorn: --keep-alive 65. In uvicorn: --timeout-keep-alive 65. In Node.js: server.keepAliveTimeout = 65000. The proxy should always close idle connections before the backend does.

Debug clue: These sporadic 502s have a signature: they happen only on the first request sent on a reused connection after an idle period. If you see 502 errors that cluster around periods of low traffic (nights, weekends) rather than high traffic, this mismatch is almost certainly the cause. Check nginx error logs for upstream prematurely closed connection while reading response header — this confirms the backend closed the connection before nginx expected.
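On the nginx side, upstream keepalive must be enabled explicitly; a minimal sketch, assuming a backend listening on 127.0.0.1:8000:

```nginx
upstream app {
    server 127.0.0.1:8000;
    keepalive 16;               # idle upstream connections kept per worker
    keepalive_timeout 60s;      # close idle upstream conns after 60s (nginx >= 1.15.3)
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
        proxy_http_version 1.1;          # upstream keepalive requires HTTP/1.1
        proxy_set_header Connection "";  # clear the default "Connection: close"
    }
}
```

With this in place, the backend timeout (65s in the examples above) safely outlives nginx's 60s upstream idle timeout.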


5. Missing Intermediate Certificate in TLS Chain

Your SSL certificate works in Chrome on your laptop (Chrome has cached the intermediate cert from a previous visit). It fails on curl, on mobile browsers, and on API clients with: unable to verify the first certificate. The server is sending only the leaf certificate without the intermediate CA certificate.

Fix: Configure your web server to send the full chain (leaf + intermediate, not root):

# Verify chain completeness
openssl s_client -connect api.example.com:443 \
  -servername api.example.com </dev/null 2>&1 \
  | grep "Verify return code"
# "Verify return code: 21" = missing intermediate

# In nginx:
# ssl_certificate should contain leaf + intermediate concatenated:
# cat leaf.crt intermediate.crt > fullchain.crt
# ssl_certificate /etc/nginx/ssl/fullchain.crt;

Test with curl -v (strict validation) and openssl s_client, not just a browser.


6. Redirecting POST to GET with 301/302

Your API returns a 301 Moved Permanently redirect for a URL change. HTTP clients that follow the redirect change the method from POST to GET (this is allowed by the spec for 301/302). The POST body is dropped. The redirected request fails because the server expects a POST with a body.

Fix: Use 307 Temporary Redirect or 308 Permanent Redirect — these preserve the HTTP method. 307 is the method-preserving version of 302, and 308 is the method-preserving version of 301. For API redirects, always use 307/308. Reserve 301/302 for browser navigation where method change is acceptable.
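The method rewrite is easy to demonstrate with Python's standard library, whose urllib follows a 301 from a POST by reissuing it as a GET with the body dropped (throwaway local server; the /old and /new paths are illustrative):

```python
import http.server
import threading
import urllib.request

seen = []  # (method, path) pairs as the server receives them

class Handler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        seen.append(("POST", self.path))
        # 301 tells clients the resource moved; most rewrite POST to GET
        self.send_response(301)
        self.send_header("Location", "/new")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        seen.append(("GET", self.path))
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

srv = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
port = srv.server_address[1]

urllib.request.urlopen(f"http://127.0.0.1:{port}/old", data=b"x=1")
print(seen)  # [('POST', '/old'), ('GET', '/new')]
srv.shutdown()
```

The redirected request arrives as a GET with no body; with a 307 or 308 Location response instead, a conforming client would repeat the POST with its body intact.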


7. Not Handling 429 Rate Limits with Retry-After

Your automated script calls an API in a tight loop. It hits the rate limit and receives 429 responses with a Retry-After: 60 header. The script ignores the header and immediately retries, generating thousands of 429 responses per second. The API provider blocks your IP entirely. Your integration is down for 24 hours while the block is removed.

Fix: Always check for 429 responses and honor the Retry-After header. Implement exponential backoff:

# Retry-After can be seconds or a date:
# Retry-After: 60
# Retry-After: Wed, 18 Mar 2026 15:00:00 GMT

Set a maximum retry count (e.g., 5 attempts). Log rate limit events as warnings. Pre-emptively check X-RateLimit-Remaining headers and slow down before hitting the limit.
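A sketch of honoring both Retry-After forms plus capped exponential backoff, using only the standard library (parse_retry_after and backoff_delay are hypothetical helper names):

```python
import datetime
import email.utils

def parse_retry_after(value, now=None):
    """Return seconds to wait from a Retry-After value, which may be
    delta-seconds ("60") or an HTTP-date (RFC 9110)."""
    try:
        return max(0.0, float(value))
    except ValueError:
        when = email.utils.parsedate_to_datetime(value)
        now = now or datetime.datetime.now(datetime.timezone.utc)
        return max(0.0, (when - now).total_seconds())

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff, capped: 1s, 2s, 4s, ... up to `cap`."""
    return min(cap, base * 2 ** attempt)
```

In the retry loop, sleep for max(parse_retry_after(header), backoff_delay(attempt)) and give up after the maximum attempt count.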


8. CORS Misconfiguration: Reflecting Origin or Using Wildcard with Credentials

Your API sets Access-Control-Allow-Origin to whatever the request's Origin header contains (reflecting it). An attacker's site at evil.com can now make authenticated cross-origin requests to your API, stealing user data. Alternatively, you set Access-Control-Allow-Origin: * with Access-Control-Allow-Credentials: true — browsers reject this combination, breaking your legitimate frontend.

Fix: Whitelist specific allowed origins. Never reflect arbitrary origins. Never use * with credentials. Maintain an explicit list:

# Nginx: whitelist approach
map $http_origin $cors_origin {
    "https://dashboard.example.com" $http_origin;
    "https://admin.example.com"     $http_origin;
    default                          "";
}
add_header Access-Control-Allow-Origin $cors_origin;
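The same allowlist logic in application code (the origins are the illustrative ones from the nginx example above):

```python
# Explicit allowlist: never reflect arbitrary origins, never "*" with credentials.
ALLOWED_ORIGINS = {"https://dashboard.example.com", "https://admin.example.com"}

def cors_headers(origin):
    """Echo the Origin back only if it is on the allowlist."""
    if origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
            "Vary": "Origin",  # caches must not reuse the response across origins
        }
    return {}  # no CORS headers: the browser blocks the cross-origin read
```

The Vary: Origin header matters whenever the response differs by origin; without it, a shared cache can serve one origin's CORS headers to another.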

9. Sending 204 No Content with a Body

Your API returns 204 No Content after a successful DELETE, but the response handler accidentally includes a JSON body ({"status": "deleted"}). This works in most clients. Then a strict HTTP parser in a reverse proxy or API gateway strips the body (per HTTP spec, 204 must not have a body). Downstream clients that relied on the body break silently — they get a 204 with no data.

Fix: If you need to return data, use 200 OK with a body. Use 204 No Content only when there is genuinely nothing to return. Audit API responses: curl -s -o /dev/null -w "%{size_download}" -X DELETE https://api/resource/123 — if the download size is > 0 on a 204, fix it.
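A cheap guard in application code catches the mistake before it ships (hypothetical helper; HTTP forbids bodies on 204, 304, and 1xx responses):

```python
def build_response(status, body=None):
    """Raise if a status code that must not carry a body is given one."""
    if body is not None and (status in (204, 304) or 100 <= status < 200):
        raise ValueError(f"status {status} must not include a body")
    return status, body
```

Routing all handler returns through a guard like this turns a silent protocol violation into an immediate, testable error.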


10. Forgetting SNI When Testing with curl

You test HTTPS against a server IP address: curl https://10.0.1.50/health. The server hosts multiple domains via SNI (Server Name Indication). Without a hostname, the TLS handshake does not include SNI, and the server returns the default certificate — which does not match your expected domain. Curl reports a certificate mismatch error. The service is actually fine.

Fix: Always include the hostname when testing HTTPS:

# Correct: let DNS resolve and include hostname
curl https://api.example.com/health

# If you must test against a specific IP:
curl --resolve api.example.com:443:10.0.1.50 https://api.example.com/health
# Or, equivalently (keeping the URL hostname so both SNI and Host are correct):
curl --connect-to api.example.com:443:10.0.1.50:443 \
  https://api.example.com/health

11. Setting Proxy Timeouts Too Low for Long-Running Endpoints

Your nginx proxy_read_timeout is set to 30 seconds (a reasonable default). A report generation endpoint takes 2 minutes. Every report request returns 504. You increase the global timeout to 300 seconds. Now slow backend bugs that should be caught early are masked by the generous timeout, and slow requests tie up proxy workers for 5 minutes each.

Fix: Set timeouts per-endpoint, not globally:

# Global default: aggressive
proxy_read_timeout 30s;

# Override for known slow endpoints
location /api/reports {
    proxy_read_timeout 300s;
}

Better yet, refactor long-running operations to use async patterns: return 202 Accepted with a job ID, let the client poll for results.
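The 202-and-poll pattern can be sketched as follows (in-memory job store for illustration only; a real service needs a durable queue and store, and the function names are invented):

```python
import uuid

jobs = {}  # job_id -> {"status": ..., "result": ...}

def submit_report(params):
    """Accept the work and return immediately with 202 and a poll URL."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}
    # ...enqueue the actual report generation for a background worker...
    return 202, {"job_id": job_id, "poll": f"/api/jobs/{job_id}"}

def poll_job(job_id):
    """Client polls here; returns the result once the worker finishes."""
    job = jobs.get(job_id)
    if job is None:
        return 404, None
    if job["status"] == "done":
        return 200, job["result"]
    return 200, {"status": job["status"]}  # still running

def complete_job(job_id, result):
    """Called by the background worker when the report is ready."""
    jobs[job_id] = {"status": "done", "result": result}
```

The proxy only ever sees fast requests, so the aggressive 30s global timeout stays in force for everything.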


12. Not Monitoring Certificate Expiry Until It Expires

Your TLS certificate expires at 3 AM on a Saturday. The site goes down. HSTS (Strict-Transport-Security) prevents users from clicking through the browser warning. There is no fallback to HTTP. The on-call engineer scrambles to renew the certificate manually.

Fix: Automate certificate renewal with Let's Encrypt / ACME. Monitor expiry as a first-class metric with alerts at 30, 14, and 7 days:

# Quick check script for monitoring
expiry=$(echo | openssl s_client -connect api.example.com:443 \
  -servername api.example.com 2>/dev/null \
  | openssl x509 -noout -enddate | cut -d= -f2)
days=$(( ($(date -d "$expiry" +%s) - $(date +%s)) / 86400 ))  # GNU date
# BSD/macOS equivalent: date -j -f "%b %d %T %Y %Z" "$expiry" +%s
[ $days -lt 14 ] && echo "CRITICAL: cert expires in $days days"

Add this to Prometheus blackbox exporter or Nagios. Never rely on calendar reminders for certificate renewal.