Nginx & Web Servers - Street-Level Ops¶
Quick Diagnosis Commands¶
When Nginx is misbehaving, start here:
# 1. Is Nginx running?
systemctl status nginx
ps aux | grep nginx
# 2. Test config syntax (ALWAYS do this before reload)
nginx -t
# 3. Check error log (real-time)
tail -f /var/log/nginx/error.log
# 4. Check access log for status codes
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
# 5. Check which ports Nginx is listening on
ss -tlnp | grep nginx
# 6. Show current connections
ss -an | grep :80 | awk '{print $1}' | sort | uniq -c
# 7. Check upstream backend health
curl -I http://localhost/health
# 8. Show compiled-in modules
nginx -V 2>&1 | tr ' ' '\n' | grep module
Pattern: Debugging 502 Bad Gateway¶
502 means Nginx received an invalid response from the upstream, or no usable response at all (connection refused, or the upstream died mid-response). Debug it systematically:
# 1. Is the backend process running?
systemctl status myapp
ss -tlnp | grep 8080
# 2. Can Nginx reach the backend directly?
curl -v http://127.0.0.1:8080/
# 3. Check Nginx error log for the specific error
tail -100 /var/log/nginx/error.log | grep 502
# Common messages:
# "upstream prematurely closed connection"
# "connect() failed (111: Connection refused)"
# "no live upstreams"
# 4. Check backend logs
journalctl -u myapp --since "5 minutes ago"
# 5. Check if SELinux is blocking the proxy connection
# (common on RHEL/CentOS)
getenforce
ausearch -m avc --ts recent | grep nginx
setsebool -P httpd_can_network_connect 1 # if SELinux is the issue
# 6. Check file descriptor limits
cat /proc/$(cat /var/run/nginx.pid)/limits | grep "Max open"
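If the file descriptor limit turns out to be the bottleneck, raising it is a one-line change. A minimal sketch for nginx.conf (the values are illustrative; size them to your traffic):

```nginx
# main (top-level) context of nginx.conf
worker_rlimit_nofile 65535;   # per-worker fd cap; overrides the service ulimit

events {
    worker_connections 8192;  # must stay well below worker_rlimit_nofile
}
```

Each proxied request consumes at least two descriptors (client side plus upstream side), so leave generous headroom between the two values.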
Common 502 Causes¶
Backend is down -> restart backend
Backend is slow (timeout) -> increase proxy_read_timeout
SELinux blocking connections -> setsebool -P httpd_can_network_connect 1
Socket permissions wrong -> check unix socket owner/perms
Backend crashing under load -> check backend logs, memory
Upstream keepalive misconfigured -> ensure proxy_http_version 1.1
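The last cause above deserves a sketch: upstream keepalive only works when the proxied protocol is HTTP/1.1 and the Connection header is cleared. A minimal illustration (the upstream name and address are placeholders):

```nginx
upstream backend {
    server 127.0.0.1:8080;
    keepalive 32;                       # idle keepalive connections per worker
}

server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;         # keepalive requires HTTP/1.1
        proxy_set_header Connection ""; # don't forward "Connection: close"
    }
}
```

Without the last two directives, Nginx speaks HTTP/1.0 to the upstream and closes the connection after every request, which defeats keepalive and can surface as intermittent 502s under load.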
Pattern: Debugging 504 Gateway Timeout¶
504 means Nginx timed out waiting for the upstream to respond.
# Check current timeout settings
grep -r 'proxy_.*timeout' /etc/nginx/
# Increase timeouts (in the relevant location block)
# proxy_connect_timeout 60s; # time to establish connection
# proxy_send_timeout 60s; # time between successive writes
# proxy_read_timeout 300s; # time between successive reads (key one)
If you are increasing proxy_read_timeout past 60 seconds, the real fix is probably making the backend faster — not waiting longer.
Debug clue: A 504 that appears only on some requests (not all) usually means one of several upstream backends is slow. Enable $upstream_response_time in your log format and correlate the 504s with a specific upstream IP -- the slow backend will stand out.
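A sketch of that correlation, assuming you also appended $upstream_addr as the final field of the log format (it is not in the stock combined format, and the sample lines below are fabricated):

```shell
# Build a tiny sample log so the pipeline below has something to chew on.
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [10/Oct/2024:13:55:36 +0000] "GET /a HTTP/1.1" 504 0 "-" "-" 10.0.0.1:8080
1.2.3.4 - - [10/Oct/2024:13:55:37 +0000] "GET /b HTTP/1.1" 200 5 "-" "-" 10.0.0.2:8080
1.2.3.4 - - [10/Oct/2024:13:55:38 +0000] "GET /c HTTP/1.1" 504 0 "-" "-" 10.0.0.1:8080
EOF

# Count 504s per upstream address ($9 = status, $NF = upstream addr here).
awk '$9 == 504 {print $NF}' /tmp/sample_access.log | sort | uniq -c | sort -rn
```

On a real log, point the awk at /var/log/nginx/access.log; the upstream with the disproportionate 504 count is your slow backend.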
Pattern: Reload vs Restart¶
# RELOAD: graceful, zero-downtime
# - Master process reads new config
# - Spawns new workers with new config
# - Old workers finish existing requests, then exit
nginx -s reload
systemctl reload nginx
# RESTART: drops all connections
# - Process stops, then starts fresh
# - Active connections are killed
systemctl restart nginx
# REOPEN: rotate log files without reload
nginx -s reopen
Always prefer reload. The only time you need restart is when changing settings that require a full process restart (rare — listen socket changes, module loading).
Gotcha: The "if" Directive Is Evil¶
if inside a location block creates an implicit nested location, and directives from the parent may not apply:
# BROKEN: may cause unexpected behavior
location / {
    set $redirect 0;
    if ($http_x_forwarded_proto != "https") {
        set $redirect 1;
    }
    if ($redirect = 1) {
        return 301 https://$host$request_uri;
    }
    proxy_pass http://backend;  # content handler may misbehave after the if blocks above
}
# CORRECT: use map + return
# (map belongs in the http context)
map $http_x_forwarded_proto $redirect_to_https {
    default 0;
    "http"  1;
}

server {
    if ($redirect_to_https) {
        return 301 https://$host$request_uri;
    }
    # ...
}
Safe uses of if: return and rewrite inside server context. Anything else — use map or try_files.
Gotcha: Location Matching Order Surprises¶
# Quiz: which location handles /static/image.jpg?
location / {                 # prefix (lowest priority)
    proxy_pass http://backend;
}

location /static/ {          # prefix
    root /var/www;
}

location ~ \.(jpg|png)$ {    # regex
    expires 30d;
}
Answer: the regex ~ \.(jpg|png)$ wins, because regex locations beat prefix locations (unless the prefix uses ^~).
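If you want the /static/ prefix to win anyway, ^~ tells Nginx to skip the regex phase once that prefix matches. A minimal sketch:

```nginx
location ^~ /static/ {   # longest-prefix match AND suppresses regex checks
    root /var/www;
}
```

With ^~ in place, /static/image.jpg is served from /var/www/static/ and the expires regex location never runs for it.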
Gotcha: proxy_pass Trailing Slash¶
# Scenario 1: NO trailing slash
location /app/ {
    proxy_pass http://backend;
}
# /app/page -> backend receives: /app/page

# Scenario 2: WITH trailing slash
location /app/ {
    proxy_pass http://backend/;
}
# /app/page -> backend receives: /page (path stripped!)

# Scenario 3: with a different path
location /app/ {
    proxy_pass http://backend/v2/;
}
# /app/page -> backend receives: /v2/page
A single / completely changes routing behavior. This is the source of countless misrouted requests.
War story: A team spent two days debugging why their API returned HTML error pages. The root cause:
proxy_pass http://backend/ (trailing slash) stripped the /api/v2/ prefix, so the backend received bare paths like /users instead of /api/v2/users and returned its default 404 page. One character.
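The fix in that story amounts to deleting one character. A hypothetical before/after (the upstream name and paths mirror the story):

```nginx
# Before: the trailing slash replaces the /api/v2/ prefix with / upstream
location /api/v2/ {
    proxy_pass http://backend/;   # /api/v2/users -> backend sees /users
}

# After: no URI part in proxy_pass, so the path is forwarded untouched
location /api/v2/ {
    proxy_pass http://backend;    # /api/v2/users -> backend sees /api/v2/users
}
```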
Pattern: Log Analysis¶
# Top 20 URLs by request count
awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -20
# Response codes per hour
awk '{print $4, $9}' /var/log/nginx/access.log | cut -d: -f1-2 | sort | uniq -c
# Slow requests (if using $request_time in log format)
awk '$NF > 2.0 {print $NF, $7}' /var/log/nginx/access.log | sort -rn | head -20
# 5xx errors with full details
awk '$9 ~ /^5/' /var/log/nginx/access.log | tail -50
# Requests per second (rough): $4 is the full [date:HH:MM:SS timestamp
awk '{print $4}' /var/log/nginx/access.log | uniq -c | sort -rn | head -10
# Cache hit ratio (if X-Cache-Status header is logged)
awk '{print $NF}' /var/log/nginx/access.log | sort | uniq -c
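To turn that last count into a single percentage, a sketch that assumes $upstream_cache_status is the final logged field (the sample data below is fabricated):

```shell
# Sample log tail: the last field is the cache status.
cat > /tmp/cache_sample.log <<'EOF'
1.2.3.4 "GET /a" 200 HIT
1.2.3.4 "GET /b" 200 MISS
1.2.3.4 "GET /c" 200 HIT
1.2.3.4 "GET /d" 200 HIT
EOF

# Hit ratio as a percentage of all requests.
awk '$NF == "HIT" {hit++} {total++} END {printf "%.1f%% hit\n", 100*hit/total}' /tmp/cache_sample.log
# -> 75.0% hit
```

Point the same awk at your real access log; a ratio that drops suddenly often means a cache key or Vary header change slipped into a deploy.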
Custom Log Format for Better Analysis¶
log_format detailed '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    '$request_time $upstream_response_time '
                    '$upstream_cache_status';

access_log /var/log/nginx/access.log detailed;
Gotcha: Buffer Size Misconfigs¶
# Default proxy buffers are often too small for apps that set large headers
# (e.g., big cookies, long JWT tokens)
# Symptoms: 502 errors with "upstream sent too big header" in error log
# This is one of the most common Nginx 502 causes in apps that use OAuth/JWT
# Fix:
proxy_buffer_size 16k; # for response headers
proxy_buffers 4 32k; # for response body
proxy_busy_buffers_size 64k; # max size for busy buffers
# For large client request headers:
large_client_header_buffers 4 16k;
# For large client request bodies (file uploads):
client_max_body_size 100m;
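Before tuning, it helps to know how big the offending headers actually are. A sketch that measures a captured header dump (the dump below is fabricated; in practice capture one with curl -s -D headers.txt -o /dev/null against the backend):

```shell
# Fabricated response-header dump standing in for a real capture.
cat > /tmp/headers.txt <<'EOF'
HTTP/1.1 200 OK
Content-Type: application/json
Set-Cookie: session=eyJhbGciOiJIUzI1NiJ9.payload.signature
EOF

# Total header bytes: if this approaches 4k (a common default for
# proxy_buffer_size), you are one big JWT away from a 502.
wc -c < /tmp/headers.txt
```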
Pattern: Rate Limiting in Practice¶
# Define zone in http context
limit_req_zone $binary_remote_addr zone=login:10m rate=5r/s;
# Apply in location
location /login {
    limit_req zone=login burst=10 nodelay;
    limit_req_status 429;
    proxy_pass http://backend;
}
# Test rate limiting
for i in $(seq 1 20); do
    curl -s -o /dev/null -w "%{http_code}\n" http://localhost/login
done
# Should see 200s then 429s
Pattern: Quick SSL Setup with Let's Encrypt¶
# Install certbot
apt install certbot python3-certbot-nginx # Debian/Ubuntu
yum install certbot python3-certbot-nginx # RHEL/CentOS
# Obtain and auto-configure SSL
certbot --nginx -d example.com -d www.example.com
# Test auto-renewal
certbot renew --dry-run
# Manual renewal
certbot renew
# Check certificate expiry
echo | openssl s_client -connect example.com:443 2>/dev/null | openssl x509 -noout -dates
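To turn that notAfter date into "days remaining" for alerting, a sketch assuming GNU date (the sample date is a placeholder for real openssl output):

```shell
# notAfter as printed by: openssl x509 -noout -enddate (placeholder value)
end="Dec 31 23:59:59 2030 GMT"

# Days until expiry; alert when this drops below ~14.
days=$(( ($(date -d "$end" +%s) - $(date +%s)) / 86400 ))
echo "$days days remaining"
```

Wire the two together by feeding the cut output from the expiry check above into the $end variable.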
Gotcha: add_header Does Not Inherit¶
server {
    add_header X-Frame-Options "SAMEORIGIN";

    location /api/ {
        add_header X-Api-Version "v2";
        # X-Frame-Options is NOT set here!
        # add_header in a child block replaces ALL parent add_header directives
        proxy_pass http://backend;
    }
}
If you use add_header in a location block, all add_header directives from parent contexts are dropped. You must repeat them.
# Fix: repeat all headers, or use the headers-more module
location /api/ {
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Api-Version "v2";
    proxy_pass http://backend;
}