NAT - Street-Level Ops¶
Real-world NAT diagnosis and management workflows for production Linux systems.
Task: Set Up Internet Access for a Private Subnet¶
# Enable IP forwarding
$ sysctl -w net.ipv4.ip_forward=1
# Masquerade outbound traffic from 10.0.0.0/24
$ iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE
# Verify from a private host
$ curl -s ifconfig.me
203.0.113.50 # Shows the public IP — NAT is working
# If using a static public IP, prefer SNAT (more efficient):
$ iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 \
-j SNAT --to-source 203.0.113.50
Under the hood:
MASQUERADE looks up the outbound interface's current address for every new connection; SNAT hardcodes it. On a NAT gateway handling thousands of flows, SNAT saves that address lookup on each new connection (NAT targets only see the first packet of a flow; conntrack handles the rest). Use MASQUERADE only when the outbound IP is dynamic (DHCP, PPPoE); it also forgets its conntrack entries when the interface goes down, which is the right behavior for an address that can change.
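The decision rule above can be captured in a tiny helper. A sketch only; `nat_rule` is a hypothetical function, shown to make the static-vs-dynamic branch explicit:

```shell
# Hypothetical helper: print the right POSTROUTING rule for a subnet.
# With a static public IP, prefer SNAT; otherwise fall back to MASQUERADE.
nat_rule() {
  subnet="$1"; iface="$2"; pubip="${3:-}"
  if [ -n "$pubip" ]; then
    echo "iptables -t nat -A POSTROUTING -s $subnet -o $iface -j SNAT --to-source $pubip"
  else
    echo "iptables -t nat -A POSTROUTING -s $subnet -o $iface -j MASQUERADE"
  fi
}

nat_rule 10.0.0.0/24 eth0 203.0.113.50   # static IP -> SNAT rule
nat_rule 10.0.0.0/24 ppp0                # dynamic link -> MASQUERADE rule
```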
Task: Port Forward External Traffic to Internal Server¶
# Forward port 8080 on public IP to internal web server 10.0.0.5:80
$ iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 8080 \
-j DNAT --to-destination 10.0.0.5:80
# Must also allow in FORWARD chain (and return traffic, if the policy is DROP)
$ iptables -A FORWARD -p tcp -d 10.0.0.5 --dport 80 -j ACCEPT
$ iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# Enable forwarding if not already
$ sysctl -w net.ipv4.ip_forward=1
# Verify from outside
$ curl http://203.0.113.50:8080
# Should hit 10.0.0.5:80
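One gotcha the steps above don't cover: clients on the internal subnet cannot reach the service via the public IP unless hairpin NAT (NAT reflection) is also configured. A sketch under the assumption that the gateway's internal address is 10.0.0.1; no test output shown since these are firewall configuration commands:

```shell
# Hairpin NAT: let 10.0.0.0/24 clients hit 203.0.113.50:8080.
# DNAT alone is not enough: the backend would reply directly to the
# LAN client, which rejects packets from an unexpected source.
iptables -t nat -A PREROUTING -s 10.0.0.0/24 -d 203.0.113.50 \
  -p tcp --dport 8080 -j DNAT --to-destination 10.0.0.5:80
# SNAT the hairpinned flow so replies return through the gateway
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 10.0.0.5 \
  -p tcp --dport 80 -j SNAT --to-source 10.0.0.1
```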
Task: Diagnose Conntrack Table Exhaustion¶
# New connections failing intermittently
$ dmesg | tail -20
[94521.789] nf_conntrack: table full, dropping packet
[94522.012] nf_conntrack: table full, dropping packet
# Check usage
$ cat /proc/sys/net/netfilter/nf_conntrack_count
65530
$ cat /proc/sys/net/netfilter/nf_conntrack_max
65536
# Almost full. Immediate fix:
$ sysctl -w net.netfilter.nf_conntrack_max=262144
# Reduce stale connection timeouts
$ sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
$ sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=600
# See what is consuming entries
$ conntrack -L -p tcp --dport 80 | wc -l
42891
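To go beyond a raw count and see which internal hosts own the entries, the listing can be grouped by original-direction source. A sketch with sample conntrack -L lines inlined so it is self-contained; on the gateway, pipe the real `conntrack -L -p tcp` output into the same awk instead of the heredoc:

```shell
# Count conntrack entries per internal source. Each entry carries two
# tuples; "break" keeps only the original-direction src= field.
awk '{ for (i = 1; i <= NF; i++)
         if ($i ~ /^src=10\./) { count[$i]++; break }
     } END { for (s in count) print count[s], s }' <<'EOF' | sort -rn
tcp 6 431999 ESTABLISHED src=10.0.0.5 dst=8.8.8.8 sport=51000 dport=443 src=8.8.8.8 dst=203.0.113.50 sport=443 dport=51000
tcp 6 30 TIME_WAIT src=10.0.0.5 dst=1.1.1.1 sport=51001 dport=80 src=1.1.1.1 dst=203.0.113.50 sport=80 dport=51001
tcp 6 120 SYN_SENT src=10.0.0.9 dst=8.8.8.8 sport=51002 dport=443 src=8.8.8.8 dst=203.0.113.50 sport=443 dport=51002
EOF
# prints: 2 src=10.0.0.5
#         1 src=10.0.0.9
```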
# Persist
$ cat > /etc/sysctl.d/99-conntrack.conf <<'EOF'
net.netfilter.nf_conntrack_max = 262144
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
net.netfilter.nf_conntrack_tcp_timeout_established = 600
EOF
Scale note: Each conntrack entry consumes approximately 300 bytes of kernel memory. Bumping nf_conntrack_max to 262144 costs ~75 MB of RAM. At 1 million entries, budget ~300 MB. On a busy NAT gateway, monitor nf_conntrack_count as a Prometheus metric and alert at 80% of max.
Remember: Conntrack mnemonic: C-M-T — Count (how many entries), Max (the ceiling), Timeouts (how fast stale entries expire). Tune all three together.
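The RAM figures follow directly from the per-entry approximation (~300 bytes is a rule of thumb; actual size varies by kernel version and enabled conntrack extensions). A quick sanity check in shell arithmetic:

```shell
# Rough conntrack memory budget: entries * ~300 bytes per entry
max=262144
echo "$(( max * 300 / 1024 / 1024 )) MB"   # prints: 75 MB
```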
Task: View and Debug Active NAT Translations¶
# List all NAT rules with packet counts
$ iptables -t nat -L -n -v --line-numbers
Chain PREROUTING (policy ACCEPT 0 packets)
num pkts bytes target prot opt in out source destination
1 8423 505K DNAT tcp -- eth0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 to:10.0.0.5:80
Chain POSTROUTING (policy ACCEPT 0 packets)
num pkts bytes target prot opt in out source destination
1 125K 7.5M MASQUERADE all -- * eth0 10.0.0.0/24 0.0.0.0/0
# Watch translations in real-time
$ conntrack -E
[NEW] tcp 6 120 SYN_SENT src=10.0.0.5 dst=8.8.8.8 sport=45678 dport=443 [UNREPLIED]
[UPDATE] tcp 6 60 SYN_RECV src=10.0.0.5 dst=8.8.8.8 sport=45678 dport=443
[UPDATE] tcp 6 432000 ESTABLISHED src=10.0.0.5 dst=8.8.8.8 sport=45678 dport=443
# Check a specific connection
$ conntrack -L -s 10.0.0.5 -p tcp --dport 443
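The large number in the ESTABLISHED event above is the entry's remaining lifetime in seconds; 432000 is the kernel default for net.netfilter.nf_conntrack_tcp_timeout_established:

```shell
# 432000 s until the ESTABLISHED entry expires, i.e. the default
# nf_conntrack_tcp_timeout_established, expressed in days:
echo "$(( 432000 / 86400 )) days"   # prints: 5 days
```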
Task: Debug Docker NAT Rules¶
# See Docker's NAT configuration
$ iptables -t nat -L DOCKER -n -v
Chain DOCKER (2 references)
pkts bytes target prot opt in out source destination
1234 74K DNAT tcp -- !docker0 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 to:172.17.0.2:80
# See masquerade for outbound container traffic
$ iptables -t nat -L POSTROUTING -n -v | grep MASQ
45K 2.7M MASQUERADE all -- * !docker0 172.17.0.0/16 0.0.0.0/0
# Verify container can reach external services
$ docker exec mycontainer curl -s ifconfig.me
203.0.113.50
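To summarize all of Docker's published ports at once, the DOCKER chain can be parsed from iptables-save output. Sample lines are inlined here so the snippet runs anywhere; on a live host, pipe `iptables-save -t nat` into the same awk:

```shell
# Render Docker DNAT rules as "host-port -> container:port"
awk '/-A DOCKER/ && /--to-destination/ {
       for (i = 1; i <= NF; i++) {
         if ($i == "--dport")          port = $(i + 1)
         if ($i == "--to-destination") dest = $(i + 1)
       }
       print port, "->", dest
     }' <<'EOF'
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 8080 -j DNAT --to-destination 172.17.0.2:80
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 5432 -j DNAT --to-destination 172.17.0.3:5432
EOF
# prints: 8080 -> 172.17.0.2:80
#         5432 -> 172.17.0.3:5432
```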
Task: Set Up SNAT with Multiple Public IPs¶
# Single public IP running out of source ports (~64K per destination host:port)
# Add more public IPs for outbound NAT
$ iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 \
-j SNAT --to-source 203.0.113.1-203.0.113.4
# Kernel spreads new connections across the IP range (per-connection hash)
# Effective port space: 4 * 64K = ~256K concurrent connections
# Verify distribution (the chosen SNAT address shows up as the
# reply-direction dst= in each conntrack entry)
$ conntrack -L -p tcp | grep -oE 'dst=203\.0\.113\.[0-9]+' | sort | uniq -c | sort -rn
16234 dst=203.0.113.1
16102 dst=203.0.113.3
15987 dst=203.0.113.2
15891 dst=203.0.113.4
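If clients must keep a stable public address across connections (some services reject a source IP that changes mid-session), SNAT's --persistent option, documented in iptables-extensions(8), pins each client to the same address in the range. A configuration sketch, no test output:

```shell
# --persistent: give each internal client a stable address from the
# SNAT range instead of hashing every connection independently
iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 \
  -j SNAT --to-source 203.0.113.1-203.0.113.4 --persistent
```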
Task: Flush Conntrack for a Specific Host¶
# Need to force re-NAT after changing a DNAT rule
# Delete conntrack entries for the old destination
$ conntrack -D -d 10.0.0.5
conntrack v1.4.6 (conntrack-tools): 47 flow entries have been deleted
# New connections will use the updated NAT rule
# Existing connections were using cached translations
# Flush ALL conntrack (use with caution on busy hosts)
$ conntrack -F
conntrack v1.4.6 (conntrack-tools): connection tracking table has been emptied
Gotcha: Flushing conntrack drops ALL tracked connections — every NATted TCP session breaks on its next packet, because the kernel no longer knows which internal host to forward replies to, so peers see RSTs or timeouts. On a busy NAT gateway this causes a thundering herd of reconnections. Always use targeted deletes (conntrack -D -d <ip>) instead of a full flush.
Task: Set Up nftables NAT¶
# Modern replacement for iptables NAT
$ cat > /etc/nftables-nat.conf <<'EOF'
table ip nat {
chain prerouting {
type nat hook prerouting priority -100;
tcp dport 8080 dnat to 10.0.0.5:80
}
chain postrouting {
type nat hook postrouting priority 100;
oifname "eth0" masquerade
}
}
EOF
$ nft -f /etc/nftables-nat.conf
$ nft list table ip nat
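When migrating an existing iptables NAT ruleset, iptables-translate (shipped with the iptables-nft package; this assumes it is installed) prints the nft equivalent of a legacy rule without applying anything, which makes it safe to run on a live gateway:

```shell
# Translate the legacy rules from the earlier tasks into nft syntax
iptables-translate -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE
iptables-translate -t nat -A PREROUTING -p tcp --dport 8080 \
  -j DNAT --to-destination 10.0.0.5:80
```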
Task: Preserve Source IP Through NAT¶
# Backend servers see the NAT gateway's IP, not the real client
# Option 1: Use PROXY protocol (HAProxy, nginx)
# server config: proxy_protocol on
# Option 2: Use X-Forwarded-For header (HTTP only)
# nginx: proxy_set_header X-Forwarded-For $remote_addr;
# Option 3: Use DSR (Direct Server Return) — no SNAT
# Client -> LB (DNAT only) -> Backend -> Client (directly)
# Backend must have a route back to client without going through LB
# Check what source IP the backend sees
$ conntrack -L -d 10.0.0.5 --dport 80 | head -5
tcp 6 ESTABLISHED src=10.0.0.1 dst=10.0.0.5 sport=45678 dport=80
# src=10.0.0.1 is the NAT gateway, not the real client
Debug clue: If backend logs show the NAT gateway IP for every request instead of real clients, check whether PROXY protocol or X-Forwarded-For is configured. For non-HTTP protocols, PROXY protocol is the only option — it prepends a one-line header with the original client IP before the actual data stream.
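For reference, that PROXY protocol v1 header is a single ASCII line terminated by CRLF; here is its shape with example addresses (198.51.100.7 stands in for a real client):

```shell
# PROXY protocol v1: PROXY <TCP4|TCP6> <client-ip> <proxy-ip> \
#                          <client-port> <proxy-port>\r\n
printf 'PROXY TCP4 %s %s %s %s\r\n' 198.51.100.7 203.0.113.50 56324 443
```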