
NAT - Street-Level Ops

Real-world NAT diagnosis and management workflows for production Linux systems.

Task: Set Up Internet Access for a Private Subnet

# Enable IP forwarding
$ sysctl -w net.ipv4.ip_forward=1

# Masquerade outbound traffic from 10.0.0.0/24
$ iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE

# Verify from a private host
$ curl -s ifconfig.me
203.0.113.50   # Shows the public IP — NAT is working

# If using a static public IP, prefer SNAT (more efficient):
$ iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 \
    -j SNAT --to-source 203.0.113.50

Under the hood: MASQUERADE looks up the outbound interface's current IP when it sets up each new connection's translation; SNAT hardcodes it, skipping that lookup. On a NAT gateway handling thousands of new flows per second, the savings add up. Use MASQUERADE only when the outbound IP is dynamic (DHCP, PPPoE); with a static IP, prefer SNAT.

Task: Port Forward External Traffic to Internal Server

# Forward port 8080 on public IP to internal web server 10.0.0.5:80
$ iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 8080 \
    -j DNAT --to-destination 10.0.0.5:80

# Must also allow in the FORWARD chain (plus return traffic,
# if the FORWARD policy is not ACCEPT)
$ iptables -A FORWARD -p tcp -d 10.0.0.5 --dport 80 -j ACCEPT
$ iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Enable forwarding if not already
$ sysctl -w net.ipv4.ip_forward=1

# Verify from outside
$ curl http://203.0.113.50:8080
# Should hit 10.0.0.5:80

Task: Diagnose Conntrack Table Exhaustion

# New connections failing intermittently
$ dmesg | tail -20
[94521.789] nf_conntrack: table full, dropping packet
[94522.012] nf_conntrack: table full, dropping packet

# Check usage
$ cat /proc/sys/net/netfilter/nf_conntrack_count
65530
$ cat /proc/sys/net/netfilter/nf_conntrack_max
65536

# Almost full. Immediate fix (also grow the hash table so lookups stay fast):
$ sysctl -w net.netfilter.nf_conntrack_max=262144
$ echo 65536 > /sys/module/nf_conntrack/parameters/hashsize

# Reduce stale connection timeouts
$ sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
$ sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=600
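
Why shorter timeouts shrink the table: by Little's law, steady-state occupancy is roughly new connections per second times average entry lifetime. A back-of-envelope sketch (the connection rates below are hypothetical, not measured):

```python
# Steady-state conntrack occupancy ~= arrival_rate * avg_entry_lifetime
# (Little's law). The rates here are illustrative examples.

def steady_state_entries(conns_per_sec: float, avg_lifetime_sec: float) -> int:
    return int(conns_per_sec * avg_lifetime_sec)

# 500 short-lived conns/sec, each lingering 120s in TIME_WAIT (the default):
print(steady_state_entries(500, 120))  # 60000, near the 65536 default max
# Same load with the time_wait timeout cut to 30s:
print(steady_state_entries(500, 30))   # 15000
```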

# See what is consuming entries
$ conntrack -L -p tcp --dport 80 | wc -l
42891

# Persist
$ cat > /etc/sysctl.d/99-conntrack.conf <<'EOF'
net.netfilter.nf_conntrack_max = 262144
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 30
net.netfilter.nf_conntrack_tcp_timeout_established = 600
EOF

Scale note: Each conntrack entry consumes approximately 300 bytes of kernel memory. Bumping nf_conntrack_max to 262144 costs ~75 MB of RAM. At 1 million entries, budget ~300 MB. On a busy NAT gateway, monitor nf_conntrack_count as a Prometheus metric and alert at 80% of max.
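
The arithmetic behind the scale note, as a quick sketch (the 300-byte figure is the approximation used above; the real per-entry cost varies by kernel and architecture):

```python
# Approximate kernel memory consumed by the conntrack table at a given max.
ENTRY_BYTES = 300  # rough per-entry cost; varies by kernel/arch

def conntrack_mem_mb(max_entries: int) -> float:
    return max_entries * ENTRY_BYTES / (1024 * 1024)

print(round(conntrack_mem_mb(262_144), 1))  # 75.0 (MB)
print(round(conntrack_mem_mb(1_000_000)))   # 286 (MB), hence "budget ~300 MB"
alert_threshold = int(262_144 * 0.80)       # alert at 80% of max
print(alert_threshold)                      # 209715
```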

Remember: Conntrack mnemonic: C-M-T — Count (how many entries), Max (the ceiling), Timeouts (how fast stale entries expire). Tune all three together.

Task: View and Debug Active NAT Translations

# List all NAT rules with packet counts
$ iptables -t nat -L -n -v --line-numbers
Chain PREROUTING (policy ACCEPT 0 packets)
num  pkts bytes target     prot opt in  out  source      destination
1    8423  505K DNAT       tcp  --  eth0 *   0.0.0.0/0   0.0.0.0/0  tcp dpt:8080 to:10.0.0.5:80

Chain POSTROUTING (policy ACCEPT 0 packets)
num  pkts bytes target     prot opt in  out  source      destination
1    125K 7.5M  MASQUERADE all  --  *   eth0 10.0.0.0/24 0.0.0.0/0

# Watch translations in real-time
$ conntrack -E
[NEW] tcp  6 120 SYN_SENT src=10.0.0.5 dst=8.8.8.8 sport=45678 dport=443 [UNREPLIED]
[UPDATE] tcp  6 60 SYN_RECV src=10.0.0.5 dst=8.8.8.8 sport=45678 dport=443
[UPDATE] tcp  6 432000 ESTABLISHED src=10.0.0.5 dst=8.8.8.8 sport=45678 dport=443

# Check a specific connection
$ conntrack -L -s 10.0.0.5 -p tcp --dport 443
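
When scripting against this output, a small parser helps. This is a sketch that assumes the flat key=value format conntrack -L prints, where the first src/dst/sport/dport set is the original direction and the second is the reply direction (which carries the NATted addresses):

```python
import re

def parse_conntrack_line(line: str) -> dict:
    """Split a conntrack -L line into original/reply tuples.
    Keys repeat (src, dst, sport, dport): the first occurrence is the
    original direction, the second is the reply direction (post-NAT)."""
    pairs = re.findall(r'(\w+)=(\S+)', line)
    orig, reply = {}, {}
    for key, val in pairs:
        target = orig if key not in orig else reply
        target[key] = val
    return {"original": orig, "reply": reply}

line = ("tcp 6 431999 ESTABLISHED src=10.0.0.5 dst=8.8.8.8 "
        "sport=45678 dport=443 src=8.8.8.8 dst=203.0.113.50 "
        "sport=443 dport=45678 [ASSURED] mark=0 use=1")
info = parse_conntrack_line(line)
print(info["original"]["src"])  # 10.0.0.5
print(info["reply"]["dst"])     # 203.0.113.50, the NATted source
```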

Task: Debug Docker NAT Rules

# See Docker's NAT configuration
$ iptables -t nat -L DOCKER -n -v
Chain DOCKER (2 references)
 pkts bytes target prot opt in  out  source      destination
 1234  74K  DNAT   tcp  --  !docker0 * 0.0.0.0/0 0.0.0.0/0  tcp dpt:8080 to:172.17.0.2:80

# See masquerade for outbound container traffic
$ iptables -t nat -L POSTROUTING -n -v | grep MASQ
 45K  2.7M MASQUERADE all --  *   !docker0 172.17.0.0/16 0.0.0.0/0

# Verify container can reach external services
$ docker exec mycontainer curl -s ifconfig.me
203.0.113.50

Task: Set Up SNAT with Multiple Public IPs

# Single public IP running out of source ports (~64K limit)
# Add more public IPs for outbound NAT
$ iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 \
    -j SNAT --to-source 203.0.113.1-203.0.113.4

# Kernel round-robins across the IP range
# Effective port space: 4 * 64K = ~256K concurrent connections

# Verify distribution across the pool (the translated source shows up
# as the reply-direction dst= in conntrack output)
$ conntrack -L -p tcp | grep -oE 'dst=203\.0\.113\.[0-9]+' | sort | uniq -c | sort -rn
  16234 dst=203.0.113.1
  16102 dst=203.0.113.3
  15987 dst=203.0.113.2
  15891 dst=203.0.113.4
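
The port-space arithmetic behind "add more public IPs", as a sketch. The roughly 64K ceiling is the 16-bit source-port space per public IP (in practice the kernel allocates from a narrower range, and uniqueness is only required per destination ip:port), so each extra IP adds another full port space:

```python
# Usable source ports per public SNAT IP: the 16-bit space minus port 0.
# In practice the kernel draws from a narrower ephemeral range, and the
# limit applies per (public IP, destination IP, destination port).
PORTS_PER_IP = 65_535

def max_concurrent_snat(num_public_ips: int) -> int:
    return num_public_ips * PORTS_PER_IP

print(max_concurrent_snat(1))  # 65535, the "~64K limit"
print(max_concurrent_snat(4))  # 262140, the "~256K" figure
```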

Task: Flush Conntrack for a Specific Host

# Need to force re-NAT after changing a DNAT rule
# Delete conntrack entries for the old destination
$ conntrack -D -d 10.0.0.5
conntrack v1.4.6 (conntrack-tools): 47 flow entries have been deleted

# New connections will use the updated NAT rule
# Existing connections were using cached translations

# Flush ALL conntrack (use with caution on busy hosts)
$ conntrack -F
conntrack v1.4.6 (conntrack-tools): connection tracking table has been emptied

Gotcha: Flushing conntrack drops ALL tracked connections: the kernel forgets every active translation, so it no longer knows which internal host a returning packet belongs to, and established NATted sessions stall or reset. On a busy NAT gateway this causes a thundering herd of reconnections. Always prefer targeted deletes (conntrack -D -d <ip>) over a full flush.

Task: Set Up nftables NAT

# Modern replacement for iptables NAT
$ cat > /etc/nftables-nat.conf <<'EOF'
table ip nat {
    chain prerouting {
        type nat hook prerouting priority -100;
        tcp dport 8080 dnat to 10.0.0.5:80
    }
    chain postrouting {
        type nat hook postrouting priority 100;
        oifname "eth0" masquerade
    }
}
EOF

$ nft -f /etc/nftables-nat.conf
$ nft list table ip nat
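
With a static public IP, the SNAT variant from earlier maps to nftables like this (a config sketch; substitute your subnet and address):

```
# in the postrouting chain, instead of masquerade:
ip saddr 10.0.0.0/24 oifname "eth0" snat to 203.0.113.50
```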

Task: Preserve Source IP Through NAT

# Backend servers see the NAT gateway's IP, not the real client
# Option 1: Use PROXY protocol (HAProxy, nginx)
# server config: proxy_protocol on

# Option 2: Use X-Forwarded-For header (HTTP only)
# nginx: proxy_set_header X-Forwarded-For $remote_addr;

# Option 3: Use DSR (Direct Server Return), which needs no SNAT
# Client -> LB (L2 forward, no address rewrite) -> Backend -> Client (directly)
# Backend carries the VIP on a loopback alias and needs a route
# back to the client that bypasses the LB

# Check what source IP the backend sees
$ conntrack -L -d 10.0.0.5 --dport 80 | head -5
tcp  6 ESTABLISHED src=10.0.0.1 dst=10.0.0.5 sport=45678 dport=80
# src=10.0.0.1 is the NAT gateway, not the real client

Debug clue: If backend logs show the NAT gateway IP for every request instead of real clients, check whether PROXY protocol or X-Forwarded-For is configured. For non-HTTP protocols, PROXY protocol is the only option — it prepends a one-line header with the original client IP before the actual data stream.
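
The PROXY protocol v1 header mentioned above is a single ASCII line sent ahead of the application data. A sketch of building and parsing one (format per the HAProxy PROXY protocol spec; the addresses are example values):

```python
def build_proxy_v1(client_ip: str, server_ip: str,
                   client_port: int, server_port: int) -> bytes:
    # "PROXY TCP4 <src> <dst> <sport> <dport>\r\n", sent before any app data
    return (f"PROXY TCP4 {client_ip} {server_ip} "
            f"{client_port} {server_port}\r\n").encode("ascii")

def parse_proxy_v1(header: bytes) -> dict:
    parts = header.decode("ascii").rstrip("\r\n").split(" ")
    assert parts[0] == "PROXY" and parts[1] in ("TCP4", "TCP6")
    return {"src": parts[2], "dst": parts[3],
            "sport": int(parts[4]), "dport": int(parts[5])}

hdr = build_proxy_v1("198.51.100.7", "10.0.0.5", 45678, 80)
print(hdr)                         # b'PROXY TCP4 198.51.100.7 10.0.0.5 45678 80\r\n'
print(parse_proxy_v1(hdr)["src"])  # 198.51.100.7, the real client IP
```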