DHCP & IP Address Management Footguns
- Overlapping DHCP scopes between servers. Two DHCP servers are configured with overlapping address ranges. Both hand out 10.1.1.50. Two devices get the same IP. One or both experience intermittent connectivity as ARP tables flip between MAC addresses. The problem is maddening to debug because it comes and goes.
Fix: When running multiple DHCP servers, split the range with zero overlap. Server A gets .11-.130, Server B gets .131-.200. Or use proper failover protocols (ISC DHCP failover, Windows DHCP failover) that coordinate lease state between servers.
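A quick way to sanity-check a split before deploying it is to test the two ranges for overlap. The sketch below is only an illustration: the function name `ranges_overlap` is hypothetical, and the addresses reuse the 10.1.1.x example from above.

```python
import ipaddress


def ranges_overlap(a_start, a_end, b_start, b_end):
    """Return True if two inclusive IPv4 ranges share at least one address."""
    a_lo = int(ipaddress.IPv4Address(a_start))
    a_hi = int(ipaddress.IPv4Address(a_end))
    b_lo = int(ipaddress.IPv4Address(b_start))
    b_hi = int(ipaddress.IPv4Address(b_end))
    # Two inclusive intervals overlap when each one starts before the other ends.
    return a_lo <= b_hi and b_lo <= a_hi


# Clean split: Server A .11-.130, Server B .131-.200
print(ranges_overlap("10.1.1.11", "10.1.1.130", "10.1.1.131", "10.1.1.200"))  # False
# Misconfigured split: both servers can hand out .131-.150
print(ranges_overlap("10.1.1.11", "10.1.1.150", "10.1.1.131", "10.1.1.200"))  # True
```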
- No relay agent configured on remote VLANs. You set up a centralized DHCP server but forget to configure `ip helper-address` on the router interfaces for remote VLANs. DHCP broadcasts from those VLANs never reach the server. Clients on those VLANs get 169.254.x.x (APIPA) addresses and cannot communicate beyond their local segment.
Fix: Configure a relay agent (ip helper-address on Cisco, dhcrelay on Linux) on every Layer 3 interface that needs DHCP. Point it at both primary and secondary DHCP servers. Verify with tcpdump on the server side that relayed requests arrive.
- Short lease times on stable networks. You set 30-minute leases on a server VLAN with 200 machines. Each client renews at the halfway point, every 15 minutes, so the DHCP server handles about 800 renewals per hour for that one VLAN. If the DHCP server goes down for 45 minutes, every lease on that VLAN expires, every machine loses its IP and drops off the network, and services go down in a cascading failure.
Fix: Match lease time to the environment's volatility. Server VLANs: 7-30 days. Office workstations: 8-24 hours. Guest WiFi: 1-4 hours. The lease should always be at least 2x your expected maximum DHCP server downtime, because a client may already be halfway through its lease when the outage starts.
Under the hood: DHCP clients try to renew with the server that issued the lease at 50% of the lease time (T1). If that fails, they broadcast a rebind request to any DHCP server at 87.5% (T2). A 30-minute lease means a renewal every 15 minutes. If the server is unreachable, the client keeps retrying, switches to rebinding at T2 (26.25 minutes), and must stop using the IP when the full lease expires. With 200 clients on a 30-minute lease, your server handles ~800 DHCPREQUEST packets per hour just for renewals, on top of new DISCOVERs.
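The timer arithmetic can be sketched in a few lines. This is a simplified model, assuming the default 50% and 87.5% points and that every renewal succeeds at T1; the function names are hypothetical and not part of any DHCP server's tooling.

```python
def lease_timers(lease_seconds):
    """Return (T1, T2) in seconds using the default 50% and 87.5% points."""
    return lease_seconds * 0.5, lease_seconds * 0.875


def renewals_per_hour(clients, lease_seconds):
    """Estimate renewal DHCPREQUESTs per hour, assuming each client renews once per T1."""
    t1, _ = lease_timers(lease_seconds)
    return clients * 3600 / t1


def min_lease_for_outage(outage_seconds):
    """Rule of thumb from above: the lease should be at least 2x the longest expected outage."""
    return 2 * outage_seconds


t1, t2 = lease_timers(30 * 60)
print(t1 / 60, t2 / 60)  # 15.0 26.25 (minutes)
```

Calling `renewals_per_hour(200, 30 * 60)` gives the renewal load for the 200-machine VLAN in the example, and `min_lease_for_outage(45 * 60)` gives the shortest lease, in seconds, that would ride out a 45-minute outage.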
- Exhausted address pools with no monitoring. The DHCP pool for a VLAN has 190 addresses. Over months, the environment grows to 185 active devices. Nobody monitors pool utilization. The 191st device gets no IP. Help desk tickets trickle in as "network is down" with no obvious cause.
Fix: Monitor pool utilization with dhcpd-pools, Kea's REST API, or SNMP from your DHCP server. Alert at 80% utilization. Plan subnet expansions before you hit the wall. Clean up stale leases from decommissioned devices.
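The alerting rule itself is simple arithmetic. The sketch below assumes you already have the active lease count and pool size from whichever monitoring source you use; `pool_utilization` and `should_alert` are hypothetical helper names, and the 80% default mirrors the threshold suggested above.

```python
def pool_utilization(active_leases, pool_size):
    """Return pool utilization as a fraction between 0.0 and 1.0 (or above, if oversubscribed)."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    return active_leases / pool_size


def should_alert(active_leases, pool_size, threshold=0.80):
    """Return True once utilization reaches the alert threshold."""
    return pool_utilization(active_leases, pool_size) >= threshold


# The example pool: 190 addresses, 185 active devices
print(should_alert(185, 190))  # True
```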
- Rogue DHCP server on the network. Someone plugs in a consumer router or a VM with DHCP enabled. It starts answering DISCOVER messages faster than your real server. Clients get wrong gateway, wrong DNS, or addresses from the wrong subnet. Some devices work, some do not, depending on which server responded first.
Fix: Enable DHCP snooping on your switches. Mark only the ports connected to legitimate DHCP servers as trusted. All other ports that send DHCP server messages get blocked. This is a switch-level control and the only reliable prevention.
Debug clue: To find a rogue DHCP server, run `tcpdump -i eth0 -n port 67 or port 68` on any affected client VLAN. Legitimate DHCP offers come from your known server IPs. Any offer from an unknown source IP is the rogue. `nmap --script broadcast-dhcp-discover` also reveals all responding DHCP servers on the broadcast domain.
- Static IP assignments inside the DHCP dynamic range. An admin statically assigns 10.1.1.50 on a server. That address is inside the DHCP dynamic pool but not excluded or reserved. DHCP eventually hands 10.1.1.50 to another device. IP conflict. The server's connections drop randomly.
Fix: Never assign static IPs from within the DHCP dynamic range. Either use DHCP reservations (bind MAC to IP in the DHCP config) or place static assignments in an excluded range that the DHCP server will never touch.
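One way to audit for this is to check a list of known static assignments against the dynamic range. This is a minimal sketch: `statics_in_pool` is a hypothetical helper, and the list of static IPs is assumed to come from your own records.

```python
import ipaddress


def statics_in_pool(static_ips, pool_start, pool_end):
    """Return the static IPs that fall inside the inclusive DHCP dynamic range."""
    lo = ipaddress.IPv4Address(pool_start)
    hi = ipaddress.IPv4Address(pool_end)
    return [ip for ip in static_ips if lo <= ipaddress.IPv4Address(ip) <= hi]


# 10.1.1.5 is outside the dynamic range; 10.1.1.50 is inside it and will eventually conflict
print(statics_in_pool(["10.1.1.5", "10.1.1.50"], "10.1.1.11", "10.1.1.200"))  # ['10.1.1.50']
```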
- Forgetting the subnet declaration for relayed networks. The DHCP server has a relay forwarding from VLAN 200, but the server config has no `subnet` block for that VLAN's network. The server receives the relayed DISCOVER, cannot match it to a scope (using giaddr), and silently drops it. No error on the client, no log entry unless you look carefully.
Fix: For every VLAN that uses a relay agent, the DHCP server must have a corresponding subnet declaration, even if the pool is defined within a shared-network. Test each relay path after configuration changes.
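The giaddr lookup can be modeled roughly as "find the declared subnet that contains the relay's address". The sketch below is a simplified illustration of that matching step, not the server's actual logic; `match_subnet` and the example networks are assumptions made up for the demonstration.

```python
import ipaddress


def match_subnet(giaddr, declared_subnets):
    """Return the first declared subnet (CIDR string) containing giaddr, or None."""
    addr = ipaddress.ip_address(giaddr)
    for cidr in declared_subnets:
        if addr in ipaddress.ip_network(cidr):
            return cidr
    # No declaration covers this relay address: the request cannot be matched to a scope.
    return None


declared = ["10.1.1.0/24", "10.2.0.0/24"]  # hypothetical subnet declarations
print(match_subnet("10.2.0.1", declared))    # 10.2.0.0/24
print(match_subnet("10.200.0.1", declared))  # None
```

Running a check like this over every relay interface address after a config change is one way to catch a missing declaration before clients do.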
- Not backing up lease files before server maintenance. You upgrade the DHCP server or move it to new hardware. The lease file is lost or corrupted. On restart, the server has no memory of existing leases. It starts handing out addresses from the beginning of the pool. Every renewal fails because the server does not recognize the lease. Mass IP reassignment causes brief outages across the network.
Fix: Back up lease files before any server maintenance: `cp /var/lib/dhcp/dhcpd.leases /var/lib/dhcp/dhcpd.leases.backup`. For Kea with a database backend, dump the database. After migration, restore the lease state before starting the new server.
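If you script your maintenance steps, a timestamped copy avoids overwriting an earlier backup. This is a minimal sketch, assuming a file-based lease store; `backup_lease_file` is a hypothetical helper and the naming scheme is only one possible choice.

```python
import shutil
import time
from pathlib import Path


def backup_lease_file(lease_path, backup_dir=None):
    """Copy the lease file to a timestamped backup and return the backup path."""
    src = Path(lease_path)
    dest_dir = Path(backup_dir) if backup_dir else src.parent
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}.backup"
    # copy2 preserves timestamps and permissions along with the contents
    shutil.copy2(src, dest)
    return dest
```

For the ISC DHCP path used above, the call would be `backup_lease_file("/var/lib/dhcp/dhcpd.leases")`. Ideally the DHCP service is stopped first so the file is not being rewritten mid-copy.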
- PXE boot options pointing to the wrong TFTP server. DHCP options 66 (TFTP server) and 67 (boot filename) are configured globally instead of per-scope. Every laptop on a workstation VLAN receives the PXE boot options. Laptops that netboot by default try to PXE instead of booting from disk, or they download the wrong image and get wiped.
Fix: Scope PXE options to only the VLANs/classes that need them. Use DHCP vendor classes or client classes to match PXE clients specifically. Never set boot options globally unless every device on every VLAN should PXE boot.
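The scoping logic amounts to two conditions: the request comes from a VLAN that is meant to PXE boot, and the client identifies itself as a PXE client. PXE firmware sends a vendor class identifier (option 60) beginning with `PXEClient`. The sketch below models that decision in plain Python; `boot_options`, the VLAN IDs, the server address, and the boot filename are all hypothetical, and a real server expresses this in its own class-matching configuration.

```python
def boot_options(vlan_id, vendor_class, pxe_vlans, tftp_server, boot_file):
    """Return DHCP options 66/67 only for PXE clients on VLANs meant to netboot."""
    if vlan_id in pxe_vlans and vendor_class.startswith("PXEClient"):
        return {66: tftp_server, 67: boot_file}
    # Everyone else gets no boot options and boots from disk as usual.
    return {}


pxe_vlans = {300}  # hypothetical provisioning VLAN
print(boot_options(300, "PXEClient:Arch:00000:UNDI:002001", pxe_vlans, "10.3.0.10", "pxelinux.0"))
print(boot_options(100, "PXEClient:Arch:00000:UNDI:002001", pxe_vlans, "10.3.0.10", "pxelinux.0"))
```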
- Trusting the IPAM spreadsheet. The "source of truth" for IP assignments is a shared spreadsheet. It was accurate six months ago. Since then, 30 IPs were reassigned without updating it, a subnet was expanded, and two entries conflict. You assign an IP from the spreadsheet and cause a conflict because the sheet lied.
Fix: Use a real IPAM tool (NetBox, Infoblox, phpIPAM). Integrate it with your DHCP server so assignments are automatically recorded. Run periodic reconciliation scans (nmap, arp-scan) to compare actual network state against IPAM records. The tool is the source of truth, not a human-edited document.
Gotcha: Even with a proper IPAM tool, reconciliation must be scheduled and reviewed. NetBox can import from DHCP servers and network scans, but if nobody reviews the "stale" and "conflict" reports, you've just replaced a stale spreadsheet with a stale database. Set a monthly calendar reminder to run `arp-scan -l` on each VLAN and compare results against IPAM. Flag any IP that appears on the network but not in IPAM.
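The comparison step is a pair of set differences. The sketch below assumes you have already extracted two lists of IP strings, one from a network scan and one from an IPAM export; `reconcile` and the result keys are hypothetical names, and parsing the scan output is left out.

```python
def reconcile(seen_on_network, recorded_in_ipam):
    """Compare scan results against IPAM records and report the differences."""
    seen = set(seen_on_network)
    recorded = set(recorded_in_ipam)
    return {
        # On the wire but missing from IPAM: investigate and record or remove
        "unrecorded": sorted(seen - recorded),
        # In IPAM but not seen: possibly stale, possibly just powered off
        "not_seen": sorted(recorded - seen),
    }


report = reconcile(
    ["10.1.1.20", "10.1.1.50", "10.1.1.77"],
    ["10.1.1.20", "10.1.1.50", "10.1.1.60"],
)
print(report)  # {'unrecorded': ['10.1.1.77'], 'not_seen': ['10.1.1.60']}
```

An address in `not_seen` is not proof of a stale record, since the device may simply have been offline during the scan, so treat it as a prompt for review and not for automatic cleanup.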