PXE Boot: From Network Cable to Running Server

Tags: lesson, pxe, dhcp, tftp, ipxe, kickstart/preseed/autoinstall, ansible, bare-metal-provisioning

Topics: PXE, DHCP, TFTP, iPXE, kickstart/preseed/autoinstall, Ansible, bare metal provisioning
Level: L2 (Operations)
Time: 75–90 minutes
Prerequisites: Basic networking (DHCP, IP addressing) helpful but explained as we go
The Mission¶
You just racked 20 new servers in a datacenter. They have no operating system — just bare metal, a network cable, and a BIOS. You need all 20 running Ubuntu 22.04 with your base configuration (users, SSH keys, packages, monitoring agent) within the hour. You're not going to walk to each server with a USB stick.
This is what PXE boot was designed for: boot a machine from the network, install an OS automatically, and hand it off to configuration management (Ansible) — all without touching the physical server after racking it.
This lesson follows the entire chain from "power on with no OS" to "fully configured server running in production," explaining every protocol, every handshake, and every configuration file along the way.
The Big Picture¶
[1] Server powers on → BIOS/UEFI → PXE ROM on NIC
[2] PXE client → DHCP request → gets IP + "boot from this TFTP server"
[3] PXE client → TFTP download → gets bootloader (pxelinux/GRUB/iPXE)
[4] Bootloader → downloads kernel + initrd over TFTP or HTTP
[5] Kernel boots → installer runs → automated install (kickstart/preseed/autoinstall)
[6] OS installed → first boot → Ansible takes over → configured server
Six stages, four protocols (DHCP, TFTP, HTTP, SSH), and when it works, zero human interaction after plugging in the network cable.
Stage 1: PXE — The Network Boot ROM¶
PXE (Preboot eXecution Environment, pronounced "pixie") is firmware built into nearly every server network card since the late 1990s. When the BIOS/UEFI is configured to boot from network, PXE takes over.
PXE can't do much — it's a tiny ROM with just enough code to: 1. Get an IP address via DHCP 2. Download a small bootloader via TFTP 3. Execute that bootloader
That's it. PXE is the spark that starts the chain. Everything interesting happens in the bootloader and beyond.
Name Origin: PXE was developed by Intel in 1999 as part of the "Wired for Management" specification. The pronunciation "pixie" was chosen to sound like a helpful sprite that magically bootstraps your server. The spec was designed for large-scale datacenter provisioning — exactly what we're using it for.
BIOS vs UEFI PXE¶
The PXE process differs slightly depending on firmware:
| | BIOS PXE | UEFI PXE |
|---|---|---|
| Bootloader format | 16-bit, legacy (pxelinux.0) | 64-bit EFI binary (grubx64.efi or ipxe.efi) |
| DHCP option | filename "pxelinux.0" | filename "grubx64.efi" |
| Config file | pxelinux.cfg/default | grub/grub.cfg |
| Protocol | TFTP only | TFTP or HTTP |
Most modern servers use UEFI. The DHCP server needs to know which firmware the client uses and send the right bootloader. This is handled by DHCP option 93 (client system architecture):
# In ISC DHCP, serve different files based on architecture.
# Option 93 has no predefined name in dhcpd, so declare it first:
option architecture-type code 93 = unsigned integer 16;

class "pxe-uefi" {
    match if option architecture-type = 00:07;   # x86-64 UEFI
    filename "grubx64.efi";
}
class "pxe-bios" {
    match if option architecture-type = 00:00;   # x86 BIOS
    filename "pxelinux.0";
}
Stage 2: DHCP — Getting an Address and a Boot Directive¶
The PXE ROM broadcasts a DHCP Discover. The DHCP server responds with:
- IP address — so the server can communicate
- Subnet mask, gateway, DNS — standard networking
- Next-server (option 66) — IP of the TFTP server
- Filename (option 67) — path to the bootloader on the TFTP server
# ISC DHCP configuration for PXE boot
subnet 10.0.1.0 netmask 255.255.255.0 {
    range 10.0.1.100 10.0.1.200;
    option routers 10.0.1.1;
    option domain-name-servers 10.0.1.1;

    # PXE boot options
    next-server 10.0.1.10;       # TFTP server IP
    filename "pxelinux.0";       # Bootloader file (BIOS)
    # For UEFI: filename "grubx64.efi";

    # Optional: per-host config by MAC address
    host server-01 {
        hardware ethernet aa:bb:cc:dd:ee:01;
        fixed-address 10.0.1.51;
        option host-name "server-01";
    }
    host server-02 {
        hardware ethernet aa:bb:cc:dd:ee:02;
        fixed-address 10.0.1.52;
        option host-name "server-02";
    }
}
Under the Hood: The DHCP exchange happens via broadcast (the server has no IP yet). The sequence is DORA: Discover (broadcast, "anyone have an IP for me?"), Offer (server proposes an IP), Request (client accepts), Acknowledge (server confirms). All four packets are UDP — client port 68, server port 67. The PXE ROM adds vendor-specific options to the Discover that identify it as a PXE client, which tells the DHCP server to include the boot filename.
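To make the wire format concrete, the PXE-flavored Discover can be sketched at the byte level. This is an illustrative packet builder, not a full DHCP client; the field layout follows the BOOTP/DHCP format described above, and the transaction ID is an arbitrary example value:

```python
import struct

def build_dhcp_discover(mac: bytes) -> bytes:
    # BOOTP header: op=1 (request), htype=1 (ethernet), hlen=6, hops=0,
    # then xid, secs, flags (broadcast bit set: the client has no IP yet)
    pkt = struct.pack("!BBBBIHH", 1, 1, 6, 0, 0x3903F326, 0, 0x8000)
    pkt += b"\x00" * 16                       # ciaddr/yiaddr/siaddr/giaddr: all zero
    pkt += mac + b"\x00" * (16 - len(mac))    # chaddr, padded to 16 bytes
    pkt += b"\x00" * 192                      # sname + file: unused in a Discover
    pkt += b"\x63\x82\x53\x63"                # DHCP magic cookie
    pkt += b"\x35\x01\x01"                    # option 53: message type = Discover
    pkt += b"\x3c\x09PXEClient"               # option 60: vendor class, marks a PXE client
    pkt += b"\x5d\x02\x00\x07"                # option 93: client arch = 7 (x86-64 UEFI)
    pkt += b"\xff"                            # end of options
    return pkt

pkt = build_dhcp_discover(bytes.fromhex("aabbccddee01"))
```

A real client would send this over a UDP socket from port 68 to the broadcast address on port 67; the `PXEClient` vendor class in option 60 is what prompts the DHCP server to include the boot filename in its Offer.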
Gotcha: DHCP broadcasts don't cross routers. If your DHCP server is on a different subnet, you need a DHCP relay agent (`ip helper-address` on Cisco, `dhcrelay` on Linux) on the router. The relay agent receives the broadcast, sets the `giaddr` (gateway address) field to its own IP (so the DHCP server knows which subnet pool to allocate from), and forwards the request as unicast to the DHCP server.
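To make giaddr concrete: in the BOOTP wire format it occupies bytes 24–27 of the packet. A relay's rewrite is conceptually just the following (an illustrative sketch; real relays also enforce hop limits and may add option 82):

```python
import socket

def relay_rewrite(packet: bytes, relay_ip: str) -> bytes:
    # BOOTP layout: op(0) htype(1) hlen(2) hops(3), xid(4-7), secs(8-9),
    # flags(10-11), ciaddr(12-15), yiaddr(16-19), siaddr(20-23), giaddr(24-27)
    giaddr = socket.inet_aton(relay_ip)       # relay's own IP on the client subnet
    hops = packet[3] + 1                      # relays bump the hop count
    return packet[:3] + bytes([hops]) + packet[4:24] + giaddr + packet[28:]
```

The DHCP server looks at giaddr, picks the matching subnet pool, and unicasts its Offer back to the relay, which re-broadcasts it on the client's segment.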
Stage 3: TFTP — Downloading the Bootloader¶
PXE downloads the bootloader using TFTP (Trivial File Transfer Protocol). TFTP is deliberately simple — no authentication, no directory listing, no encryption. It's UDP-based and can transfer files in environments where a full TCP/IP stack isn't available yet.
# Set up a TFTP server
sudo apt install tftpd-hpa

# TFTP root directory
ls -R /srv/tftp/
# /srv/tftp/
# ├── pxelinux.0                 ← BIOS bootloader
# ├── ldlinux.c32                ← syslinux library
# ├── grubx64.efi                ← UEFI bootloader
# ├── pxelinux.cfg/              ← BIOS boot configuration
# │   ├── default                ← default config for all clients
# │   └── 01-aa-bb-cc-dd-ee-01   ← config for a specific MAC
# └── ubuntu-installer/
#     ├── linux                  ← kernel
#     └── initrd.gz              ← initrd
pxelinux.cfg lookup order¶
When pxelinux loads, it looks for configuration files in this order:
1. 01-aa-bb-cc-dd-ee-ff ← MAC address (with dashes, prefixed 01-)
2. C0A80133 ← IP in hex (192.168.1.51 = C0A80133)
3. C0A8013 ← progressively shorter prefixes
4. C0A801
5. C0A80
6. C0A8
7. C0A
8. C0
9. C
10. default ← fallback for all clients
This lets you assign different configurations per server (by MAC) or per subnet (by IP prefix), with a default fallback.
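The lookup order is easy to reproduce. A small Python sketch (the function name is mine) that generates the candidate list for a given client:

```python
def pxelinux_config_candidates(mac: str, ip: str) -> list:
    # First candidate: ARP hardware type prefix 01-, MAC with dashes
    names = ["01-" + mac.lower().replace(":", "-")]
    # Then the IP in uppercase hex, with progressively shorter prefixes
    hex_ip = "".join("%02X" % int(octet) for octet in ip.split("."))
    names += [hex_ip[:n] for n in range(len(hex_ip), 0, -1)]
    names.append("default")      # final fallback for all clients
    return names

print(pxelinux_config_candidates("aa:bb:cc:dd:ee:ff", "192.168.1.51"))
# → ['01-aa-bb-cc-dd-ee-ff', 'C0A80133', 'C0A8013', 'C0A801',
#    'C0A80', 'C0A8', 'C0A', 'C0', 'C', 'default']
```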
A pxelinux configuration¶
# /srv/tftp/pxelinux.cfg/default
DEFAULT install
PROMPT 0
TIMEOUT 30          # units of 1/10 second, so this is a 3-second timeout

LABEL install
    KERNEL ubuntu-installer/linux
    APPEND initrd=ubuntu-installer/initrd.gz auto=true url=http://10.0.1.10/preseed.cfg locale=en_US.UTF-8 keyboard-configuration/layoutcode=us netcfg/choose_interface=auto
This tells the server: download the kernel and initrd, then boot into the Ubuntu installer with an automated preseed configuration.
Trivia: TFTP (RFC 783, 1981; revised as RFC 1350 in 1992) was designed to be simple enough to fit in a boot ROM. It has no authentication, no directory listing, and no encryption — security wasn't a concern for local network bootstrapping in 1981. The protocol uses 512-byte blocks by default (extendable up to 65464 bytes via the RFC 2348 blocksize option), which makes large transfers painfully slow. This is why modern PXE setups use TFTP only for the initial bootloader and switch to HTTP for everything else.
Stage 4: iPXE — The Better PXE¶
TFTP is slow and limited. iPXE is an open-source replacement for the PXE ROM that adds:
- HTTP/HTTPS downloads (much faster than TFTP)
- Scripting (conditional logic, menus)
- iSCSI, NFS, and other protocols
- DNS resolution
- VLAN tagging
The typical workflow: PXE loads a tiny iPXE binary via TFTP, then iPXE takes over and downloads everything else via HTTP:
An iPXE script¶
#!ipxe
# /srv/tftp/boot.ipxe
# Dynamic menu
menu PXE Boot Menu
item ubuntu Install Ubuntu 22.04
item rescue Rescue Shell
item local Boot from local disk
choose --default ubuntu --timeout 10000 target
goto ${target}
:ubuntu
set base-url http://10.0.1.10/ubuntu
kernel ${base-url}/vmlinuz initrd=initrd autoinstall ds=nocloud-net;s=http://10.0.1.10/autoinstall/
initrd ${base-url}/initrd
boot
:rescue
kernel http://10.0.1.10/rescue/vmlinuz
initrd http://10.0.1.10/rescue/initrd.gz
boot
:local
exit
This is dramatically more capable than raw PXE — HTTP is 10-100x faster than TFTP for large files, and scripting lets you build menus, conditionals, and error handling.
# ISC DHCP configured for iPXE chainloading
if exists user-class and option user-class = "iPXE" {
    filename "http://10.0.1.10/boot.ipxe";   # iPXE gets its script over HTTP
} else {
    filename "ipxe.efi";                     # plain PXE gets the iPXE binary via TFTP
}
Stage 5: Automated OS Installation¶
The kernel and initrd are loaded. The installer starts. But instead of asking questions (language, timezone, disk layout, packages), it reads an answer file that provides everything automatically.
Three automation systems¶
| Distro | System | Format |
|---|---|---|
| Ubuntu (modern) | autoinstall | YAML (cloud-init based) |
| Ubuntu (legacy) / Debian | preseed | Debconf format |
| RHEL / CentOS / Fedora | kickstart | Anaconda format |
Ubuntu autoinstall example¶
# /srv/http/autoinstall/user-data
#cloud-config
autoinstall:
  version: 1
  locale: en_US.UTF-8
  keyboard:
    layout: us

  # Network: use DHCP on first interface
  network:
    version: 2
    ethernets:
      id0:
        match:
          name: en*
        dhcp4: true

  # Disk: entire first disk, LVM
  storage:
    layout:
      name: lvm
      sizing-policy: all

  # Users
  identity:
    hostname: server-01
    username: deploy
    password: "$6$rounds=4096$salt$hashedpassword"  # mkpasswd -m sha-512

  # SSH
  ssh:
    install-server: true
    authorized-keys:
      - "ssh-ed25519 AAAA... deploy@management"
    allow-pw: false

  # Packages
  packages:
    - python3
    - python3-apt
    - openssh-server
    - curl
    - vim
    - monitoring-agent

  # Post-install: signal the provisioning server
  late-commands:
    - curtin in-target -- systemctl enable monitoring-agent
    - curtin in-target -- mkdir -p /home/deploy/.ssh
    - "curl -X POST http://10.0.1.10:8080/api/provisioned -d '{\"hostname\": \"server-01\"}'"
The late-commands section runs after installation is complete but before the first reboot. This is where you can signal your provisioning system that the server is ready for configuration management.
RHEL/CentOS kickstart example¶
# /srv/http/ks/server.ks
#version=RHEL9
text
url --url="http://10.0.1.10/rhel9/"
lang en_US.UTF-8
keyboard us
timezone America/New_York --utc
rootpw --lock
user --name=deploy --groups=wheel --iscrypted --password="$6$..."
sshkey --username=deploy "ssh-ed25519 AAAA... deploy@management"
# Disk
bootloader --location=mbr
clearpart --all --initlabel
autopart --type=lvm
# Network
network --bootproto=dhcp --device=link --activate --hostname=server-01
# Packages
%packages
@core
python3
openssh-server
curl
vim
%end
# Post-install
%post
systemctl enable sshd
curl -X POST http://10.0.1.10:8080/api/provisioned -d '{"hostname": "server-01"}'
%end
reboot
Gotcha: The answer file contains hashed passwords, SSH keys, and server configuration. Serve it over HTTPS or a private network — not the public internet. Anyone who can reach the HTTP server can read your provisioning config.
Stage 6: Ansible Takes Over — From Bare OS to Production¶
The OS is installed. The server can be reached via SSH. Now Ansible handles everything the installer didn't:
# playbook.yml — base server configuration
---
- name: Configure new server
  hosts: new_servers
  become: true

  vars:
    ntp_servers:
      - 10.0.1.1
      - 10.0.1.2
    monitoring_endpoint: "https://monitoring.internal/api/v1"

  tasks:
    # --- System Configuration ---
    - name: Set hostname from inventory
      hostname:
        name: "{{ inventory_hostname }}"

    - name: Configure NTP
      template:
        src: templates/chrony.conf.j2
        dest: /etc/chrony/chrony.conf
      notify: restart chrony

    - name: Configure sysctl for production
      sysctl:
        name: "{{ item.key }}"
        value: "{{ item.value }}"
        state: present
        reload: yes
      loop:
        - { key: "net.core.somaxconn", value: "65535" }
        - { key: "vm.swappiness", value: "10" }
        - { key: "net.ipv4.tcp_max_syn_backlog", value: "65535" }

    # --- Security Hardening ---
    - name: Disable password authentication
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: "^#?PasswordAuthentication"
        line: "PasswordAuthentication no"
      notify: restart sshd

    - name: Configure firewall
      ufw:
        rule: allow
        port: "{{ item }}"
        proto: tcp
      loop:
        - "22"    # SSH
        - "9100"  # Node exporter

    - name: Enable and start firewall
      ufw:
        state: enabled
        policy: deny

    # --- Monitoring ---
    - name: Install node_exporter
      apt:
        name: prometheus-node-exporter
        state: present

    - name: Register with monitoring
      uri:
        url: "{{ monitoring_endpoint }}/targets"
        method: POST
        body_format: json
        body:
          hostname: "{{ inventory_hostname }}"
          ip: "{{ ansible_default_ipv4.address }}"
          labels:
            role: "{{ server_role | default('generic') }}"
            datacenter: "{{ datacenter | default('dc1') }}"

  handlers:
    - name: restart chrony
      service:
        name: chrony
        state: restarted

    - name: restart sshd
      service:
        name: sshd
        state: restarted
The provisioning callback¶
The server was just installed — how does Ansible know to run? Three approaches:
1. Pull-based callback — the server calls Ansible Tower/AWX on first boot:
# In autoinstall late-commands or kickstart %post
late-commands:
- "curl -k --data 'host_config_key=mysecret' https://ansible.internal/api/v2/job_templates/42/callback/"
2. Dynamic inventory — Ansible polls the provisioning API for new servers:
#!/usr/bin/env python3
# dynamic_inventory.py — Ansible invokes inventory scripts with --list
import json, sys, requests

if "--list" in sys.argv:
    hosts = requests.get("http://provisioner:8080/api/new-servers", timeout=10).json()
    print(json.dumps({"new_servers": {"hosts": [h["ip"] for h in hosts]},
                      "_meta": {"hostvars": {}}}))
else:
    print(json.dumps({}))  # called with --host <name>: no per-host vars
3. Event-driven — the provisioning server triggers Ansible when a server reports in:
# Webhook handler on the provisioning server
# When POST /api/provisioned arrives:
ansible-playbook -i "10.0.1.51," -e "target_host=server-01" playbook.yml
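A minimal sketch of that webhook receiver using only Python's standard library. The endpoint path and port follow the earlier examples; `playbook_command` is a hypothetical helper, and a production version would add authentication and payload validation:

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

def playbook_command(ip: str, hostname: str) -> list:
    # Trailing comma makes Ansible treat the string as an inline inventory
    return ["ansible-playbook", "-i", f"{ip},",
            "-e", f"target_host={hostname}", "playbook.yml"]

class ProvisionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/provisioned":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Fire and forget: the playbook runs while we answer the installer
        subprocess.Popen(playbook_command(payload["ip"], payload["hostname"]))
        self.send_response(202)
        self.end_headers()

# To run: HTTPServer(("0.0.0.0", 8080), ProvisionHandler).serve_forever()
```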
The Complete Chain — One Picture¶
[1] Power on → BIOS/UEFI → PXE ROM activates
"I have no OS. Let me ask the network."
[2] DHCP exchange (DORA)
Server gets: IP address + "boot from 10.0.1.10, file pxelinux.0"
[3] TFTP download
PXE downloads bootloader (pxelinux.0 or iPXE binary)
[4] Bootloader reads config
pxelinux.cfg/default or iPXE script
Downloads kernel + initrd (via TFTP or HTTP)
[5] Kernel boots → installer runs
Reads autoinstall/preseed/kickstart from HTTP server
Partitions disk, installs OS, configures users/SSH
Signals provisioning server: "I'm done"
[6] First reboot → OS boots from local disk
SSH is up, deploy user has key access
[7] Ansible runs (callback, dynamic inventory, or event)
Configures NTP, sysctl, firewall, monitoring, app-specific setup
Registers with monitoring and load balancer
[8] Server is production-ready
Total time: 15-30 minutes per server, zero human interaction
For 20 servers: rack them, cable them, set BIOS to network boot, power on. Walk away. Come back in 30 minutes to 20 production-ready servers.
Debugging PXE Boot Failures¶
PXE failures are frustrating because there's minimal feedback — the server has no OS to give you error messages. Here's the systematic approach:
| Symptom | Failed stage | What to check |
|---|---|---|
| "No boot device found" | Stage 1 — PXE not enabled | BIOS/UEFI boot order, enable network boot |
| "PXE-E51: No DHCP or proxyDHCP offers" | Stage 2 — DHCP | Is DHCP server running? Is the NIC on the right VLAN? DHCP relay configured? |
| "PXE-E32: TFTP open timeout" | Stage 3 — TFTP | Is TFTP server running? Firewall blocking UDP 69? Correct next-server IP? |
| "PXE-E3B: TFTP Error - File Not Found" | Stage 3 — wrong path | Does the file exist at the TFTP root? Correct filename in DHCP? |
| Kernel panic during boot | Stage 4 — wrong kernel/initrd | Matching kernel and initrd versions? Correct architecture (amd64 vs arm64)? |
| Installer hangs asking questions | Stage 5 — bad answer file | URL reachable? Syntax valid? Missing required fields? |
# Debug DHCP (tcpdump options go before the filter expression)
sudo tcpdump -ni eth0 'port 67 or port 68'
# Look for: DHCP Discover from the server's MAC
# Look for: DHCP Offer with filename and next-server

# Debug TFTP
sudo tcpdump -ni eth0 'port 69'
# Look for: TFTP Read Request for the bootloader file
# (only the initial request hits port 69; data flows on ephemeral ports)

# Test TFTP manually
tftp 10.0.1.10 -c get pxelinux.0
# Does it download? If not, check TFTP server config and firewall

# Test answer file reachability
curl http://10.0.1.10/autoinstall/user-data
# Does it return the YAML? If not, check HTTP server config
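If no tftp client is installed, the read request is simple enough to send by hand: one small UDP packet. A hedged Python probe (server IP and filename follow the lesson's examples; `probe_tftp` is a name invented here):

```python
import socket
import struct

def build_rrq(filename: str, mode: str = "octet") -> bytes:
    # RRQ = opcode 1, then NUL-terminated filename and transfer mode
    return struct.pack("!H", 1) + filename.encode() + b"\0" + mode.encode() + b"\0"

def probe_tftp(server: str, filename: str, timeout: float = 3.0) -> bool:
    """Send an RRQ and report whether the server answered at all."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_rrq(filename), (server, 69))
        data, _addr = sock.recvfrom(516)          # reply comes from an ephemeral port
        opcode = struct.unpack("!H", data[:2])[0]
        return opcode == 3                        # 3 = DATA (file exists), 5 = ERROR
    except socket.timeout:
        return False   # no reply at all: server down or firewall dropping UDP 69
    finally:
        sock.close()

# Usage: probe_tftp("10.0.1.10", "pxelinux.0")
```

A timeout points at stage 3 connectivity (server down, firewall), while an ERROR reply means the server is up but the filename in DHCP option 67 doesn't match what's on disk.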
Gotcha: The server's NIC must be on a VLAN where it can reach the DHCP server. If your provisioning VLAN is different from production, make sure the server is patched to the provisioning VLAN during install, then moved to production after. Many datacenter provisioning systems automate this VLAN change.
Flashcard Check¶
Q1: What does PXE stand for, and what can the PXE ROM actually do?
Preboot eXecution Environment ("pixie"). It can only: get an IP via DHCP, download a file via TFTP, and execute it. Everything else is the bootloader's job.
Q2: Which two DHCP options tell the client where to boot from?
Option 66 (`next-server`) = TFTP server IP. Option 67 (`filename`) = path to the bootloader file. Without both, PXE has nowhere to go.
Q3: Why use iPXE instead of raw PXE?
iPXE adds HTTP/HTTPS (10-100x faster than TFTP), scripting, menus, and more protocols. Raw PXE only speaks TFTP.
Q4: pxelinux.cfg lookup order — what's checked first?
The client's MAC address (format: `01-aa-bb-cc-dd-ee-ff`). Then the IP in hex with progressively shorter prefixes. Finally `default`. This enables per-server configuration.
Q5: "PXE-E51: No DHCP offers" — what broke?
DHCP server isn't responding. Check: is it running? Is the server on the right VLAN? Is a DHCP relay needed (server and DHCP on different subnets)?
Q6: How does Ansible know to configure a freshly installed server?
Three approaches: provisioning callback (server calls Ansible on first boot), dynamic inventory (Ansible polls for new servers), or event-driven (webhook triggers playbook).
Exercises¶
Exercise 1: Plan a PXE setup (design)¶
You're provisioning 50 servers across two racks. Design the component layout:
- Where does the DHCP server go?
- Where does the TFTP/HTTP server go?
- How do you handle BIOS vs UEFI servers?
- How does Ansible get triggered?
One approach:

Put DHCP + TFTP + HTTP on one provisioning server (`10.0.1.10`) on the management VLAN. Both racks connect to this VLAN. DHCP config uses `match if` to serve different bootloader files for BIOS vs UEFI. The autoinstall `late-commands` calls back to Ansible Tower (`curl POST /callback/`). Ansible runs the base playbook, then role-specific playbooks based on a CMDB lookup of each server's intended role.

Exercise 2: Write the autoinstall (hands-on)¶
Write an Ubuntu autoinstall user-data file that:
1. Partitions the first disk with LVM (100GB root, rest for data)
2. Creates a user ops with an SSH key
3. Installs Python 3, Docker, and monitoring-agent
4. Disables password authentication
5. Signals a webhook when done
Exercise 3: The decision (think)¶
For each scenario, should you use PXE or something else?
- Provisioning 100 bare-metal servers in a datacenter
- Setting up a single Raspberry Pi at home
- Re-imaging a developer laptop
- Provisioning VMs in AWS
- Deploying a Kubernetes cluster on bare metal
Answers:

1. **PXE** — this is exactly what it's for. Zero-touch provisioning at scale.
2. **USB/SD card** — PXE is overkill for one device. Write the image directly.
3. **USB or network recovery** — PXE works but might be over-engineered for one laptop. MDM tools (Jamf, Intune) are more appropriate.
4. **Cloud APIs** — AWS has no PXE. Use AMIs, user-data scripts, and Terraform. PXE is for bare metal.
5. **PXE + Ansible + kubeadm/k3s** — PXE installs the base OS, Ansible configures the nodes, then bootstraps the cluster. This is the standard approach for on-prem Kubernetes.

Cheat Sheet¶
PXE Boot Chain¶
BIOS/UEFI → PXE ROM → DHCP (get IP + boot path) → TFTP (get bootloader)
→ Bootloader (get kernel) → Kernel (run installer) → Reboot → Ansible
DHCP for PXE (ISC DHCP)¶
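Condensed from the Stage 2 configuration earlier in this lesson:

```
subnet 10.0.1.0 netmask 255.255.255.0 {
    range 10.0.1.100 10.0.1.200;
    option routers 10.0.1.1;
    next-server 10.0.1.10;        # TFTP server
    filename "pxelinux.0";        # BIOS; use "grubx64.efi" for UEFI
}
```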
TFTP Directory Layout¶
/srv/tftp/
├── pxelinux.0 (or grubx64.efi)
├── pxelinux.cfg/
│ ├── default
│ └── 01-aa-bb-cc-dd-ee-ff # Per-MAC config
└── ubuntu/
├── vmlinuz
└── initrd
Debug Commands¶
| Task | Command |
|---|---|
| Watch DHCP | sudo tcpdump -ni eth0 'port 67 or port 68' |
| Watch TFTP | sudo tcpdump -ni eth0 'port 69' |
| Test TFTP | tftp server-ip -c get filename |
| Test HTTP | curl http://server-ip/path/to/file |
| Check DHCP leases | cat /var/lib/dhcp/dhcpd.leases |
Takeaways¶
- PXE is just the spark. It only knows DHCP + TFTP. iPXE adds HTTP, scripting, and menus. Use iPXE for anything beyond the simplest setup.
- The answer file is the blueprint. autoinstall (Ubuntu), kickstart (RHEL), preseed (Debian) — they all do the same thing: answer the installer's questions automatically.
- TFTP is slow and insecure. Use it only for the initial bootloader. Switch to HTTP for kernel, initrd, and installer packages.
- The callback completes the loop. The freshly installed server signals the provisioning system, which triggers Ansible. Without this, you need manual intervention to start configuration management.
- PXE is for bare metal. Cloud VMs use AMIs/images and user-data. Containers use Dockerfiles. PXE fills the gap where no hypervisor or image service exists.
- Debug with tcpdump. PXE failures have minimal feedback. Watching DHCP and TFTP traffic on the wire tells you exactly where the chain breaks.
Related Lessons¶
- What Happens When You Press Power — the boot sequence PXE replaces
- From Init Scripts to systemd — what takes over after the OS is installed
- Deploy a Web App From Nothing — where PXE ends and application deployment begins