
PXE Boot: From Network to Running Server

  • lesson
  • pxe
  • dhcp
  • tftp
  • ipxe
  • kickstart/preseed/autoinstall
  • ansible
  • bare-metal-provisioning
  • l2

Topics: PXE, DHCP, TFTP, iPXE, kickstart/preseed/autoinstall, Ansible, bare-metal provisioning
Level: L2 (Operations)
Time: 75–90 minutes
Prerequisites: Basic networking (DHCP, IP addressing) helpful but explained as we go


The Mission

You just racked 20 new servers in a datacenter. They have no operating system — just bare metal, a network cable, and a BIOS. You need all 20 running Ubuntu 22.04 with your base configuration (users, SSH keys, packages, monitoring agent) within the hour. You're not going to walk to each server with a USB stick.

This is what PXE boot was designed for: boot a machine from the network, install an OS automatically, and hand it off to configuration management (Ansible) — all without touching the physical server after racking it.

This lesson follows the entire chain from "power on with no OS" to "fully configured server running in production," explaining every protocol, every handshake, and every configuration file along the way.


The Big Picture

[1] Server powers on → BIOS/UEFI → PXE ROM on NIC
[2] PXE client → DHCP request → gets IP + "boot from this TFTP server"
[3] PXE client → TFTP download → gets bootloader (pxelinux/GRUB/iPXE)
[4] Bootloader → downloads kernel + initrd over TFTP or HTTP
[5] Kernel boots → installer runs → automated install (kickstart/preseed/autoinstall)
[6] OS installed → first boot → Ansible takes over → configured server

Six stages, four protocols (DHCP, TFTP, HTTP, SSH), and when it works, zero human interaction after plugging in the network cable.


Stage 1: PXE — The Network Boot ROM

PXE (Preboot eXecution Environment, pronounced "pixie") is firmware built into nearly every server network card since the late 1990s. When the BIOS/UEFI is configured to boot from network, PXE takes over.

PXE can't do much — it's a tiny ROM with just enough code to:

  1. Get an IP address via DHCP
  2. Download a small bootloader via TFTP
  3. Execute that bootloader

That's it. PXE is the spark that starts the chain. Everything interesting happens in the bootloader and beyond.

Name Origin: PXE was developed by Intel in 1999 as part of the "Wired for Management" specification. The pronunciation "pixie" was chosen to sound like a helpful sprite that magically bootstraps your server. The spec was designed for large-scale datacenter provisioning — exactly what we're using it for.

BIOS vs UEFI PXE

The PXE process differs slightly depending on firmware:

|                   | BIOS PXE                    | UEFI PXE                                    |
| ----------------- | --------------------------- | ------------------------------------------- |
| Bootloader format | 16-bit, legacy (pxelinux.0) | 64-bit EFI binary (grubx64.efi or ipxe.efi) |
| DHCP option       | filename "pxelinux.0"       | filename "grubx64.efi"                      |
| Config file       | pxelinux.cfg/default        | grub/grub.cfg                               |
| Protocol          | TFTP only                   | TFTP or HTTP                                |

Most modern servers use UEFI. The DHCP server needs to know which firmware the client uses and send the right bootloader. This is handled by DHCP option 93 (client system architecture):

# In ISC DHCP, serve different bootloaders based on client architecture.
# Option 93 has no built-in name, so define it first, then match on it:
option architecture-type code 93 = unsigned integer 16;

class "pxe-uefi" {
    match if option architecture-type = 00:07;   # x86-64 UEFI
    filename "grubx64.efi";
}
class "pxe-bios" {
    match if option architecture-type = 00:00;   # x86 BIOS
    filename "pxelinux.0";
}

Stage 2: DHCP — Getting an Address and a Boot Directive

The PXE ROM broadcasts a DHCP Discover. The DHCP server responds with:

  1. IP address — so the server can communicate
  2. Subnet mask, gateway, DNS — standard networking
  3. Next-server (option 66) — IP of the TFTP server
  4. Filename (option 67) — path to the bootloader on the TFTP server
# ISC DHCP configuration for PXE boot
subnet 10.0.1.0 netmask 255.255.255.0 {
    range 10.0.1.100 10.0.1.200;
    option routers 10.0.1.1;
    option domain-name-servers 10.0.1.1;

    # PXE boot options
    next-server 10.0.1.10;                    # TFTP server IP
    filename "pxelinux.0";                     # Bootloader file (BIOS)
    # For UEFI: filename "grubx64.efi";

    # Optional: per-host config by MAC address
    host server-01 {
        hardware ethernet aa:bb:cc:dd:ee:01;
        fixed-address 10.0.1.51;
        option host-name "server-01";
    }
    host server-02 {
        hardware ethernet aa:bb:cc:dd:ee:02;
        fixed-address 10.0.1.52;
        option host-name "server-02";
    }
}

Under the Hood: The DHCP exchange happens via broadcast (the server has no IP yet). The sequence is DORA: Discover (broadcast, "anyone have an IP for me?"), Offer (server proposes an IP), Request (client accepts), Acknowledge (server confirms). All four packets are UDP — client port 68, server port 67. The PXE ROM adds vendor-specific options to the Discover that identify it as a PXE client, which tells the DHCP server to include the boot filename.
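Those vendor options travel as TLV (type, length, value) byte sequences inside the DHCP packet. A small sketch of decoding them; the sample bytes are fabricated for illustration:

```python
# Parse the TLV-encoded options field of a DHCP packet.

def parse_dhcp_options(data: bytes) -> dict[int, bytes]:
    """Walk option bytes: 1-byte code, 1-byte length, then the value."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:        # option 255 = end marker
            break
        if code == 0:          # option 0 = padding
            i += 1
            continue
        length = data[i + 1]
        options[code] = data[i + 2 : i + 2 + length]
        i += 2 + length
    return options

# Option 53 (message type) = 1 (Discover),
# option 93 (client architecture) = 0x0007 (x86-64 UEFI),
# option 255 terminates the list.
sample = bytes([53, 1, 1, 93, 2, 0x00, 0x07, 255])
opts = parse_dhcp_options(sample)
arch = int.from_bytes(opts[93], "big")
print(arch)  # → 7, so the DHCP server should answer with grubx64.efi
```

This is the same structure the DHCP server walks when it decides whether to hand out pxelinux.0 or grubx64.efi.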

Gotcha: DHCP broadcasts don't cross routers. If your DHCP server is on a different subnet, you need a DHCP relay agent (ip helper-address on Cisco, dhcp-relay on Linux) on the router. The relay agent receives the broadcast, sets the giaddr (gateway address) to its own IP (so the DHCP server knows which subnet pool to allocate from), and forwards as unicast to the DHCP server.
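For smaller labs, dnsmasq can replace both ISC DHCP and tftpd-hpa in a single daemon. A minimal sketch reusing the addresses from the examples above (verify option names against your dnsmasq version):

```
# /etc/dnsmasq.conf: DHCP + TFTP in one process
dhcp-range=10.0.1.100,10.0.1.200,12h
dhcp-option=option:router,10.0.1.1

# Tag clients by PXE architecture (option 93) and serve the right bootloader
dhcp-match=set:efi-x86_64,option:client-arch,7
dhcp-boot=tag:efi-x86_64,grubx64.efi,,10.0.1.10
dhcp-boot=tag:!efi-x86_64,pxelinux.0,,10.0.1.10

# Built-in TFTP server
enable-tftp
tftp-root=/srv/tftp
```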


Stage 3: TFTP — Downloading the Bootloader

PXE downloads the bootloader using TFTP (Trivial File Transfer Protocol). TFTP is deliberately simple — no authentication, no directory listing, no encryption. It's UDP-based, so it works in an environment where no full TCP implementation exists yet.

# Set up a TFTP server
sudo apt install tftpd-hpa

# TFTP root directory
tree /srv/tftp/
# /srv/tftp/
# ├── pxelinux.0                  ← BIOS bootloader
# ├── ldlinux.c32                 ← syslinux library
# ├── grubx64.efi                 ← UEFI bootloader
# ├── pxelinux.cfg/               ← BIOS boot configuration
# │   ├── default                 ← default config for all clients
# │   └── 01-aa-bb-cc-dd-ee-01    ← config for a specific MAC
# └── ubuntu-installer/
#     ├── linux                   ← kernel
#     └── initrd.gz               ← initrd

pxelinux.cfg lookup order

When pxelinux loads, it looks for configuration files in this order:

1. 01-aa-bb-cc-dd-ee-ff     ← MAC address (with dashes, prefixed 01-)
2. C0A80133                  ← IP in hex (192.168.1.51 = C0A80133)
3. C0A8013                   ← progressively shorter prefixes
4. C0A801
5. C0A80
6. C0A8
7. C0A
8. C0
9. C
10. default                  ← fallback for all clients

This lets you assign different configurations per server (by MAC) or per subnet (by IP prefix), with a default fallback.
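The candidate list is mechanical enough to generate yourself. A short sketch that reproduces the order above:

```python
# Generate pxelinux.cfg filenames in the order pxelinux tries them.

def pxelinux_config_candidates(mac: str, ip: str) -> list[str]:
    names = ["01-" + mac.lower().replace(":", "-")]      # 1. MAC, prefixed 01-
    hex_ip = "".join(f"{int(octet):02X}" for octet in ip.split("."))
    for n in range(len(hex_ip), 0, -1):                  # 2-9. shrinking hex IP
        names.append(hex_ip[:n])
    names.append("default")                              # 10. fallback
    return names

print(pxelinux_config_candidates("AA:BB:CC:DD:EE:FF", "192.168.1.51"))
# → ['01-aa-bb-cc-dd-ee-ff', 'C0A80133', 'C0A8013', ..., 'C0', 'C', 'default']
```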

A pxelinux configuration

# /srv/tftp/pxelinux.cfg/default
DEFAULT install
PROMPT 0
TIMEOUT 30

LABEL install
    KERNEL ubuntu-installer/linux
    APPEND initrd=ubuntu-installer/initrd.gz auto=true url=http://10.0.1.10/preseed.cfg locale=en_US.UTF-8 keyboard-configuration/layoutcode=us netcfg/choose_interface=auto

This tells the server: download the kernel and initrd, then boot into the Ubuntu installer with an automated preseed configuration.

Trivia: TFTP (RFC 783, 1981; later revised as RFC 1350) was designed to be simple enough to fit in a boot ROM. It has no authentication, no directory listing, and no encryption — security wasn't a concern for local network bootstrapping in 1981. The protocol uses 512-byte blocks by default (negotiable up to 65464 bytes via the RFC 2348 blksize option), and each block must be acknowledged before the next is sent, which makes large transfers painfully slow. This is why modern PXE setups use TFTP only for the initial bootloader and switch to HTTP for everything else.
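The lockstep cost is easy to quantify: classic TFTP sends one block and then waits for its ACK, so every block pays a full round trip. A back-of-the-envelope sketch (the 50 MiB initrd and 0.5 ms RTT are illustrative numbers):

```python
# Estimate TFTP transfer time dominated by per-block round trips.
# Classic TFTP: one block per ACK, no windowing.

def tftp_latency_seconds(file_bytes: int, block_size: int, rtt_ms: float) -> float:
    blocks = -(-file_bytes // block_size)      # ceiling division
    return blocks * rtt_ms / 1000

initrd = 50 * 1024 * 1024                        # a 50 MiB initrd
print(tftp_latency_seconds(initrd, 512, 0.5))    # → 51.2 seconds of pure latency
print(tftp_latency_seconds(initrd, 65464, 0.5))  # → 0.4005 s with RFC 2348 blksize
```

Even before counting transmission time, the default block size costs nearly a minute per initrd, which is why the bootloader hands off to HTTP as soon as it can.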


Stage 4: iPXE — The Better PXE

TFTP is slow and limited. iPXE is an open-source replacement for the PXE ROM that adds:

  • HTTP/HTTPS downloads (much faster than TFTP)
  • Scripting (conditional logic, menus)
  • iSCSI, NFS, and other protocols
  • DNS resolution
  • VLAN tagging

The typical workflow: PXE loads a tiny iPXE binary via TFTP, then iPXE takes over and downloads everything else via HTTP:

PXE ROM → TFTP → iPXE binary → HTTP → kernel + initrd + installer

An iPXE script

#!ipxe
# /srv/tftp/boot.ipxe

# Dynamic menu
menu PXE Boot Menu
item ubuntu    Install Ubuntu 22.04
item rescue    Rescue Shell
item local     Boot from local disk
choose --default ubuntu --timeout 10000 target

goto ${target}

:ubuntu
set base-url http://10.0.1.10/ubuntu
kernel ${base-url}/vmlinuz initrd=initrd autoinstall ds=nocloud-net;s=http://10.0.1.10/autoinstall/
initrd ${base-url}/initrd
boot

:rescue
kernel http://10.0.1.10/rescue/vmlinuz
initrd http://10.0.1.10/rescue/initrd.gz
boot

:local
exit

This is dramatically more capable than raw PXE — HTTP is 10-100x faster than TFTP for large files, and scripting lets you build menus, conditionals, and error handling.

# ISC DHCP configured for iPXE chainloading
if exists user-class and option user-class = "iPXE" {
    filename "http://10.0.1.10/boot.ipxe";    # iPXE gets HTTP script
} else {
    filename "ipxe.efi";                        # PXE gets iPXE binary via TFTP
}

Stage 5: Automated OS Installation

The kernel and initrd are loaded. The installer starts. But instead of asking questions (language, timezone, disk layout, packages), it reads an answer file that provides everything automatically.

Three automation systems

| Distro                   | System      | Format                  |
| ------------------------ | ----------- | ----------------------- |
| Ubuntu (modern)          | autoinstall | YAML (cloud-init based) |
| Ubuntu (legacy) / Debian | preseed     | Debconf format          |
| RHEL / CentOS / Fedora   | kickstart   | Anaconda format         |

Ubuntu autoinstall example

# /srv/http/autoinstall/user-data
#cloud-config
autoinstall:
  version: 1
  locale: en_US.UTF-8
  keyboard:
    layout: us

  # Network: use DHCP on first interface
  network:
    version: 2
    ethernets:
      id0:
        match:
          name: en*
        dhcp4: true

  # Disk: entire first disk, LVM
  storage:
    layout:
      name: lvm
      sizing-policy: all

  # Users
  identity:
    hostname: server-01
    username: deploy
    password: "$6$rounds=4096$salt$hashedpassword"   # mkpasswd -m sha-512

  # SSH
  ssh:
    install-server: true
    authorized-keys:
      - "ssh-ed25519 AAAA... deploy@management"
    allow-pw: false

  # Packages
  packages:
    - python3
    - python3-apt
    - openssh-server
    - curl
    - vim
    - monitoring-agent

  # Post-install: signal the provisioning server
  late-commands:
    - curtin in-target -- systemctl enable monitoring-agent
    - curtin in-target -- mkdir -p /home/deploy/.ssh
    - "curl -X POST http://10.0.1.10:8080/api/provisioned -d '{\"hostname\": \"server-01\"}'"

The late-commands section runs after installation is complete but before the first reboot. This is where you can signal your provisioning system that the server is ready for configuration management.
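Twenty servers need twenty hostnames, so answer files are usually generated rather than hand-written. A hypothetical sketch that stamps out per-host NoCloud seed directories from a single template — the one-directory-per-host layout with user-data and meta-data files follows the NoCloud convention, everything else is illustrative:

```python
# Generate per-host autoinstall user-data files from one template.
from string import Template
from pathlib import Path

TEMPLATE = Template("""\
#cloud-config
autoinstall:
  version: 1
  identity:
    hostname: $hostname
    username: deploy
    password: "$pwhash"
""")

def render_user_data(hostname: str, pwhash: str) -> str:
    return TEMPLATE.substitute(hostname=hostname, pwhash=pwhash)

def write_all(outdir: str, count: int, pwhash: str) -> list[str]:
    """Write one NoCloud seed directory per host; return the hostnames."""
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    names = []
    for i in range(1, count + 1):
        name = f"server-{i:02d}"
        (out / name).mkdir(exist_ok=True)
        (out / name / "user-data").write_text(render_user_data(name, pwhash))
        (out / name / "meta-data").write_text("")  # NoCloud requires meta-data to exist
        names.append(name)
    return names
```

Point each server's `s=` URL at its own directory (e.g. `s=http://10.0.1.10/autoinstall/server-01/`), selected per MAC via the pxelinux or iPXE config.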

RHEL/CentOS kickstart example

# /srv/http/ks/server.ks
#version=RHEL9
text
url --url="http://10.0.1.10/rhel9/"
lang en_US.UTF-8
keyboard us
timezone America/New_York --utc
rootpw --lock
user --name=deploy --groups=wheel --iscrypted --password="$6$..."
sshkey --username=deploy "ssh-ed25519 AAAA... deploy@management"

# Disk
bootloader --location=mbr
clearpart --all --initlabel
autopart --type=lvm

# Network
network --bootproto=dhcp --device=link --activate --hostname=server-01

# Packages
%packages
@core
python3
openssh-server
curl
vim
%end

# Post-install
%post
systemctl enable sshd
curl -X POST http://10.0.1.10:8080/api/provisioned -d '{"hostname": "server-01"}'
%end

reboot

Gotcha: The answer file contains hashed passwords, SSH keys, and server configuration. Serve it over HTTPS or a private network — not the public internet. Anyone who can reach the HTTP server can read your provisioning config.
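One way to enforce that on the web server itself: restrict the answer-file paths to the provisioning subnet. A hypothetical nginx fragment, reusing the paths from the examples above:

```
# Serve answer files only to the provisioning VLAN (illustrative)
server {
    listen 80;
    root /srv/http;

    location /autoinstall/ {
        allow 10.0.1.0/24;   # provisioning subnet
        deny  all;           # everyone else gets 403
    }
}
```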


Stage 6: Ansible Takes Over — From Bare OS to Production

The OS is installed. The server can be reached via SSH. Now Ansible handles everything the installer didn't:

# playbook.yml — base server configuration
---
- name: Configure new server
  hosts: new_servers
  become: true

  vars:
    ntp_servers:
      - 10.0.1.1
      - 10.0.1.2
    monitoring_endpoint: "https://monitoring.internal/api/v1"

  tasks:
    # --- System Configuration ---
    - name: Set hostname from inventory
      hostname:
        name: "{{ inventory_hostname }}"

    - name: Configure NTP
      template:
        src: templates/chrony.conf.j2
        dest: /etc/chrony/chrony.conf
      notify: restart chrony

    - name: Configure sysctl for production
      sysctl:
        name: "{{ item.key }}"
        value: "{{ item.value }}"
        state: present
        reload: yes
      loop:
        - { key: "net.core.somaxconn", value: "65535" }
        - { key: "vm.swappiness", value: "10" }
        - { key: "net.ipv4.tcp_max_syn_backlog", value: "65535" }

    # --- Security Hardening ---
    - name: Disable password authentication
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: "^#?PasswordAuthentication"
        line: "PasswordAuthentication no"
      notify: restart sshd

    - name: Configure firewall
      ufw:
        rule: allow
        port: "{{ item }}"
        proto: tcp
      loop:
        - "22"    # SSH
        - "9100"  # Node exporter

    - name: Enable and start firewall
      ufw:
        state: enabled
        policy: deny

    # --- Monitoring ---
    - name: Install node_exporter
      apt:
        name: prometheus-node-exporter
        state: present

    - name: Register with monitoring
      uri:
        url: "{{ monitoring_endpoint }}/targets"
        method: POST
        body_format: json
        body:
          hostname: "{{ inventory_hostname }}"
          ip: "{{ ansible_default_ipv4.address }}"
          labels:
            role: "{{ server_role | default('generic') }}"
            datacenter: "{{ datacenter | default('dc1') }}"

  handlers:
    - name: restart chrony
      service:
        name: chrony
        state: restarted

    - name: restart sshd
      service:
        name: sshd
        state: restarted

The provisioning callback

The server was just installed — how does Ansible know to run? Three approaches:

1. Pull-based callback — the server calls Ansible Tower/AWX on first boot:

# In autoinstall late-commands or kickstart %post
late-commands:
  - "curl -k --data 'host_config_key=mysecret' https://ansible.internal/api/v2/job_templates/42/callback/"

2. Dynamic inventory — Ansible polls the provisioning API for new servers:

#!/usr/bin/env python3
# dynamic_inventory.py: emit hosts awaiting configuration as Ansible inventory
import json, requests

hosts = requests.get("http://provisioner:8080/api/new-servers").json()
inventory = {"new_servers": {"hosts": [h["ip"] for h in hosts]}}
print(json.dumps(inventory))

# Invoke with: ansible-playbook -i dynamic_inventory.py playbook.yml

3. Event-driven — the provisioning server triggers Ansible when a server reports in:

# Webhook handler on the provisioning server
# When POST /api/provisioned arrives:
ansible-playbook -i "10.0.1.51," -e "target_host=server-01" playbook.yml
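That handler can be sketched in stdlib Python. The endpoint path matches the /api/provisioned calls used earlier; the ansible-playbook trigger is left as a comment because the real invocation is site-specific:

```python
# Minimal provisioning callback receiver — a sketch, not production code.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

PROVISIONED: list[str] = []   # stand-in for a real database or job queue

class ProvisionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/provisioned":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        PROVISIONED.append(payload["hostname"])
        # A real handler would trigger configuration management here, e.g.:
        # subprocess.run(["ansible-playbook", "-i", host_ip + ",", "playbook.yml"])
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):   # keep the console quiet
        pass

def serve(port: int = 8080) -> HTTPServer:
    return HTTPServer(("0.0.0.0", port), ProvisionHandler)

# To run it for real: serve(8080).serve_forever()
```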

The Complete Chain — One Picture

[1] Power on → BIOS/UEFI → PXE ROM activates
    "I have no OS. Let me ask the network."

[2] DHCP exchange (DORA)
    Server gets: IP address + "boot from 10.0.1.10, file pxelinux.0"

[3] TFTP download
    PXE downloads bootloader (pxelinux.0 or iPXE binary)

[4] Bootloader reads config
    pxelinux.cfg/default or iPXE script
    Downloads kernel + initrd (via TFTP or HTTP)

[5] Kernel boots → installer runs
    Reads autoinstall/preseed/kickstart from HTTP server
    Partitions disk, installs OS, configures users/SSH
    Signals provisioning server: "I'm done"

[6] First reboot → OS boots from local disk
    SSH is up, deploy user has key access

[7] Ansible runs (callback, dynamic inventory, or event)
    Configures NTP, sysctl, firewall, monitoring, app-specific setup
    Registers with monitoring and load balancer

[8] Server is production-ready
    Total time: 15-30 minutes per server, zero human interaction

For 20 servers: rack them, cable them, set BIOS to network boot, power on. Walk away. Come back in 30 minutes to 20 production-ready servers.


Debugging PXE Boot Failures

PXE failures are frustrating because there's minimal feedback — the server has no OS to give you error messages. Here's the systematic approach:

| Symptom | Failed stage | What to check |
| ------- | ------------ | ------------- |
| "No boot device found" | Stage 1 — PXE not enabled | BIOS/UEFI boot order; enable network boot |
| "PXE-E51: No DHCP or proxyDHCP offers" | Stage 2 — DHCP | Is the DHCP server running? Is the NIC on the right VLAN? DHCP relay configured? |
| "PXE-E32: TFTP open timeout" | Stage 3 — TFTP | Is the TFTP server running? Firewall blocking UDP 69? Correct next-server IP? |
| "PXE-E3B: TFTP Error - File Not Found" | Stage 3 — wrong path | Does the file exist at the TFTP root? Correct filename in DHCP? |
| Kernel panic during boot | Stage 4 — wrong kernel/initrd | Matching kernel and initrd versions? Correct architecture (amd64 vs arm64)? |
| Installer hangs asking questions | Stage 5 — bad answer file | URL reachable? Syntax valid? Missing required fields? |

# Debug DHCP
sudo tcpdump -i eth0 port 67 or port 68 -n
# Look for: DHCP Discover from the server's MAC
# Look for: DHCP Offer with filename and next-server

# Debug TFTP
sudo tcpdump -i eth0 port 69 -n
# Look for: TFTP Read Request for the bootloader file

# Test TFTP manually
tftp 10.0.1.10 -c get pxelinux.0
# Does it download? If not, check TFTP server config and firewall

# Test answer file reachability
curl http://10.0.1.10/autoinstall/user-data
# Does it return the YAML? If not, check HTTP server config

Gotcha: The server's NIC must be on a VLAN where it can reach the DHCP server. If your provisioning VLAN is different from production, make sure the server is patched to the provisioning VLAN during install, then moved to production after. Many datacenter provisioning systems automate this VLAN change.


Flashcard Check

Q1: What does PXE stand for, and what can the PXE ROM actually do?

Preboot eXecution Environment ("pixie"). It can only: get an IP via DHCP, download a file via TFTP, and execute it. Everything else is the bootloader's job.

Q2: Which two DHCP options tell the client where to boot from?

Option 66 (next-server) = TFTP server IP. Option 67 (filename) = path to the bootloader file. Without both, PXE has nowhere to go.

Q3: Why use iPXE instead of raw PXE?

iPXE adds HTTP/HTTPS (10-100x faster than TFTP), scripting, menus, and more protocols. Raw PXE only speaks TFTP.

Q4: pxelinux.cfg lookup order — what's checked first?

The client's MAC address (format: 01-aa-bb-cc-dd-ee-ff). Then IP in hex with progressively shorter prefixes. Finally default. This enables per-server configuration.

Q5: "PXE-E51: No DHCP offers" — what broke?

DHCP server isn't responding. Check: is it running? Is the server on the right VLAN? Is a DHCP relay needed (server and DHCP on different subnets)?

Q6: How does Ansible know to configure a freshly installed server?

Three approaches: provisioning callback (server calls Ansible on first boot), dynamic inventory (Ansible polls for new servers), or event-driven (webhook triggers playbook).


Exercises

Exercise 1: Plan a PXE setup (design)

You're provisioning 50 servers across two racks. Design the component layout:

  • Where does the DHCP server go?
  • Where does the TFTP/HTTP server go?
  • How do you handle BIOS vs UEFI servers?
  • How does Ansible get triggered?
One approach: put DHCP + TFTP + HTTP on one provisioning server (`10.0.1.10`) on the management VLAN, with both racks connected to that VLAN. The DHCP config uses `match if` to serve different bootloader files for BIOS vs UEFI. The autoinstall `late-commands` calls back to Ansible Tower (`curl POST /callback/`). Ansible runs the base playbook, then role-specific playbooks based on a CMDB lookup of each server's intended role.

Exercise 2: Write the autoinstall (hands-on)

Write an Ubuntu autoinstall user-data file that:

  1. Partitions the first disk with LVM (100GB root, rest for data)
  2. Creates a user ops with an SSH key
  3. Installs Python 3, Docker, and monitoring-agent
  4. Disables password authentication
  5. Signals a webhook when done

Exercise 3: The decision (think)

For each scenario, should you use PXE or something else?

  1. Provisioning 100 bare-metal servers in a datacenter
  2. Setting up a single Raspberry Pi at home
  3. Re-imaging a developer laptop
  4. Provisioning VMs in AWS
  5. Deploying a Kubernetes cluster on bare metal
Answers:

  1. **PXE** — this is exactly what it's for. Zero-touch provisioning at scale.
  2. **USB/SD card** — PXE is overkill for one device. Write the image directly.
  3. **USB or network recovery** — PXE works but might be over-engineered for one laptop. MDM tools (Jamf, Intune) are more appropriate.
  4. **Cloud APIs** — AWS has no PXE. Use AMIs, user-data scripts, and Terraform. PXE is for bare metal.
  5. **PXE + Ansible + kubeadm/k3s** — PXE installs the base OS, Ansible configures the nodes, then bootstraps the cluster. This is the standard approach for on-prem Kubernetes.

Cheat Sheet

PXE Boot Chain

BIOS/UEFI → PXE ROM → DHCP (get IP + boot path) → TFTP (get bootloader)
→ Bootloader (get kernel) → Kernel (run installer) → Reboot → Ansible

DHCP for PXE (ISC DHCP)

next-server 10.0.1.10;          # TFTP server
filename "pxelinux.0";           # BIOS
# filename "grubx64.efi";        # UEFI

TFTP Directory Layout

/srv/tftp/
├── pxelinux.0 (or grubx64.efi)
├── pxelinux.cfg/
│   ├── default
│   └── 01-aa-bb-cc-dd-ee-ff     # Per-MAC config
└── ubuntu/
    ├── vmlinuz
    └── initrd

Debug Commands

| Task | Command |
| ---- | ------- |
| Watch DHCP | sudo tcpdump -i eth0 port 67 or port 68 -n |
| Watch TFTP | sudo tcpdump -i eth0 port 69 -n |
| Test TFTP | tftp server-ip -c get filename |
| Test HTTP | curl http://server-ip/path/to/file |
| Check DHCP leases | cat /var/lib/dhcp/dhcpd.leases |

Takeaways

  1. PXE is just the spark. It only knows DHCP + TFTP. iPXE adds HTTP, scripting, and menus. Use iPXE for anything beyond the simplest setup.

  2. The answer file is the blueprint. autoinstall (Ubuntu), kickstart (RHEL), preseed (Debian) — they all do the same thing: answer the installer's questions automatically.

  3. TFTP is slow and insecure. Use it only for the initial bootloader. Switch to HTTP for kernel, initrd, and installer packages.

  4. The callback completes the loop. The freshly installed server signals the provisioning system, which triggers Ansible. Without this, you need manual intervention to start configuration management.

  5. PXE is for bare metal. Cloud VMs use AMIs/images and user-data. Containers use Dockerfiles. PXE fills the gap where no hypervisor or image service exists.

  6. Debug with tcpdump. PXE failures have minimal feedback. Watching DHCP and TFTP traffic on the wire tells you exactly where the chain breaks.


Related Lessons

  • What Happens When You Press Power — the boot sequence PXE replaces
  • From Init Scripts to systemd — what takes over after the OS is installed
  • Deploy a Web App From Nothing — where PXE ends and application deployment begins