
Bare-Metal Provisioning - Street-Level Ops

Real-world patterns and gotchas from provisioning servers and switches at datacenter scale.

Quick Diagnosis Commands

# Check if DHCP is handing out addresses
tcpdump -i eth0 -n port 67 or port 68

# Verify TFTP is serving files
tcpdump -i eth0 -n port 69

# Check PXE server logs (dnsmasq)
journalctl -u dnsmasq -f
tail -f /var/log/syslog | grep -i 'dhcp\|tftp\|pxe'

# Verify kickstart file is accessible
curl -sf http://provision.internal/kickstart/aa-bb-cc-dd-ee-ff.ks | head

# Check BMC reachability
for bmc in 10.0.99.{50..70}; do
    ping -c1 -W1 "${bmc}" &>/dev/null && echo "${bmc} up" || echo "${bmc} DOWN"
done

# Bulk IPMI power status
while read -r name bmc_ip; do
    status=$(ipmitool -I lanplus -H "${bmc_ip}" -U admin -P secret \
        chassis power status 2>/dev/null || echo "UNREACHABLE")
    printf '%-20s %-15s %s\n' "${name}" "${bmc_ip}" "${status}"
done < bmc-inventory.txt

One-liner: If tcpdump -i eth0 -n port 67 shows DHCP DISCOVERs but no OFFERs, your DHCP server is not responding — check it is listening on the right interface and has leases available.

Gotcha: UEFI vs Legacy BIOS PXE

Modern servers default to UEFI. If your PXE setup only serves pxelinux.0 (BIOS), UEFI clients will fail silently — they'll just skip PXE and boot from disk.

Fix: Serve both bootloaders. Use DHCP option 93 (client system architecture) matching to send the right filename:

- Architecture 00:07 (EFI x64) → ipxe/snponly.efi or grubx64.efi
- Architecture 00:00 (BIOS) → pxelinux.0

Check what the server is requesting: tcpdump -i eth0 -vvv port 67 | grep -i arch
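One way to wire that up, sketched as a dnsmasq config fragment (assumption: dnsmasq serves both DHCP and TFTP; the path and bootloader filenames mirror this doc's examples):

```
# /etc/dnsmasq.d/pxe.conf -- sketch, assuming dnsmasq handles DHCP + TFTP
# Tag clients by DHCP option 93 (client architecture), then branch the filename
dhcp-match=set:efi-x64,option:client-arch,7
dhcp-boot=tag:efi-x64,ipxe/snponly.efi
dhcp-boot=tag:!efi-x64,pxelinux.0
enable-tftp
tftp-root=/srv/tftp
```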

Gotcha: TFTP Timeout on Large Files

TFTP transfers initrd images (often 50-100 MB) in 512-byte blocks by default. Some firmware negotiates a larger block size (the blksize option), but you can't count on it. Over lossy or congested networks the transfer is painfully slow and prone to timeout.

Fix: Use iPXE chainloading. Let PXE boot a small iPXE binary via TFTP, then iPXE downloads the kernel and initrd over HTTP. HTTP is 10-50x faster and handles retries gracefully.
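A minimal iPXE script for that handoff might look like this (provision.internal and the image paths are this doc's example names; ${net0/mac:hexhyp} is iPXE's MAC-with-hyphens expansion, which matches the kickstart filename convention used above):

```
#!ipxe
dhcp
kernel http://provision.internal/images/vmlinuz inst.ks=http://provision.internal/kickstart/${net0/mac:hexhyp}.ks console=tty0 console=ttyS1,115200n8
initrd http://provision.internal/images/initrd.img
boot
```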

Under the hood: TFTP uses 512-byte blocks with stop-and-wait acknowledgment. A 60MB initrd requires ~120,000 round trips. At 1ms RTT that is 2 minutes minimum; over a congested switch it can be 10+ minutes. iPXE over HTTP uses TCP with windowing, finishing the same transfer in seconds.
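The arithmetic is easy to sanity-check in the shell:

```shell
# Back-of-envelope check of the claim above, in pure bash arithmetic
initrd_bytes=$((60 * 1024 * 1024))          # 60 MB initrd
block_size=512                              # TFTP default block size
round_trips=$(( (initrd_bytes + block_size - 1) / block_size ))
rtt_ms=1                                    # optimistic 1 ms round-trip time
seconds=$(( round_trips * rtt_ms / 1000 ))
echo "round trips: ${round_trips}, minimum transfer time: ${seconds}s"
```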

Gotcha: Serial Console Not Configured

You PXE-boot a headless server. The installer starts but you can't see it — no video output, no serial console configured. The install hangs waiting for input on an invisible prompt.

Fix: Always pass console parameters in the kernel command line:

console=tty0 console=ttyS1,115200n8

And configure the kickstart to be fully unattended (no interactive prompts).

Default trap: The last console= parameter in the kernel command line becomes the primary console. If you list console=tty0 console=ttyS1,115200n8, serial gets primary output. Reverse the order if you want VGA as primary and serial as secondary.
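After boot you can confirm which console won: /proc/consoles lists the active consoles, and the entry whose flags include C is the preferred (primary) one.

```shell
# On the booted host: the console line whose flags include 'C' is primary
cat /proc/consoles
```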

Gotcha: IPMI Default Credentials

New servers ship with well-known or label-printed default BMC credentials (root/calvin on older Dell iDRACs; iDRAC9 and HPE iLO units use a factory-random password printed on the pull-out service tag, which ends up copied to a spreadsheet soon enough). If you don't rotate them during provisioning, anyone on the OOB network with the default or the spreadsheet can power-cycle your servers.

Fix: Include BMC credential rotation in your provisioning pipeline:

# Change IPMI password during post-install
ipmitool user set password 2 "${GENERATED_PASSWORD}"
# Store in secrets manager, not in scripts
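A slightly fuller post-install sketch, assuming ipmitool runs locally on the freshly installed host and user ID 2 is the built-in admin account (verify with `ipmitool user list 1`):

```shell
# Sketch: rotate the BMC admin credential during post-install.
# Assumptions: local ipmitool session; user ID 2 is the built-in admin.
NEW_PASS=$(openssl rand -base64 18)          # 24-character random password

if command -v ipmitool >/dev/null 2>&1; then
    ipmitool user set password 2 "${NEW_PASS}"
    ipmitool user priv 2 4 1                 # privilege 4 = ADMINISTRATOR, channel 1
    ipmitool user enable 2
fi

# Hand NEW_PASS to your secrets manager here -- never write it to disk or git.
echo "BMC credential rotated for user 2"
```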

Pattern: Provisioning Workflow with Ansible

# provision-server.yml
- name: Provision bare-metal server
  hosts: localhost
  vars:
    target_mac: "aa:bb:cc:dd:ee:ff"
    target_bmc: "10.0.99.50"
    hostname: "web-42"
  tasks:
    - name: Generate kickstart from template
      template:
        src: kickstart.ks.j2
        dest: "/var/www/kickstart/{{ target_mac }}.ks"

    - name: Set PXE boot for next reboot
      command: >
        ipmitool -I lanplus -H {{ target_bmc }}
        -U admin -P {{ bmc_password }}
        chassis bootdev pxe options=efiboot

    - name: Power cycle server
      command: >
        ipmitool -I lanplus -H {{ target_bmc }}
        -U admin -P {{ bmc_password }}
        chassis power cycle

    - name: Wait for server to come up on new OS
      wait_for:
        host: "{{ hostname }}"
        port: 22
        delay: 120
        timeout: 900

    - name: Run post-install validation
      command: ssh {{ hostname }} '/opt/provision/validate.sh'
      register: validation
      failed_when: validation.rc != 0

Pattern: Switch Provisioning with ONIE

#!/usr/bin/env bash
# Provision a new Cumulus Linux switch

set -euo pipefail

SWITCH_NAME=${1:?usage: $0 <switch-name> <mgmt-ip> <onie-ip>}
SWITCH_MGMT_IP=${2:?usage: $0 <switch-name> <mgmt-ip> <onie-ip>}
ONIE_IP=${3:?usage: $0 <switch-name> <mgmt-ip> <onie-ip>}  # Temporary ONIE DHCP address

log() { echo "[$(date -u '+%H:%M:%S')] $*"; }

# 1. Wait for ONIE discovery
log "Waiting for ${SWITCH_NAME} to enter ONIE install mode..."
until ping -c1 -W1 "${ONIE_IP}" &>/dev/null; do
    sleep 5
done

# 2. Push installer (ONIE will HTTP-discover it from the provisioning server)
log "ONIE is up. Installer should auto-download..."

# 3. Wait for Cumulus to boot
log "Waiting for Cumulus Linux to come up on ${SWITCH_MGMT_IP}..."
sleep 120  # Initial install takes ~90s
until ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=accept-new \
        cumulus@"${SWITCH_MGMT_IP}" 'true' 2>/dev/null; do
    sleep 10
done

# 4. Apply configuration
log "Applying switch configuration..."
scp "configs/${SWITCH_NAME}.conf" cumulus@"${SWITCH_MGMT_IP}":/tmp/interfaces
ssh cumulus@"${SWITCH_MGMT_IP}" 'sudo cp /tmp/interfaces /etc/network/interfaces && sudo ifreload -a'

# 5. Validate
log "Validating..."
ssh cumulus@"${SWITCH_MGMT_IP}" 'net show interface | head -20'

log "Switch ${SWITCH_NAME} provisioned successfully"

Pattern: Firmware Update Pipeline

# Batch firmware update for Dell servers
declare -A FW_CATALOG=(
    ["BIOS"]="BIOS_8JKHY_LN64_2.18.1.BIN"
    ["iDRAC"]="iDRAC-with-Lifecycle-Controller_Firmware_6PPNP_LN64_7.00.00.171.BIN"
    ["NIC"]="Network_Firmware_WN783_LN64_22.31.10.12.BIN"
)

for host in "${HOSTS[@]}"; do
    bmc_ip="${BMC_MAP[${host}]}"
    log INFO "Updating firmware on ${host} (BMC: ${bmc_ip})"

    for component in BIOS iDRAC NIC; do
        fw_file="${FW_CATALOG[${component}]}"
        log INFO "  ${component}: ${fw_file}"

        racadm -r "${bmc_ip}" -u admin -p "${BMC_PASS}" \
            update -f "/firmware/${fw_file}" -l /tmp/ \
            2>&1 | tee -a "/var/log/fw-update/${host}.log"
    done

    # Schedule reboot to apply BIOS update
    racadm -r "${bmc_ip}" -u admin -p "${BMC_PASS}" \
        jobqueue create BIOS.Setup.1-1 -r graceful
done
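Once the scheduled jobs and reboots complete, `racadm getversion` lists the running firmware per component. A guarded sketch, reusing the admin/BMC_PASS credentials assumed in the loop above:

```shell
# Confirm firmware versions after the scheduled reboots complete.
# Assumption: same admin user and BMC_PASS as the update loop above.
verify_firmware() {
    local bmc_ip=$1
    if command -v racadm >/dev/null 2>&1; then
        racadm -r "${bmc_ip}" -u admin -p "${BMC_PASS}" getversion
    else
        echo "racadm not installed; cannot verify ${bmc_ip}" >&2
    fi
}

verify_firmware "10.0.99.50"   # example BMC address from this doc
```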

Emergency: Server Won't PXE Boot

# 1. Check BMC can see the server
ipmitool -I lanplus -H <BMC_IP> -U admin -P secret chassis power status

# 2. Force PXE on next boot
ipmitool -I lanplus -H <BMC_IP> -U admin -P secret chassis bootdev pxe

# 3. Watch the console via SOL
ipmitool -I lanplus -H <BMC_IP> -U admin -P secret sol activate

# 4. Power cycle and watch
ipmitool -I lanplus -H <BMC_IP> -U admin -P secret chassis power cycle

# If PXE still doesn't work:
# - Check NIC is on the provisioning VLAN
# - Check DHCP server has a lease for the MAC
# - Check BIOS boot order includes PXE
# - Check UEFI Secure Boot isn't blocking the bootloader
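For the lease check in the list above, a direct look at the dnsmasq lease file helps (the path is the Debian/Ubuntu default and varies by distro; the MAC is this doc's example):

```shell
# Does the DHCP server know this MAC? (dnsmasq lease file; path varies by distro)
MAC="aa:bb:cc:dd:ee:ff"                      # target NIC MAC from this doc
LEASES="/var/lib/misc/dnsmasq.leases"        # Debian/Ubuntu default location

if [ -f "${LEASES}" ]; then
    grep -i "${MAC}" "${LEASES}" || echo "no active lease for ${MAC}"
else
    echo "no lease file at ${LEASES} -- check your DHCP server's config"
fi
```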

Emergency: Kickstart Install Fails Mid-Way

# Connect via SOL to see the error
ipmitool -I lanplus -H <BMC_IP> -U admin -P secret sol activate

# Common causes:
# 1. Bad disk — check SMART data after any reboot
#    smartctl -a /dev/sda
# 2. Network timeout — kickstart URL unreachable during install
#    Check provisioning server nginx/httpd logs
# 3. Partitioning conflict — existing RAID metadata
#    Add to kickstart: zerombr, clearpart --all
# 4. Memory errors — check SEL for ECC events
#    ipmitool sel list | grep -i 'memory\|ecc\|dimm'

War story: A team lost two days debugging kickstart failures on a batch of 20 new servers. The installs hung at partitioning. Root cause: the servers shipped with a legacy hardware RAID controller that presented virtual disks as /dev/sda, but the kickstart expected NVMe paths. Always verify disk device names via SOL before mass-provisioning a new hardware SKU.
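Following that moral, a quick check from the installer's shell (or a rescue boot) shows what the kernel actually calls the disks before you template kickstarts for a new SKU; a small sketch:

```shell
# List block devices as the kernel names them -- what kickstart's
# 'part ... --ondisk=' must match. -d: whole disks only, -n: no header.
if command -v lsblk >/dev/null 2>&1; then
    lsblk -dno NAME,SIZE,MODEL,TRAN
else
    ls /sys/block/        # fallback: raw kernel device names
fi
```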