Firmware & BIOS - Primer

Why This Matters

Firmware is the first code that runs when a server powers on. It initializes hardware, runs diagnostics, and hands off to the bootloader. Firmware bugs cause boot failures, hardware instability, and security vulnerabilities. In datacenter operations, you manage firmware updates across hundreds of servers, often remotely via BMC. Understanding the boot chain and firmware management is essential for bare-metal ops.

BIOS vs UEFI

Name origin: BIOS stands for Basic Input/Output System, a term dating back to the CP/M operating system (1975). The IBM PC BIOS (1981) established the standard that lasted until UEFI. UEFI stands for Unified Extensible Firmware Interface — originally developed by Intel as EFI in the mid-1990s for the Itanium platform, then opened as a multi-vendor standard. The "Unified" was added in 2005 when the UEFI Forum took over governance from Intel.

Legacy BIOS

  • 16-bit real mode, 1MB addressable memory
  • MBR partitioning (max 2TB disks, 4 primary partitions)
  • No built-in networking or shell
  • Boot order configured in CMOS setup (DEL/F2 at POST)
  • No secure boot capability
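
The 2TB ceiling follows from MBR's 32-bit sector fields. A quick check of the arithmetic, assuming the traditional 512-byte logical sector size (drives with 4096-byte logical sectors raise the limit accordingly):

```sh
# MBR stores sector addresses and counts in 32-bit fields, so one
# partition table can describe at most 2^32 sectors.
sector_size=512                      # assumption: 512-byte logical sectors
max_sectors=$(( 1 << 32 ))
max_bytes=$(( max_sectors * sector_size ))
echo "MBR limit: ${max_bytes} bytes ($(( max_bytes >> 40 )) TiB)"
# prints: MBR limit: 2199023255552 bytes (2 TiB)
```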

UEFI (Unified Extensible Firmware Interface)

  • 32/64-bit, full memory access
  • GPT partitioning (no practical size limit, 128+ partitions)
  • ESP (EFI System Partition) stores bootloaders
  • Built-in shell, networking, driver model
  • Secure Boot: verifies bootloader signatures
  • NVRAM stores boot variables (accessible from OS)

# Check if system booted in UEFI or BIOS mode
[ -d /sys/firmware/efi ] && echo "UEFI" || echo "BIOS"

# List UEFI boot entries
efibootmgr -v

# Change boot order
efibootmgr -o 0001,0002,0003

# Add a boot entry
efibootmgr -c -d /dev/sda -p 1 -L "Linux" -l '\EFI\ubuntu\grubx64.efi'
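
When auditing many servers it helps to see boot entries in the order the firmware will try them. The sketch below reads efibootmgr-style text on stdin, so it can be exercised without UEFI hardware. The function name boot_order_labels and the sample text are illustrative, not part of efibootmgr.

```sh
# Print "NNNN label" lines in firmware boot order.
# Input: efibootmgr output on stdin (with or without -v).
boot_order_labels() {
    awk '
        /^BootOrder:/ { n = split($2, order, ",") }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            id = substr($1, 5, 4)
            label = $0
            sub(/^Boot[0-9A-Fa-f]+\*?[ \t]+/, "", label)  # drop "Boot0001* "
            sub(/\t.*/, "", label)                        # drop -v device path
            labels[id] = label
        }
        END { for (i = 1; i <= n; i++) print order[i], labels[order[i]] }
    '
}

# Illustrative sample; on a live system use: efibootmgr | boot_order_labels
sample='BootCurrent: 0001
Timeout: 1 seconds
BootOrder: 0002,0001
Boot0001* ubuntu
Boot0002* UEFI: PXE IPv4'

printf '%s\n' "$sample" | boot_order_labels
```

Here the PXE entry is listed first because BootOrder puts 0002 ahead of 0001, regardless of the order in which the entries themselves are printed.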

POST (Power-On Self-Test)

Under the hood: POST is firmware code stored in the system's flash ROM that runs before any software you installed. On modern servers, POST typically takes 30-90 seconds, and most of that time is memory training (calibrating DIMM timing parameters). Large-memory systems take longer: a server with 2TB of RAM can spend over 2 minutes in memory training alone. This is why firmware setup offers "fast boot" options, which skip or abbreviate POST tests, trading thoroughness for speed.

POST runs before the OS loads:

  1. CPU initialization
  2. Memory test (DIMM detection, ECC check)
  3. PCIe device enumeration
  4. Storage controller init
  5. Network controller init
  6. Video init
  7. Handoff to bootloader

POST errors are reported via:

  • Beep codes: patterns indicate specific failures (varies by vendor)
  • LED codes: front panel diagnostic LEDs (Dell: amber patterns)
  • BMC event log: recorded in SEL (System Event Log)
  • Video output: error messages on screen (if video initialized)

# Read BMC System Event Log (requires ipmitool)
ipmitool sel list
ipmitool sel elist    # extended format

# Check last POST errors
ipmitool sel list | grep -i "post\|boot\|memory\|cpu"
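
A long SEL is easier to triage when grouped by sensor. This is a minimal sketch that tallies events per sensor field; it assumes the pipe-separated layout that ipmitool sel list commonly prints (ID, date, time, sensor, event, direction). The function name and sample lines are illustrative.

```sh
# Count SEL entries per sensor, most frequent first.
# Input: "ipmitool sel list"-style text on stdin.
sel_count_by_sensor() {
    awk -F'|' '
        NF >= 5 {
            sensor = $4
            gsub(/^[ \t]+|[ \t]+$/, "", sensor)   # trim padding
            count[sensor]++
        }
        END { for (s in count) print count[s], s }
    ' | sort -k1,1nr -k2
}

# Illustrative sample; on a live system: ipmitool sel list | sel_count_by_sensor
sel_sample='   1 | 01/15/2024 | 03:12:45 | Memory #0x02 | Correctable ECC | Asserted
   2 | 01/15/2024 | 03:12:47 | Memory #0x02 | Correctable ECC | Asserted
   3 | 01/16/2024 | 11:02:10 | Power Supply #0x61 | Failure detected | Asserted'

printf '%s\n' "$sel_sample" | sel_count_by_sensor
```

A sensor that dominates the tally, such as repeated correctable ECC events on one memory sensor, is usually the first thing to investigate.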

CMOS and NVRAM

CMOS stores BIOS settings (boot order, date/time, hardware config). Backed by a coin cell battery (CR2032).

Dead CMOS battery symptoms:

  • Clock resets on every boot
  • BIOS settings revert to defaults
  • "CMOS checksum error" on boot

UEFI stores settings in NVRAM (flash), so the settings themselves survive a dead battery. The real-time clock still depends on the battery.
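
A clock that resets on every boot can be caught from the OS with a simple plausibility check. The sketch below is a heuristic, not a vendor diagnostic: the function name and the floor date are assumptions, and the floor should be any date the server cannot legitimately predate, such as its deployment date.

```sh
# Return 0 (true) if the given epoch time is earlier than the floor.
clock_looks_reset() {
    now_epoch=$1
    floor_epoch=$2      # assumption: chosen by the operator
    [ "$now_epoch" -lt "$floor_epoch" ]
}

# 1704067200 is 2024-01-01 00:00:00 UTC, used here as an example floor.
if clock_looks_reset "$(date +%s)" 1704067200; then
    echo "clock predates floor: suspect a dead RTC battery"
else
    echo "clock looks plausible"
fi
```

Run it early in boot, before NTP has had a chance to correct the clock, or the symptom will be masked.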

BMC (Baseboard Management Controller)

The BMC is an independent processor on the server motherboard that provides out-of-band management:

  • IPMI: Standard protocol for BMC access
  • iDRAC (Dell), iLO (HPE), IMM/XCC (Lenovo): Vendor BMC implementations
  • Remote console (KVM over IP)
  • Remote power control
  • Hardware monitoring (temps, fans, voltages)
  • Firmware update capability
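  • Serial-over-LAN (SOL)

# Configure BMC network (LAN channel 1; the channel number varies by vendor)
ipmitool lan set 1 ipsrc static      # switch from DHCP to a static address first
ipmitool lan set 1 ipaddr 10.0.10.5
ipmitool lan set 1 netmask 255.255.255.0
ipmitool lan set 1 defgw ipaddr 10.0.10.1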

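# Power control
ipmitool power status
ipmitool power on
ipmitool power off       # hard power-off, no OS shutdown
ipmitool power soft      # request a graceful shutdown via ACPI
ipmitool power cycle     # power off, then back on
ipmitool power reset     # hard reset without removing power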

# Read sensors
ipmitool sensor list
ipmitool sdr list        # Sensor Data Record (SDR) repository

# Check BMC firmware version
ipmitool mc info
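
For monitoring, the useful part of the sensor list is the handful of sensors that are outside their thresholds. This sketch filters ipmitool sensor list-style text from stdin. It assumes the usual column layout (name, reading, unit, status, thresholds) and treats any threshold sensor whose status is not ok, na, or ns as a problem. The function name and sample rows are illustrative.

```sh
# Print "name: status" for threshold sensors that are not healthy.
# Discrete sensors are skipped because their status column is a raw
# hex state rather than a threshold code.
sensors_not_ok() {
    awk -F'|' '
        NF >= 4 {
            name = $1; unit = $3; status = $4
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", unit)
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            if (unit == "discrete") next
            if (status != "ok" && status != "na" && status != "ns")
                print name ": " status
        }
    '
}

# Illustrative sample; on a live system: ipmitool sensor list | sensors_not_ok
sensor_sample='CPU Temp         | 45.000     | degrees C  | ok    | 0.000 | 0.000 | 0.000 | 85.000 | 90.000 | 95.000
FAN1             | 300.000    | RPM        | cr    | 300.000 | 500.000 | 700.000 | na | na | na
Chassis Intru    | 0x0        | discrete   | 0x0000| na | na | na | na | na | na'

printf '%s\n' "$sensor_sample" | sensors_not_ok
```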

Firmware Updates

Why Update

War story: The Spectre and Meltdown CPU vulnerabilities (2018) required both OS patches and CPU microcode updates, which many vendors delivered through BIOS/firmware releases. Many organizations patched the OS but forgot the firmware, leaving systems partially exposed. Since then, firmware patching has been elevated from "nice to have" to a security compliance requirement. Tools like fwupd, backed by the Linux Vendor Firmware Service (LVFS), make this easier by delivering firmware updates through a workflow similar to OS package updates.

  • Security patches (BIOS rootkits, speculative execution mitigations)
  • Bug fixes (memory training, PCIe compatibility)
  • New hardware support
  • Performance improvements

Update Methods

| Method | When |
|---|---|
| BMC web interface | Single server, GUI access |
| Vendor tools (Dell DSU, HPE SUM) | Fleet updates, scriptable |
| UEFI shell | Pre-OS update |
| Linux tools (fwupd) | In-OS update |
| PXE/network boot | Automated fleet provisioning |

# fwupd — Linux firmware update daemon
fwupdmgr get-devices          # list updateable devices
fwupdmgr get-updates          # check for updates
fwupdmgr update               # apply updates

# Dell-specific (DSU)
dsu --inventory               # list installed firmware
dsu --apply-upgrades          # apply all available updates

Fleet Firmware Management

  • Stage firmware in a repository (Dell Repository Manager, HPE SPP)
  • Test on canary servers first
  • Schedule maintenance window (updates often require reboot)
  • Update BIOS, BMC, NIC, RAID controller, disk firmware
  • Verify post-update: check SEL, run diagnostics
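
The canary-first step can be sketched as a small wrapper. Everything here is illustrative: staged_rollout and fake_update are hypothetical names, and a real per-host command would typically wrap ssh plus a vendor tool, followed by the post-update checks listed above.

```sh
# Run an update command on canary hosts first; touch the rest of the
# fleet only if every canary succeeds.
#   $1: command to run per host (receives the hostname as its argument)
#   $2: space-separated canary hosts
#   $3: space-separated remaining hosts
staged_rollout() {
    update_cmd=$1
    canaries=$2
    fleet=$3
    for host in $canaries; do
        if ! "$update_cmd" "$host"; then
            echo "canary $host failed; aborting rollout" >&2
            return 1
        fi
    done
    failed=0
    for host in $fleet; do
        "$update_cmd" "$host" || { echo "update failed on $host" >&2; failed=1; }
    done
    return "$failed"
}

# Stand-in for a real per-host update; fails only for "bad-canary".
fake_update() {
    echo "updating $1"
    [ "$1" != "bad-canary" ]
}

staged_rollout fake_update "canary1 canary2" "web1 web2"
```

A canary failure stops the rollout before any fleet host is touched, while a failure on a fleet host is reported and the remaining hosts still proceed. Whether to continue or halt at that stage is a policy choice.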

Boot Order and Boot Process

Power On → POST → UEFI/BIOS → Boot Device Selection
  → Bootloader (GRUB2) → Kernel → init/systemd → OS

# GRUB2: set default boot entry (RHEL-family command names;
# Debian/Ubuntu use grub-set-default and update-grub)
grub2-set-default 0
grub2-mkconfig -o /boot/grub2/grub.cfg

# View current kernel command line
cat /proc/cmdline

# Check boot log
journalctl -b 0              # current boot
journalctl -b -1             # previous boot
journalctl --list-boots      # all recorded boots
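
Scripts often need a single value from the kernel command line, for example to confirm which serial console a server booted with. A minimal sketch follows, with a hypothetical helper name and an illustrative sample string. It splits on whitespace, so quoted values containing spaces are not handled.

```sh
# Print the value of key=value from a kernel command line string.
# Returns non-zero if the key is absent.
cmdline_param() {
    key=$1
    cmdline=$2
    for word in $cmdline; do
        case $word in
            "$key"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# On a live system, pass "$(cat /proc/cmdline)" as the second argument.
cmdline_sample='BOOT_IMAGE=/vmlinuz-5.14.0 root=/dev/mapper/rhel-root ro console=ttyS0,115200 quiet'
cmdline_param console "$cmdline_sample"
# prints: ttyS0,115200
```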

Secure Boot

Gotcha: Secure Boot can prevent Linux from booting if the distribution's bootloader is not signed with a key the firmware trusts. Most major distros (RHEL, Ubuntu, SUSE) ship a shim bootloader signed by Microsoft's UEFI CA, which most server firmware trusts by default. Custom-compiled kernels, third-party kernel modules (NVIDIA drivers, ZFS), and some smaller distros are not covered by those signatures, so they fail to boot or load unless you sign them with your own key and enroll it using mokutil. If a server suddenly stops booting after a kernel update, a Secure Boot key mismatch is a top suspect.

UEFI Secure Boot verifies that bootloaders and kernels are signed by trusted keys:

# Check Secure Boot status
mokutil --sb-state

# List enrolled keys
mokutil --list-enrolled

# Enroll a key (for custom kernels/modules). Prompts for a one-time
# password; enrollment completes in the MOK Manager screen on next reboot.
mokutil --import my-key.der
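
For fleet compliance checks it is convenient to turn the Secure Boot state into an exit status. This sketch reads mokutil --sb-state text from stdin and assumes the first line begins with "SecureBoot enabled" or "SecureBoot disabled". Anything else is reported as unknown. The function name is hypothetical.

```sh
# Map "mokutil --sb-state" output (on stdin) to an exit status:
#   0 = enabled, 1 = disabled, 2 = unrecognized output
sb_state_status() {
    first_line=
    IFS= read -r first_line || true    # tolerate empty or unterminated input
    case $first_line in
        "SecureBoot enabled"*)  return 0 ;;
        "SecureBoot disabled"*) return 1 ;;
        *)                      return 2 ;;
    esac
}

# On a live system: mokutil --sb-state | sb_state_status
if printf 'SecureBoot enabled\n' | sb_state_status; then
    echo "Secure Boot is on"
fi
```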

Quick Reference

| Task | Command |
|---|---|
| UEFI or BIOS? | [ -d /sys/firmware/efi ] && echo UEFI \|\| echo BIOS |
| List boot entries | efibootmgr -v |
| BMC power control | ipmitool power status/on/off/cycle |
| Read BMC sensors | ipmitool sensor list |
| Check SEL | ipmitool sel elist |
| Firmware updates | fwupdmgr get-updates && fwupdmgr update |
| Secure Boot status | mokutil --sb-state |
| Boot log | journalctl -b 0 |
