IPMI and ipmitool -- Street Ops

Workflows and commands for production use. Every command here has been used to diagnose or recover a real server.

1. Power Operations

The most common reason to reach for ipmitool: the server is unresponsive and needs a power cycle.

# Check power state
ipmitool -I lanplus -H $BMC -U admin -P $PASS power status
# Chassis Power is on

# Graceful shutdown (ACPI signal — like pressing the power button)
ipmitool -I lanplus -H $BMC -U admin -P $PASS power soft

# Hard power off (like holding the power button for 5 seconds)
ipmitool -I lanplus -H $BMC -U admin -P $PASS power off

# Power on
ipmitool -I lanplus -H $BMC -U admin -P $PASS power on

# Power cycle (off, wait, on — single command)
ipmitool -I lanplus -H $BMC -U admin -P $PASS power cycle

# Hard reset (CPU reset, no power cycle — faster but less thorough)
ipmitool -I lanplus -H $BMC -U admin -P $PASS power reset

The Right Order for an Unresponsive Server

1. Try power soft — wait 60 seconds
2. Check SOL console — is the OS shutting down?
3. If no response after 60s → power cycle
4. If server doesn't come back after 2 min → check SOL for POST errors
5. If POST fails → read SEL, check sensors

Never jump straight to power cycle unless you have confirmed the OS is truly hung. A dirty shutdown means fsck, journal recovery, and potential data loss.
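
The escalation order above can be wrapped in a small function. This is a minimal sketch, not a vetted recovery tool: recover_server and SOFT_WAIT are names introduced here for illustration, and the password is assumed to be exported as IPMI_PASSWORD so that -E can read it. The SOL check in step 2 is left out because it needs a human watching the console.

```shell
# recover_server BMC_HOST [USER]
# Hypothetical helper: graceful shutdown first, hard cycle only if the
# host is still powered on after the wait.
# Assumes the password is exported as IPMI_PASSWORD (read by -E).
SOFT_WAIT=${SOFT_WAIT:-60}   # seconds to wait after "power soft"

recover_server() {
    local bmc user state
    bmc=$1
    user=${2:-admin}
    ipmitool -I lanplus -H "$bmc" -U "$user" -E power soft || return 1
    sleep "$SOFT_WAIT"
    state=$(ipmitool -I lanplus -H "$bmc" -U "$user" -E power status) || return 1
    case $state in
        *"is off"*)
            # The OS shut down cleanly, so a plain power on completes the restart.
            ipmitool -I lanplus -H "$bmc" -U "$user" -E power on ;;
        *)
            # Still on after the wait: treat the OS as hung.
            ipmitool -I lanplus -H "$bmc" -U "$user" -E power cycle ;;
    esac
}
```

Usage: recover_server "$BMC" admin. The wait is a variable so it can be lengthened for hosts with slow shutdown scripts.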

2. Sensor Monitoring

Quick Health Check

# All sensors, one-line summary each
ipmitool -I lanplus -H $BMC -U admin -P $PASS sdr list
# Inlet Temp       | 23 degrees C      | ok
# Exhaust Temp     | 37 degrees C      | ok
# Fan1             | 8400 RPM          | ok
# Fan2             | 8520 RPM          | ok
# Fan3             | 0 RPM             | cr    ← CRITICAL
# PSU1 Status      | 0x01              | ok
# PSU2 Status      | 0x09              | cr    ← CRITICAL

# Filter by type
ipmitool -I lanplus -H $BMC -U admin -P $PASS sdr type Temperature
ipmitool -I lanplus -H $BMC -U admin -P $PASS sdr type Fan
ipmitool -I lanplus -H $BMC -U admin -P $PASS sdr type "Power Supply"

Deep Dive on a Single Sensor

# Get all details including thresholds
ipmitool -I lanplus -H $BMC -U admin -P $PASS sensor get "Inlet Temp"
# Sensor ID              : Inlet Temp (0x4)
# Entity ID              : 64.1
# Sensor Type (Analog)   : Temperature
# Sensor Reading          : 23 (+/- 0) degrees C
# Status                  : ok
# Lower Non-Recoverable  : na
# Lower Critical          : 3.000
# Lower Non-Critical      : 8.000
# Upper Non-Critical      : 42.000
# Upper Critical          : 47.000
# Upper Non-Recoverable  : na
# Positive Hysteresis     : 1.000
# Negative Hysteresis     : 1.000

# Interpretation:
# 23°C is well within normal range (8-42°C non-critical window)
# If reading hits 42+, BMC logs a warning event
# If reading hits 47+, BMC logs a critical event (and may throttle/shutdown)

# Capture sensor snapshot to file (do this regularly or via cron; ts comes from moreutils)
ipmitool -I lanplus -H $BMC -U admin -P $PASS sdr list full \
    | ts '[%Y-%m-%d %H:%M:%S]' >> /var/log/ipmi-sensors-$(hostname).log

# Compare two snapshots
diff <(grep Temp /var/log/ipmi-sensors-snap1.txt) \
     <(grep Temp /var/log/ipmi-sensors-snap2.txt)

# Quick temperature trend (run in a loop)
watch -n 5 "ipmitool sensor list | grep -i temp"
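
The threshold interpretation given for the Inlet Temp example earlier in this section can be written as a small classifier. A minimal sketch: classify_reading is a hypothetical helper, and it assumes a threshold counts as crossed when the reading is equal to or beyond it. The printed codes mirror the ok / nc / cr status column that ipmitool shows.

```shell
# classify_reading READING LOWER_CRIT LOWER_NC UPPER_NC UPPER_CRIT
# Hypothetical helper: prints ok, nc (non-critical) or cr (critical).
# awk handles the comparison because the thresholds are decimals.
classify_reading() {
    awk -v r="$1" -v lc="$2" -v lnc="$3" -v unc="$4" -v uc="$5" 'BEGIN {
        if (r <= lc || r >= uc)        print "cr"
        else if (r <= lnc || r >= unc) print "nc"
        else                           print "ok"
    }'
}
```

Usage with the thresholds from the sample output: classify_reading 44 3.000 8.000 42.000 47.000 prints nc.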

3. System Event Log (SEL) Operations

Triage Workflow

# How full is the SEL?
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel info
# Entries          : 312
# Free Space       : 2048 bytes     ← getting full

# Recent events (tail of the log)
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel elist last 20

# All events (pipe to file for analysis)
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel elist > /tmp/sel-$BMC.log

# Search for specific event types
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel elist | grep -i "temperature"
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel elist | grep -i "power supply"
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel elist | grep -i "memory"
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel elist | grep -i "critical"
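
When the SEL has already been dumped to a file, the four searches above can be run in one pass to see which category dominates. A minimal sketch: sel_summary is a hypothetical helper, and the keyword list is simply the one used in the grep commands above.

```shell
# sel_summary FILE
# Hypothetical helper: counts lines in a saved `sel elist` dump that match
# each keyword, case-insensitively.
sel_summary() {
    local kw
    for kw in temperature "power supply" memory critical; do
        printf '%-14s %s\n' "$kw" "$(grep -ci -- "$kw" "$1")"
    done
}
```

Usage: sel_summary /tmp/sel-$BMC.log. Note that the "critical" count also includes "Non-Critical" events, exactly as the plain grep does.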

SEL Time Management

The BMC has its own clock. If it drifts, SEL timestamps are wrong and you can't correlate with OS logs.

# Check BMC time
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel time get
# 03/14/2024 10:15:33

# Set BMC time (use UTC)
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel time set "03/14/2024 10:15:33"

# Sync BMC clock to host OS time (run from the host, in-band)
ipmitool sel time set "$(date -u '+%m/%d/%Y %H:%M:%S')"
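
Before resetting the clock it helps to know how far it has drifted. A minimal sketch: sel_time_drift is a hypothetical helper that assumes both timestamps are UTC and use the MM/DD/YYYY HH:MM:SS format shown above (some ipmitool builds may print a different format), and it relies on GNU date for the -d option.

```shell
# sel_time_drift BMC_TIME REFERENCE_TIME
# Hypothetical helper: prints BMC_TIME minus REFERENCE_TIME in seconds.
# Both arguments are "MM/DD/YYYY HH:MM:SS" and assumed to be UTC.
sel_time_drift() {
    local bmc_epoch ref_epoch
    bmc_epoch=$(date -u -d "$1" +%s) || return 1
    ref_epoch=$(date -u -d "$2" +%s) || return 1
    echo $((bmc_epoch - ref_epoch))
}
```

Usage from the host: sel_time_drift "$(ipmitool sel time get)" "$(date -u '+%m/%d/%Y %H:%M:%S')". A positive result means the BMC clock is ahead.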

Archive and Clear

# Always archive before clearing
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel elist \
    > /var/log/sel-archive/$(hostname)-$(date +%F-%H%M).log

# Clear
ipmitool -I lanplus -H $BMC -U admin -P $PASS sel clear

# Automate with cron (weekly archive + clear)
# 0 2 * * 0  /usr/local/bin/archive-sel.sh
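
The cron line above points at archive-sel.sh without showing it. One possible shape for its core is sketched below; the function name, the IPMI_USER default, and the use of IPMI_PASSWORD with -E are assumptions made here, not the original script. The point of the sketch is the ordering: the SEL is cleared only after a non-empty archive exists.

```shell
# archive_and_clear_sel BMC_HOST ARCHIVE_DIR
# Hypothetical core for the archive-sel.sh cron job.
# Clears the SEL only after a non-empty archive has been written.
# Assumes the password is exported as IPMI_PASSWORD (read by -E).
archive_and_clear_sel() {
    local bmc dir out
    bmc=$1
    dir=$2
    mkdir -p "$dir" || return 1
    out="$dir/$bmc-$(date +%F-%H%M).log"
    if ipmitool -I lanplus -H "$bmc" -U "${IPMI_USER:-admin}" -E sel elist > "$out" \
        && [ -s "$out" ]; then
        ipmitool -I lanplus -H "$bmc" -U "${IPMI_USER:-admin}" -E sel clear
    else
        echo "archive failed or SEL empty for $bmc; not clearing" >&2
        return 1
    fi
}
```

Usage: archive_and_clear_sel "$BMC" /var/log/sel-archive.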

4. Boot Device Override

# Force PXE boot on next reboot only
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis bootdev pxe

# Force disk boot on next reboot
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis bootdev disk

# Force BIOS setup on next reboot
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis bootdev bios

# Force CD/DVD (virtual media) boot
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis bootdev cdrom

# Persistent boot device change (survives multiple reboots)
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis bootdev pxe options=persistent

# UEFI boot mode (required for UEFI systems)
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis bootdev pxe options=efiboot

# Check current boot parameters
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis bootparam get 5

Important: Without options=persistent, boot device override is one-shot — it applies to the next boot only, then reverts to the BIOS boot order. This is usually what you want.
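
The persistent and efiboot options are shown separately above, but ipmitool takes a comma-separated list after options=, so they can be combined. A minimal sketch: bootdev_args is a hypothetical helper that only builds the argument string, which makes it easy to inspect before anything is sent to the BMC. Check the options your build accepts with chassis bootdev none options=help.

```shell
# bootdev_args DEVICE [OPTION...]
# Hypothetical helper: prints the arguments for `ipmitool ... chassis`,
# joining any options into a single comma-separated options= value.
bootdev_args() {
    local dev opts o
    dev=$1
    opts=""
    shift
    for o in "$@"; do
        opts=${opts:+$opts,}$o
    done
    echo "bootdev $dev${opts:+ options=$opts}"
}
```

Usage: ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis $(bootdev_args pxe persistent efiboot). The command substitution is left unquoted on purpose so the result splits into separate arguments.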

5. Serial-over-LAN (SOL) Console

Connecting

# Activate SOL session
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol activate

# You see the server's serial console output.
# If the server is at a login prompt, you can type credentials.
# If it's booting, you see POST/GRUB/kernel messages.

# Disconnect from SOL
# Press: ~.   (tilde, then period)
# If connected through SSH: ~~.
# If connected through nested SSH: ~~~.

SOL Troubleshooting

# "SOL session already active"
# Someone else has a session, or a stale session exists
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol deactivate
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol activate

# No output on SOL console
# Check: BIOS serial redirection enabled?
# Check: Linux kernel console parameter set?
# Check: getty running on ttyS0?
# From the host:
grep -o 'console=[^ ]*' /proc/cmdline
# Should show: console=ttyS0,115200n8
systemctl status serial-getty@ttyS0

# Garbled output
# Baud rate mismatch — both sides must be 115200
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol info 1
# Check: Volatile Bit Rate, Non-Volatile Bit Rate should be 115.2

# Set SOL baud rate
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol set volatile-bit-rate 115.2 1

# SOL payload disabled
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol set enabled true 1
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol payload enable 1 2
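
The kernel command line check from the "No output" case above is easy to script for use in a provisioning test. A minimal sketch: serial_console_params is a hypothetical helper that takes the command line as a string, so it can be tested without reading /proc.

```shell
# serial_console_params CMDLINE
# Hypothetical helper: prints every console=ttyS... parameter found in a
# kernel command line and fails (non-zero exit) if there is none.
serial_console_params() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep '^console=ttyS'
}
```

Usage on the host: serial_console_params "$(cat /proc/cmdline)" || echo "no serial console configured".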

Capturing SOL Output to a File

# Log SOL output while interacting (useful for capturing boot sequences)
ipmitool -I lanplus -H $BMC -U admin -P $PASS sol activate 2>&1 | tee /tmp/sol-$BMC.log

# Automated: use screen or script for unattended capture
script -c "ipmitool -I lanplus -H $BMC -U admin -P $PASS sol activate" /tmp/sol-capture.log

6. Chassis Status and Identification

# Chassis status (power, fault LEDs, intrusion)
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis status
# System Power         : on
# Power Overload       : false
# Power Interlock      : inactive
# Main Power Fault     : false
# Power Control Fault  : false
# Power Restore Policy : always-on
# Last Power Event     : command
# Chassis Intrusion    : inactive
# Front-Panel Lockout  : inactive
# Drive Fault          : false
# Cooling/Fan Fault    : true     ← fan problem!

# Blink the chassis identify LED (find a server in a rack)
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis identify 30
# LED blinks for 30 seconds

# Turn off identify LED
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis identify 0

# Indefinite blink (until you turn it off)
ipmitool -I lanplus -H $BMC -U admin -P $PASS chassis identify force

7. BMC Network Diagnostics

# View BMC network config
ipmitool -I lanplus -H $BMC -U admin -P $PASS lan print 1
# IP Address Source       : Static Address
# IP Address              : 10.0.10.5
# Subnet Mask             : 255.255.255.0
# Default Gateway IP      : 10.0.10.1
# MAC Address             : 00:11:22:33:44:55
# VLAN ID                 : 100
# Cipher Suite Priv Max   : aXXXXXXXXXXXXXX

# In-band BMC network check (from the host OS)
ipmitool lan print 1

# Test BMC connectivity from another machine
# IPMI uses UDP 623 (nmap UDP scans need root privileges)
nmap -sU -p 623 10.0.10.5
# 623/udp open  asf-rmcp

# If BMC is unreachable, check from the host (in-band)
ipmitool mc info          # BMC alive?
ipmitool lan print 1      # Check IP/VLAN config
ipmitool channel info 1   # Check channel settings
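
When comparing the BMC's configuration against an inventory, it helps to pull single values out of the lan print output. A minimal sketch: lan_field is a hypothetical helper that matches the label exactly, so asking for "IP Address" does not return the "IP Address Source" line.

```shell
# lan_field KEY
# Hypothetical helper: reads `ipmitool lan print` output on stdin and prints
# the value whose label matches KEY exactly.
lan_field() {
    awk -F':' -v key="$1" '{
        label = $1
        sub(/[ \t]+$/, "", label)
        if (label == key) {
            # Take everything after the first colon so MAC addresses survive.
            value = substr($0, index($0, ":") + 1)
            sub(/^[ \t]+/, "", value)
            print value
            exit
        }
    }'
}
```

Usage: ipmitool lan print 1 | lan_field "IP Address".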

8. Power Consumption and DCMI

# Current power draw
ipmitool -I lanplus -H $BMC -U admin -P $PASS dcmi power reading
# Instantaneous power reading:   287 Watts
# Minimum during sampling period: 245 Watts
# Maximum during sampling period: 342 Watts
# Average power reading over sample period: 278 Watts
# Sampling period:                 1000 ms

# Power capping (limit max draw for capacity planning)
ipmitool -I lanplus -H $BMC -U admin -P $PASS dcmi power set_limit action 1 limit 350
ipmitool -I lanplus -H $BMC -U admin -P $PASS dcmi power activate

# Check cap status
ipmitool -I lanplus -H $BMC -U admin -P $PASS dcmi power get_limit

# Remove power cap
ipmitool -I lanplus -H $BMC -U admin -P $PASS dcmi power deactivate
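
To compare the current draw against a planned cap before activating it, the reading can be reduced to a bare number. A minimal sketch: instantaneous_watts and over_cap are hypothetical helpers, and the parsing assumes the "Instantaneous power reading" line looks like the sample output above.

```shell
# instantaneous_watts
# Hypothetical helper: reads `ipmitool dcmi power reading` output on stdin
# and prints the instantaneous reading as a bare integer.
instantaneous_watts() {
    awk -F':' '/Instantaneous power reading/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# over_cap WATTS CAP
# Succeeds (exit 0) when WATTS is greater than CAP.
over_cap() {
    [ "$1" -gt "$2" ]
}
```

Usage: w=$(ipmitool dcmi power reading | instantaneous_watts); over_cap "$w" 350 && echo "already above the planned cap".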

9. User and Access Management

# List all BMC users
ipmitool -I lanplus -H $BMC -U admin -P $PASS user list 1
# ID  Name             Callin  Link Auth  IPMI Msg  Channel Priv Limit
# 1                    false   false      false     Unknown (0x00)
# 2   admin            false   false      true      ADMINISTRATOR
# 3   monitor          false   false      true      USER

# Create a monitoring user (read-only)
ipmitool -I lanplus -H $BMC -U admin -P $PASS user set name 3 monitor
ipmitool -I lanplus -H $BMC -U admin -P $PASS user set password 3
ipmitool -I lanplus -H $BMC -U admin -P $PASS channel setaccess 1 3 callin=on link=on ipmi=on privilege=2
ipmitool -I lanplus -H $BMC -U admin -P $PASS user enable 3

# Change password for existing user
ipmitool -I lanplus -H $BMC -U admin -P $PASS user set password 2
# (interactive prompt — password not visible in shell history)

# Non-interactive password set (careful — visible in process list)
ipmitool -I lanplus -H $BMC -U admin -P $PASS user set password 2 "N3wP@ssw0rd!"

# Check authentication capabilities
ipmitool -I lanplus -H $BMC -U admin -P $PASS channel getauthcap 1 4
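
A quick audit of who holds administrator rights can be derived from the user list output. A minimal sketch: list_admins is a hypothetical helper that relies on the privilege being the last column, as in the sample output above. Accounts with an empty name would shift the columns, so treat the printed name as a hint and confirm against the ID.

```shell
# list_admins
# Hypothetical helper: reads `ipmitool user list` output on stdin and prints
# "ID NAME" for every account whose channel privilege is ADMINISTRATOR.
list_admins() {
    awk '$NF == "ADMINISTRATOR" { print $1, $2 }'
}
```

Usage: ipmitool -I lanplus -H $BMC -U admin -P $PASS user list 1 | list_admins.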

10. Fleet Operations

Batch Sensor Check Across Multiple BMCs

#!/bin/bash
# check-fleet-temps.sh — alert on high inlet temperatures
THRESHOLD=40

while read -r bmc hostname; do
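    # Pick the field that holds the reading: its column differs between
    # `sdr list` (name | reading | status) and `sdr type` output.
    temp=$(ipmitool -I lanplus -H "$bmc" -U monitor -P "$PASS" \
        sdr type Temperature 2>/dev/null </dev/null \
        | awk -F'|' '/Inlet/ { for (i = 2; i <= NF; i++) if ($i ~ /degrees C/) { gsub(/[^0-9.]/, "", $i); print $i; exit } }')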
    if [ -n "$temp" ] && (( $(echo "$temp > $THRESHOLD" | bc -l) )); then
        echo "WARNING: $hostname ($bmc) inlet temp: ${temp}°C"
    fi
done < bmc_inventory.tsv

Batch Power Status

#!/bin/bash
# power-status.sh — check power state across fleet
while read -r bmc hostname; do
    status=$(ipmitool -I lanplus -H "$bmc" -U monitor -P "$PASS" \
        power status 2>/dev/null || echo "UNREACHABLE")
    echo "$hostname ($bmc): $status"
done < bmc_inventory.tsv

Batch SEL Collection

#!/bin/bash
# collect-sel.sh — archive SEL from all BMCs
ARCHIVE_DIR="/var/log/sel-archive/$(date +%F)"
mkdir -p "$ARCHIVE_DIR"

while read -r bmc hostname; do
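    # Report success only when ipmitool actually returned the log
    if ipmitool -I lanplus -H "$bmc" -U monitor -P "$PASS" \
        sel elist > "$ARCHIVE_DIR/${hostname}.log" 2>/dev/null </dev/null; then
        echo "Collected SEL from $hostname"
    else
        echo "FAILED to collect SEL from $hostname ($bmc)" >&2
    fi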
done < bmc_inventory.tsv
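
The batch loops above visit one BMC at a time, so a few unreachable BMCs can stall the whole run while each connection times out. One way around that is to query in parallel. A minimal sketch: fleet_power_status is a hypothetical helper, it starts one background job per BMC with no throttling (fine for a rack, too aggressive for a large fleet), and it assumes the password is exported as IPMI_PASSWORD for -E.

```shell
# fleet_power_status INVENTORY
# Hypothetical parallel variant of power-status.sh. One background job per
# BMC; results go to a temp dir and are printed in inventory order.
fleet_power_status() {
    local inv tmp bmc host
    inv=$1
    tmp=$(mktemp -d) || return 1
    while read -r bmc host; do
        [ -n "$bmc" ] || continue
        (
            ipmitool -I lanplus -H "$bmc" -U monitor -E power status \
                > "$tmp/$bmc" 2>/dev/null </dev/null || echo "UNREACHABLE" > "$tmp/$bmc"
        ) &
    done < "$inv"
    wait    # block until every background query has finished
    while read -r bmc host; do
        [ -n "$bmc" ] || continue
        echo "$host ($bmc): $(cat "$tmp/$bmc")"
    done < "$inv"
    rm -rf "$tmp"
}
```

Usage: fleet_power_status bmc_inventory.tsv.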

11. Diagnostic Workflows

Server Won't Boot — IPMI Triage

1. ipmitool power status           Is it on?
2. ipmitool chassis status         Any fault flags?
3. ipmitool sel elist | tail -10   Recent hardware events?
4. ipmitool sensor list | grep -i temp   Thermal issue?
5. ipmitool sensor list | grep -i fan    Dead fan?
6. ipmitool sol activate           What does the console show?
7. ipmitool chassis bootdev bios   Can we get into BIOS setup?
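
The non-interactive steps of this checklist (1 through 5) can be collected into a single report to attach to a ticket. A minimal sketch: triage_report is a hypothetical helper that takes the ipmitool command prefix as its arguments, and it runs sensor list once rather than twice since steps 4 and 5 only differ in the grep.

```shell
# triage_report COMMAND_PREFIX...
# Hypothetical helper: runs the non-interactive triage steps through the
# given command prefix and labels each section.
triage_report() {
    local step
    for step in "power status" "chassis status" "sel elist last 10" "sensor list"; do
        echo "== $step =="
        # $step is split into words on purpose; the steps contain no globs.
        "$@" $step 2>&1 || echo "(command failed)"
    done
}
```

Usage: triage_report ipmitool -I lanplus -H "$BMC" -U admin -P "$PASS" > /tmp/triage-$BMC.txt.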

Thermal Event Investigation

# 1. Check current temperatures
ipmitool sensor list | grep -i temp
# Inlet Temp       | 44.000     | degrees C  | nc    ← non-critical warning

# 2. Check fans
ipmitool sensor list | grep -i fan
# Fan1             | 9000.000   | RPM        | ok
# Fan2             | 8880.000   | RPM        | ok
# Fan3             | 0.000      | RPM        | cr    ← dead fan

# 3. Check SEL for thermal events
ipmitool sel elist | grep -i "temp\|fan\|therm"
# 45 | 03/14/2024 | Fan #0x30 | Lower Critical going low | Asserted
# 46 | 03/14/2024 | Temperature #0x04 | Upper Non-Critical going high | Asserted

# 4. Check power consumption (high power = more heat)
ipmitool dcmi power reading

# Conclusion: Fan3 is dead → inlet temp rising → schedule replacement

PSU Failure Investigation

# 1. Check PSU sensors
ipmitool sdr type "Power Supply"
# PSU1 Status      | 0x01              | ok
# PSU2 Status      | 0x09              | cr

# 2. Check SEL for PSU events
ipmitool sel elist | grep -i "power supply"
# 50 | 03/14/2024 | Power Supply #0x51 | Failure detected | Asserted
# 51 | 03/14/2024 | Power Supply #0x51 | Power Supply AC lost | Asserted

# 3. Check chassis status for redundancy
ipmitool chassis status
# Power Restore Policy : always-on
# Main Power Fault     : false    ← system still running on PSU1

# 4. Check power draw on remaining PSU
ipmitool dcmi power reading
# 287 Watts on single PSU — within capacity? Check PSU rating.

# Action: server is running but not redundant. Schedule PSU2 replacement.

12. Environment Variable Pattern

Stop typing credentials every time:

# Set once per session
export IPMI_HOST=10.0.10.5
export IPMI_USER=admin

# The -E flag makes ipmitool read the password from the IPMI_PASSWORD env var
export IPMI_PASSWORD=secret
ipmitool -I lanplus -H $IPMI_HOST -U $IPMI_USER -E power status

# Or create a shell alias
alias ipmi='ipmitool -I lanplus -H $IPMI_HOST -U $IPMI_USER -E'
ipmi power status
ipmi sensor list
ipmi sel elist
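
A shell function can do the same job as the alias while also catching a missing variable before ipmitool produces a less obvious error. A minimal sketch: this ipmi function is a hypothetical replacement for the alias, not something ipmitool ships. Use one or the other; if the alias is already defined in the session, run unalias ipmi first.

```shell
# ipmi ARGS...
# Hypothetical wrapper: same idea as the alias, but it refuses to run when
# the connection variables are missing.
ipmi() {
    if [ -z "${IPMI_HOST:-}" ] || [ -z "${IPMI_USER:-}" ] || [ -z "${IPMI_PASSWORD:-}" ]; then
        echo "ipmi: set IPMI_HOST, IPMI_USER and IPMI_PASSWORD first" >&2
        return 2
    fi
    ipmitool -I lanplus -H "$IPMI_HOST" -U "$IPMI_USER" -E "$@"
}
```

In bash, the password can be kept out of shell history by reading it with read -rs IPMI_PASSWORD and then running export IPMI_PASSWORD.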

Quick Reference

Task                  Command
Power status          ipmitool power status
Graceful shutdown     ipmitool power soft
Hard power cycle      ipmitool power cycle
All sensors           ipmitool sdr list
Temperature sensors   ipmitool sdr type Temperature
Fan sensors           ipmitool sdr type Fan
Sensor detail         ipmitool sensor get "Sensor Name"
Event log             ipmitool sel elist
SEL info              ipmitool sel info
Clear SEL             ipmitool sel clear
SOL console           ipmitool sol activate
SOL disconnect        ~.
Boot from PXE         ipmitool chassis bootdev pxe
Boot from disk        ipmitool chassis bootdev disk
Enter BIOS            ipmitool chassis bootdev bios
Chassis status        ipmitool chassis status
Identify LED          ipmitool chassis identify 30
BMC network           ipmitool lan print 1
BMC info              ipmitool mc info
BMC reset             ipmitool mc reset cold
Power draw            ipmitool dcmi power reading
List users            ipmitool user list 1
Set password          ipmitool user set password <id>

All remote commands need: -I lanplus -H <bmc-ip> -U <user> -P <pass> (or -E to read the password from the IPMI_PASSWORD environment variable).