Redfish -- Street Ops

Production workflows for managing servers via Redfish. Every pattern here solves a real operational problem.

Setup: Shell Variables

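Set these once per session or source them from your inventory. The resource IDs below (System.Embedded.1, iDRAC.Embedded.1) are Dell iDRAC names; on other vendors, list /redfish/v1/Systems and /redfish/v1/Managers to find the right IDs: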

BMC=10.0.10.5
CREDS="admin:password"
SYSTEM="/redfish/v1/Systems/System.Embedded.1"
MANAGER="/redfish/v1/Managers/iDRAC.Embedded.1"
CHASSIS="/redfish/v1/Chassis/System.Embedded.1"

# Helper function — saves typing
rf() {
  curl -sk -u "$CREDS" "$@"
}

rfpost() {
  curl -sk -u "$CREDS" -X POST -H 'Content-Type: application/json' "$@"
}

rfpatch() {
  curl -sk -u "$CREDS" -X PATCH -H 'Content-Type: application/json' "$@"
}
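
The helpers above print the response body but discard the HTTP status, so a 401 or 404 can look like an empty or odd JSON result. A minimal sketch of a status-aware variant follows; `rfcheck` is a hypothetical name, not part of any tool, and it assumes `CREDS` is set as above.

```shell
# rfcheck: like rf, but returns non-zero unless the HTTP status is 2xx/3xx.
rfcheck() {
  local out code
  # Append the status code on its own line after the body
  out=$(curl -sk -u "$CREDS" -w '\n%{http_code}' "$@")
  code=$(printf '%s\n' "$out" | tail -n 1)
  # Print everything except the status line
  printf '%s\n' "$out" | sed '$d'
  [ "$code" -ge 200 ] 2>/dev/null && [ "$code" -lt 400 ]
}

# Usage:
#   body=$(rfcheck "https://$BMC$SYSTEM") || echo "request failed" >&2
```

Capture the body into a variable before piping it to jq; in a plain pipeline the exit status you see is jq's, not the request's.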

1. Quick Health Check

The first thing you do when a server is acting up: get power state, health, and recent events.

# One-shot health summary
rf "https://$BMC$SYSTEM" | jq '{
  PowerState, Health: .Status.Health,
  Model, SerialNumber, BiosVersion,
  CPUs: .ProcessorSummary.Count,
  RAM_GiB: .MemorySummary.TotalSystemMemoryGiB
}'

# Recent SEL events (last 10 non-OK)
rf "https://$BMC$MANAGER/LogServices/Sel/Entries" \
  | jq '[.Members[] | select(.Severity != "OK")] | sort_by(.Created) | reverse | .[:10] | .[] | {Created, Message, Severity}'

# Thermal + power
rf "https://$BMC$CHASSIS/Thermal" \
  | jq '[.Temperatures[] | select(.Status.Health != "OK") | {Name, ReadingCelsius, Health: .Status.Health}]'

rf "https://$BMC$CHASSIS/Power" \
  | jq '{Watts: .PowerControl[0].PowerConsumedWatts, PSUs: [.PowerSupplies[] | {Name, Health: .Status.Health}]}'

2. Power Operations

# Check power state
rf "https://$BMC$SYSTEM" | jq '.PowerState'

# Graceful shutdown
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
  -d '{"ResetType": "GracefulShutdown"}'

# Force restart (when graceful fails)
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
  -d '{"ResetType": "ForceRestart"}'

# Power on
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
  -d '{"ResetType": "On"}'

Power Cycle Sequence for Unresponsive Server

1. GET power state — confirm it's "On"
2. POST GracefulShutdown — wait 60 seconds
3. GET power state — did it turn off?
4. If still on → POST ForceOff — wait 10 seconds
5. POST On
6. Watch SEL for POST errors
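
Steps 1 to 5 of the sequence can be sketched as a script. This is one possible shape, not a definitive implementation: the function names are illustrative, the timeouts are the ones from the list, and it assumes `BMC`, `CREDS`, and `SYSTEM` are set as in the setup section.

```shell
#!/bin/bash
# power_cycle.sh: sketch of the graceful-then-forced power cycle sequence.

power_state() {
  curl -sk -u "$CREDS" "https://$BMC$SYSTEM" | jq -r '.PowerState'
}

reset_system() {   # $1 = ResetType (GracefulShutdown, ForceOff, On, ...)
  curl -sk -u "$CREDS" -X POST -H 'Content-Type: application/json' \
    -d "{\"ResetType\": \"$1\"}" -o /dev/null \
    "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset"
}

wait_for_state() {   # $1 = wanted state, $2 = timeout (s), $3 = poll interval (s)
  local want="$1" timeout="$2" interval="${3:-5}" waited=0
  while [ "$(power_state)" != "$want" ]; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}

power_cycle() {
  # Step 1: only cycle a server that is actually on
  if [ "$(power_state)" != "On" ]; then
    echo "$BMC: not powered on, nothing to cycle" >&2
    return 1
  fi
  # Steps 2-3: ask nicely and give the OS 60 seconds
  reset_system GracefulShutdown
  if ! wait_for_state Off 60; then
    # Step 4: escalate
    echo "$BMC: graceful shutdown timed out, forcing off" >&2
    reset_system ForceOff
    wait_for_state Off 10 || return 1
  fi
  # Step 5: power back on
  reset_system On
}

# Usage: power_cycle && echo "$BMC: power cycle issued"
```

Step 6 stays manual here: after the power-on, check the SEL with the query from the health check section.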

3. Fleet Inventory Script

Pull hardware inventory from multiple servers. Pipe into your CMDB or audit tool.

#!/bin/bash
# fleet_inventory.sh — pull inventory from a list of BMCs
CREDS="admin:password"

while read -r bmc; do
  inventory=$(curl -sk -u "$CREDS" \
    "https://$bmc/redfish/v1/Systems/System.Embedded.1" 2>/dev/null \
    | jq -r '[.SerialNumber, .Model, .BiosVersion, .PowerState,
              (.ProcessorSummary.Count | tostring),
              (.MemorySummary.TotalSystemMemoryGiB | tostring)] | @tsv')
  echo -e "$bmc\t$inventory"
done < bmc_inventory.txt

Output: TSV with BMC IP, serial, model, BIOS version, power state, CPU count, RAM.
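
One dead BMC can stall the loop above for minutes, and a failed request yields a row with only the IP. A hedged sketch of a per-host function that bounds the wait and marks failures explicitly; `inventory_row` and the `NO_DATA` marker are illustrative choices, and the timeout values are arbitrary.

```shell
# inventory_row BMC: print one TSV row, or a NO_DATA marker on any failure.
# Assumes CREDS is set as in the script above.
inventory_row() {
  local bmc json
  bmc="$1"
  json=$(curl -sk --connect-timeout 5 --max-time 30 -u "$CREDS" \
    "https://$bmc/redfish/v1/Systems/System.Embedded.1" 2>/dev/null)
  # Require a real system resource, not an empty reply or an error body
  if [ -n "$json" ] && printf '%s' "$json" | jq -e '.SerialNumber' >/dev/null 2>&1; then
    printf '%s\t%s\n' "$bmc" "$(printf '%s' "$json" \
      | jq -r '[.SerialNumber, .Model, .BiosVersion, .PowerState,
                (.ProcessorSummary.Count | tostring),
                (.MemorySummary.TotalSystemMemoryGiB | tostring)] | @tsv')"
  else
    printf '%s\tNO_DATA\n' "$bmc"
  fi
}

# Usage:
#   while read -r bmc; do inventory_row "$bmc"; done < bmc_inventory.txt
```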

4. Boot Override for Provisioning

# Set one-time PXE boot and restart
rfpatch "https://$BMC$SYSTEM" \
  -d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}'

rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
  -d '{"ResetType": "ForceRestart"}'

Bulk PXE Boot

#!/bin/bash
# pxe_boot_fleet.sh — PXE boot a list of servers
CREDS="admin:password"

while read -r bmc; do
  echo "Setting PXE boot: $bmc"
  # Set the one-time boot override and capture the HTTP status
  code=$(curl -sk -u "$CREDS" -X PATCH \
    "https://$bmc/redfish/v1/Systems/System.Embedded.1" \
    -H 'Content-Type: application/json' \
    -d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}' \
    -o /dev/null -w "%{http_code}")
  echo "$code"

  # Do not reboot a server whose boot override was not accepted
  case "$code" in
    2??) ;;
    *) echo "Skipping restart: $bmc (PATCH returned $code)" >&2; continue ;;
  esac

  curl -sk -u "$CREDS" -X POST \
    "https://$bmc/redfish/v1/Systems/System.Embedded.1/Actions/ComputerSystem.Reset" \
    -H 'Content-Type: application/json' \
    -d '{"ResetType": "ForceRestart"}' \
    -o /dev/null -w "%{http_code}\n"
done < pxe_targets.txt

5. Credential Rotation

#!/bin/bash
# rotate_bmc_creds.sh — rotate BMC password across fleet
OLD_CREDS="admin:OldPass123"
NEW_PASS="NewStr0ngP@ss456"

while read -r bmc; do
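  # Find the URI of the account whose UserName is "admin"
  acct_uri=$(curl -sk -u "$OLD_CREDS" \
    "https://$bmc/redfish/v1/AccountService/Accounts" \
    | jq -r '.Members[]."@odata.id"' | while read -r uri; do
      username=$(curl -sk -u "$OLD_CREDS" "https://$bmc$uri" | jq -r '.UserName')
      if [ "$username" = "admin" ]; then echo "$uri"; break; fi
    done)

  # Skip this BMC rather than PATCH the bare host URL when no account was found
  if [ -z "$acct_uri" ]; then
    echo "$bmc: admin account not found (unreachable or bad credentials?)" >&2
    continue
  fi

  # Rotate; jq builds the JSON so special characters in the password are escaped
  http_code=$(curl -sk -u "$OLD_CREDS" -X PATCH \
    "https://$bmc$acct_uri" \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --arg p "$NEW_PASS" '{Password: $p}')" \
    -o /dev/null -w "%{http_code}")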

  echo "$bmc: $http_code"
done < bmc_inventory.txt
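
A 2xx from the PATCH does not prove you can log in with the new password. A minimal sketch of a follow-up check, assuming that `/redfish/v1/AccountService` requires authentication on your BMCs (the service root itself does not); `verify_creds` is an illustrative name.

```shell
# verify_creds BMC USER:PASSWORD: succeeds if the BMC accepts the credentials.
verify_creds() {
  local code
  code=$(curl -sk --connect-timeout 5 -u "$2" -o /dev/null -w '%{http_code}' \
    "https://$1/redfish/v1/AccountService")
  [ "$code" = "200" ]
}

# Usage, right after the rotation PATCH:
#   verify_creds "$bmc" "admin:$NEW_PASS" || echo "$bmc: new password rejected" >&2
```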

6. Firmware Compliance Check

#!/bin/bash
# firmware_audit.sh — check firmware versions across fleet
CREDS="admin:password"
EXPECTED_BIOS="2.19.1"
EXPECTED_IDRAC="7.00.60.00"

while read -r bmc; do
  bios=$(curl -sk -u "$CREDS" \
    "https://$bmc/redfish/v1/Systems/System.Embedded.1" \
    | jq -r '.BiosVersion')

  idrac=$(curl -sk -u "$CREDS" \
    "https://$bmc/redfish/v1/Managers/iDRAC.Embedded.1" \
    | jq -r '.FirmwareVersion')

  # Collect every drift; report OK only when both versions match
  status=""
  [ "$bios" != "$EXPECTED_BIOS" ] && status="BIOS_DRIFT($bios)"
  [ "$idrac" != "$EXPECTED_IDRAC" ] && status="${status:+$status }IDRAC_DRIFT($idrac)"

  echo -e "$bmc\t$bios\t$idrac\t${status:-OK}"
done < bmc_inventory.txt
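
The audit above compares versions for exact equality, so a server that is ahead of the baseline is reported the same way as one that is behind. If that distinction matters, a version-aware comparison is one option. A minimal sketch, assuming a `sort` that supports `-V` (GNU coreutils does); `version_lt` and `classify` are illustrative names.

```shell
# version_lt A B: succeeds if version A sorts strictly before version B.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# classify ACTUAL EXPECTED: prints OK, OUTDATED, or NEWER.
classify() {
  if [ "$1" = "$2" ]; then
    echo OK
  elif version_lt "$1" "$2"; then
    echo OUTDATED
  else
    echo NEWER
  fi
}

# Usage: classify "$bios" "$EXPECTED_BIOS"
```

Plain string comparison would rank 2.9.1 after 2.19.1; version sort compares the numeric fields instead.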

7. SEL Export and Clear

Gotcha: The SEL (System Event Log) has a fixed size -- typically 512 to 2048 entries depending on vendor. When full, new events are silently dropped. Export and clear regularly, or you will miss the critical event that explains why the server crashed.

Always export before clearing: once the log is cleared, the entries are gone.

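# Export SEL to file, named by host name (or serial number if the host name is empty)
# HOST_LABEL avoids overwriting bash's built-in HOSTNAME variable
HOST_LABEL=$(rf "https://$BMC$SYSTEM" \
  | jq -r '(.HostName | select(. != null and . != "")) // .SerialNumber')
rf "https://$BMC$MANAGER/LogServices/Sel/Entries" \
  | jq '.Members[]' > "sel-${HOST_LABEL}-$(date +%F).json"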

# Clear SEL (send an empty JSON body; some BMCs reject a POST with no body)
rfpost "https://$BMC$MANAGER/LogServices/Sel/Actions/LogService.ClearLog" \
  -d '{}'
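
If the export request fails, the redirect still creates a file, just an empty or broken one, and clearing afterwards destroys the only copy. A small guard can sit between the two steps. This is a sketch; `export_ok` is an illustrative name.

```shell
# export_ok FILE: succeeds only if FILE holds at least one parseable JSON entry.
export_ok() {
  [ -s "$1" ] && jq -e -s 'length > 0' "$1" >/dev/null 2>&1
}

# Usage:
#   if export_ok "$file"; then
#     ...issue the ClearLog request...
#   else
#     echo "export is empty or corrupt; not clearing" >&2
#   fi
```

The `-s` flag slurps the stream of entries into one array, so the same check covers both an empty file and invalid JSON.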

8. Virtual Media for Remote Install

# Mount ISO
rfpost "https://$BMC$MANAGER/VirtualMedia/CD/Actions/VirtualMedia.InsertMedia" \
  -d '{"Image": "https://iso-repo.internal/images/rhel9.3-boot.iso"}'

# Verify it's mounted
rf "https://$BMC$MANAGER/VirtualMedia/CD" | jq '{Image, Inserted}'

# Set boot to virtual CD and restart
rfpatch "https://$BMC$SYSTEM" \
  -d '{"Boot": {"BootSourceOverrideTarget": "Cd", "BootSourceOverrideEnabled": "Once"}}'

rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
  -d '{"ResetType": "ForceRestart"}'

# After install: eject (empty JSON body; some BMCs reject a POST with no body)
rfpost "https://$BMC$MANAGER/VirtualMedia/CD/Actions/VirtualMedia.EjectMedia" \
  -d '{}'

9. BMC Reset

Remember: A BMC reset does NOT affect the host OS. The BMC is a separate computer on the motherboard with its own CPU, memory, and network stack. Restarting it is safe while the server is running -- but you will lose out-of-band access for 1-3 minutes during the restart.

When the BMC itself is misbehaving (stale sensors, hung web UI, SOL not connecting):

# Graceful BMC reset (does not affect host OS)
rfpost "https://$BMC$MANAGER/Actions/Manager.Reset" \
  -d '{"ResetType": "GracefulRestart"}'

# BMC will be unreachable for 1-3 minutes during restart
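
Scripts that continue after a BMC reset need to know when the BMC is back. One way is to poll the service root, which the Redfish specification serves without authentication. A minimal sketch; `wait_for_bmc`, the 10-second interval, and the 300-second default timeout are illustrative choices.

```shell
# wait_for_bmc BMC [TIMEOUT]: poll the service root until it answers 200.
wait_for_bmc() {
  local bmc timeout interval waited code
  bmc="$1"; timeout="${2:-300}"; interval=10; waited=0
  while :; do
    code=$(curl -sk --connect-timeout 5 -o /dev/null -w '%{http_code}' \
      "https://$bmc/redfish/v1/")
    [ "$code" = "200" ] && return 0
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Usage, after issuing the reset:
#   sleep 30   # the BMC may keep answering briefly before it goes down
#   wait_for_bmc "$BMC" || echo "$BMC: BMC did not come back" >&2
```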

10. Event Subscription Setup

Push hardware alerts to your monitoring stack instead of polling.

# Create webhook subscription
rfpost "https://$BMC/redfish/v1/EventService/Subscriptions" \
  -d '{
    "Destination": "https://alertmanager.internal:9093/api/v1/alerts",
    "Protocol": "Redfish",
    "EventTypes": ["Alert"],
    "Context": "rack42-server05"
  }'

# List subscriptions
rf "https://$BMC/redfish/v1/EventService/Subscriptions" \
  | jq '.Members[]."@odata.id"'

# Delete a subscription
curl -sk -u "$CREDS" -X DELETE \
  "https://$BMC/redfish/v1/EventService/Subscriptions/1"

11. Session Management

Debug clue: If your fleet scripts start getting 503 Service Unavailable from BMCs, you are likely hitting the session limit. Most BMCs allow only 4-6 concurrent sessions. Either use session-based auth (below) and delete each session when you are done, or stick to basic auth (-u) for one-shot commands so you never hold a session open.

For scripts making many requests, sessions reduce auth overhead.

# Create session, capture token and session URI
RESP=$(curl -sk -X POST "https://$BMC/redfish/v1/SessionService/Sessions" \
  -H 'Content-Type: application/json' \
  -d '{"UserName": "admin", "Password": "password"}' \
  -D /dev/stdout -o /tmp/session_body.json)

TOKEN=$(echo "$RESP" | grep -i X-Auth-Token | awk '{print $2}' | tr -d '\r')
SESSION_URI=$(jq -r '."@odata.id"' /tmp/session_body.json)

# Use token for requests
curl -sk -H "X-Auth-Token: $TOKEN" "https://$BMC$SYSTEM" | jq '.PowerState'

# Clean up
curl -sk -X DELETE -H "X-Auth-Token: $TOKEN" "https://$BMC$SESSION_URI"
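
The cleanup step is easy to skip when a script exits early, and every leaked session counts against the limit. A hedged sketch that wraps the same calls in two functions so a `trap` can guarantee the DELETE; the function names are illustrative, and it assumes `BMC` is set as in the setup section.

```shell
# Session helpers with guaranteed cleanup. Sets TOKEN and SESSION_URI as globals.

session_open() {   # $1 = user name, $2 = password
  local headers body
  headers=$(mktemp) || return 1
  # jq builds the JSON so quotes in the password cannot break the payload
  body=$(curl -sk -X POST "https://$BMC/redfish/v1/SessionService/Sessions" \
    -H 'Content-Type: application/json' \
    -d "$(jq -n --arg u "$1" --arg p "$2" '{UserName: $u, Password: $p}')" \
    -D "$headers")
  # Header names are case-insensitive (HTTP/2 sends them in lowercase)
  TOKEN=$(awk 'tolower($1) == "x-auth-token:" {print $2}' "$headers" | tr -d '\r')
  SESSION_URI=$(printf '%s' "$body" | jq -r '."@odata.id"')
  rm -f "$headers"
  [ -n "$TOKEN" ]
}

session_close() {
  [ -n "$TOKEN" ] || return 0
  curl -sk -X DELETE -H "X-Auth-Token: $TOKEN" -o /dev/null \
    "https://$BMC$SESSION_URI"
  TOKEN=""
}

# Usage:
#   session_open admin password || exit 1
#   trap session_close EXIT
#   curl -sk -H "X-Auth-Token: $TOKEN" "https://$BMC$SYSTEM" | jq '.PowerState'
```

Writing the headers to a `mktemp` file instead of a fixed path in /tmp also keeps parallel runs from overwriting each other.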

12. Redfish vs ipmitool — Quick Decision

| Task | Use Redfish | Use ipmitool |
|---|---|---|
| Power control | Yes | Fallback for pre-Redfish hardware |
| Sensor reading | Yes | Yes (still works, simpler for quick checks) |
| SEL read/clear | Yes | Yes |
| Firmware update | Yes (SimpleUpdate) | No |
| BIOS settings | Yes (Bios/Settings) | No |
| Boot override | Yes | Partial (chassis bootdev) |
| User management | Yes | Yes (but clunkier) |
| Virtual media | Yes | No |
| Serial console (SOL) | No standard Redfish equivalent | Yes (this is why ipmitool persists) |
| Fleet scripting | Yes (REST + JSON = easy parsing) | Works but output parsing is fragile |
| Legacy hardware | No (not available) | Yes |