Redfish -- Street Ops
Production workflows for managing servers via Redfish. Every pattern here solves a real operational problem.
Setup: Shell Variables
Set these once per session or source from your inventory:
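BMC=10.0.10.5
CREDS="admin:password"
# The resource IDs below are Dell iDRAC paths; other vendors use different IDs
# (list them with GET /redfish/v1/Systems, /redfish/v1/Managers, /redfish/v1/Chassis)
SYSTEM="/redfish/v1/Systems/System.Embedded.1"
MANAGER="/redfish/v1/Managers/iDRAC.Embedded.1"
CHASSIS="/redfish/v1/Chassis/System.Embedded.1"
# Helper functions to save typing
# (-k skips TLS certificate verification; drop it once the BMC has a trusted certificate)
rf() {
    curl -sk -u "$CREDS" "$@"
}
rfpost() {
    curl -sk -u "$CREDS" -X POST -H 'Content-Type: application/json' "$@"
}
rfpatch() {
    curl -sk -u "$CREDS" -X PATCH -H 'Content-Type: application/json' "$@"
}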
1. Quick Health Check
The first thing you do when a server is acting up: get power state, health, and recent events.
# One-shot health summary
rf "https://$BMC$SYSTEM" | jq '{
    PowerState, Health: .Status.Health,
    Model, SerialNumber, BiosVersion,
    CPUs: .ProcessorSummary.Count,
    RAM_GiB: .MemorySummary.TotalSystemMemoryGiB
}'
# Recent SEL events (last 10 non-OK)
rf "https://$BMC$MANAGER/LogServices/Sel/Entries" \
| jq '[.Members[] | select(.Severity != "OK")] | sort_by(.Created) | reverse | .[:10] | .[] | {Created, Message, Severity}'
# Thermal: temperature sensors whose health is not OK
rf "https://$BMC$CHASSIS/Thermal" \
    | jq '[.Temperatures[] | select(.Status.Health != "OK") | {Name, ReadingCelsius, Health: .Status.Health}]'
# Power: current draw and PSU health
rf "https://$BMC$CHASSIS/Power" \
    | jq '{Watts: .PowerControl[0].PowerConsumedWatts, PSUs: [.PowerSupplies[] | {Name, Health: .Status.Health}]}'
2. Power Operations
# Check power state
rf "https://$BMC$SYSTEM" | jq '.PowerState'
# Graceful shutdown
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
-d '{"ResetType": "GracefulShutdown"}'
# Force restart (when graceful fails)
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
-d '{"ResetType": "ForceRestart"}'
# Power on
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
-d '{"ResetType": "On"}'
Power Cycle Sequence for Unresponsive Server
1. GET power state — confirm it's "On"
2. POST GracefulShutdown — wait 60 seconds
3. GET power state — did it turn off?
4. If still on → POST ForceOff — wait 10 seconds
5. POST On
6. Watch SEL for POST errors
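The sequence above can be sketched as a small set of shell functions. This is a minimal sketch, not a definitive implementation: it assumes `BMC`, `SYSTEM`, `rf`, and `rfpost` from the Setup section are already defined, and the names `get_power_state`, `reset_system`, `wait_for_state`, `power_cycle`, `POLL_INTERVAL`, `GRACEFUL_WAIT`, and `FORCE_WAIT` are hypothetical helpers introduced here for illustration.

```shell
# Sketch only: relies on BMC, SYSTEM, rf and rfpost from the Setup section.
# All function and variable names below are illustrative, not a standard API.

get_power_state() {
    rf "https://$BMC$SYSTEM" | jq -r '.PowerState'
}

reset_system() {
    # reset_system RESET_TYPE: prints the HTTP status code of the action
    rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
        -d "{\"ResetType\": \"$1\"}" -o /dev/null -w '%{http_code}\n'
}

wait_for_state() {
    # wait_for_state WANT TIMEOUT_SECONDS: poll until PowerState equals WANT
    local want=$1 remaining=$2 interval=${POLL_INTERVAL:-5}
    while [ "$(get_power_state)" != "$want" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep "$interval"
        remaining=$(( remaining - interval ))
    done
    return 0
}

power_cycle() {
    # Steps 1-5: graceful shutdown first, ForceOff only if the host stays on
    if [ "$(get_power_state)" = "On" ]; then
        reset_system GracefulShutdown
        if ! wait_for_state Off "${GRACEFUL_WAIT:-60}"; then
            reset_system ForceOff
            wait_for_state Off "${FORCE_WAIT:-10}" || return 1
        fi
    fi
    reset_system On
}
```

Polling for the state change, rather than sleeping for a fixed 60 seconds, lets the script move on as soon as the host is off. Step 6 (watching the SEL for POST errors) stays a manual check; the SEL query from the Quick Health Check section covers it.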
3. Fleet Inventory Script
Pull hardware inventory from multiple servers. Pipe into your CMDB or audit tool.
#!/bin/bash
# fleet_inventory.sh: pull inventory from a list of BMCs (one address per line)
CREDS="admin:password"
while read -r bmc; do
    # --max-time keeps one unreachable BMC from stalling the whole run
    inventory=$(curl -sk --max-time 15 -u "$CREDS" \
        "https://$bmc/redfish/v1/Systems/System.Embedded.1" \
        | jq -r '[.SerialNumber, .Model, .BiosVersion, .PowerState,
                  (.ProcessorSummary.Count | tostring),
                  (.MemorySummary.TotalSystemMemoryGiB | tostring)] | @tsv')
    printf '%s\t%s\n' "$bmc" "${inventory:-UNREACHABLE}"
done < bmc_inventory.txt
Output: TSV with BMC IP, serial, model, BIOS version, power state, CPU count, and RAM in GiB. A BMC that returns nothing is reported as UNREACHABLE.
4. Boot Override for Provisioning
# Set one-time PXE boot and restart
rfpatch "https://$BMC$SYSTEM" \
-d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}'
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
-d '{"ResetType": "ForceRestart"}'
Bulk PXE Boot
#!/bin/bash
# pxe_boot_fleet.sh: PXE boot a list of servers
CREDS="admin:password"
while read -r bmc; do
    echo "Setting PXE boot: $bmc"
    code=$(curl -sk -u "$CREDS" -X PATCH \
        "https://$bmc/redfish/v1/Systems/System.Embedded.1" \
        -H 'Content-Type: application/json' \
        -d '{"Boot": {"BootSourceOverrideTarget": "Pxe", "BootSourceOverrideEnabled": "Once"}}' \
        -o /dev/null -w "%{http_code}")
    # Restart only if the override was accepted; otherwise the server
    # would be force-restarted straight back into its current OS
    case "$code" in
        2*) ;;
        *)  echo "$bmc: boot override failed (HTTP $code), skipping restart" >&2; continue ;;
    esac
    curl -sk -u "$CREDS" -X POST \
        "https://$bmc/redfish/v1/Systems/System.Embedded.1/Actions/ComputerSystem.Reset" \
        -H 'Content-Type: application/json' \
        -d '{"ResetType": "ForceRestart"}' \
        -o /dev/null -w "%{http_code}\n"
done < pxe_targets.txt
5. Credential Rotation
#!/bin/bash
# rotate_bmc_creds.sh: rotate BMC password across fleet
OLD_CREDS="admin:OldPass123"
NEW_PASS="NewStr0ngP@ss456"
# Build the body with jq so quotes or backslashes in the password stay valid JSON
body=$(jq -n --arg pw "$NEW_PASS" '{Password: $pw}')
while read -r bmc; do
    # Find the admin account URI
    acct_uri=$(curl -sk -u "$OLD_CREDS" \
        "https://$bmc/redfish/v1/AccountService/Accounts" \
        | jq -r '.Members[]."@odata.id"' | while read -r uri; do
            username=$(curl -sk -u "$OLD_CREDS" "https://$bmc$uri" | jq -r '.UserName')
            if [ "$username" = "admin" ]; then echo "$uri"; break; fi
        done)
    # Without this check an empty URI would send the PATCH to the BMC root
    if [ -z "$acct_uri" ]; then
        echo "$bmc: admin account not found (unreachable or wrong credentials?)" >&2
        continue
    fi
    # Rotate
    http_code=$(curl -sk -u "$OLD_CREDS" -X PATCH \
        "https://$bmc$acct_uri" \
        -H 'Content-Type: application/json' \
        -d "$body" \
        -o /dev/null -w "%{http_code}")
    echo "$bmc: $http_code"
done < bmc_inventory.txt
6. Firmware Compliance Check
#!/bin/bash
# firmware_audit.sh: check firmware versions across fleet
CREDS="admin:password"
EXPECTED_BIOS="2.19.1"
EXPECTED_IDRAC="7.00.60.00"
while read -r bmc; do
    bios=$(curl -sk -u "$CREDS" \
        "https://$bmc/redfish/v1/Systems/System.Embedded.1" \
        | jq -r '.BiosVersion')
    idrac=$(curl -sk -u "$CREDS" \
        "https://$bmc/redfish/v1/Managers/iDRAC.Embedded.1" \
        | jq -r '.FirmwareVersion')
    # Collect every drift; report OK only when nothing drifted
    status=""
    [ "$bios" != "$EXPECTED_BIOS" ] && status="BIOS_DRIFT($bios)"
    [ "$idrac" != "$EXPECTED_IDRAC" ] && status="${status:+$status }IDRAC_DRIFT($idrac)"
    printf '%s\t%s\t%s\t%s\n' "$bmc" "$bios" "$idrac" "${status:-OK}"
done < bmc_inventory.txt
7. SEL Export and Clear
Gotcha: The SEL (System Event Log) has a fixed size -- typically 512 to 2048 entries depending on vendor. When full, new events are silently dropped. Export and clear regularly, or you will miss the critical event that explains why the server crashed.
Always export before clearing. The SEL is finite.
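# Export SEL to file
# (use the serial number when HostName is null or empty; "host" avoids overwriting the shell's own HOSTNAME)
host=$(rf "https://$BMC$SYSTEM" | jq -r '(.HostName | select(. != null and . != "")) // .SerialNumber')
rf "https://$BMC$MANAGER/LogServices/Sel/Entries" \
    | jq '.Members[]' > "sel-${host}-$(date +%F).json"
# Clear SEL (some BMCs reject an action POST with no body, so send an empty JSON object)
rfpost "https://$BMC$MANAGER/LogServices/Sel/Actions/LogService.ClearLog" -d '{}'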
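Because a cleared SEL cannot be recovered, it can help to make the clear conditional on a verified export. The following is a minimal sketch under stated assumptions: it relies on `BMC`, `MANAGER`, `rf`, and `rfpost` from the Setup section, and the function names `fetch_sel`, `clear_sel`, and `sel_export_and_clear` are hypothetical. Unlike the commands above, it writes the `Members` array as a single JSON document.

```shell
# Sketch only: relies on BMC, MANAGER, rf and rfpost from the Setup section.
# Function names are illustrative.

fetch_sel() {
    # jq -e exits non-zero when .Members is missing, so an error page
    # or an empty response is not mistaken for a valid export
    rf "https://$BMC$MANAGER/LogServices/Sel/Entries" | jq -e '.Members'
}

clear_sel() {
    rfpost "https://$BMC$MANAGER/LogServices/Sel/Actions/LogService.ClearLog" -d '{}'
}

sel_export_and_clear() {
    # sel_export_and_clear OUTFILE: clear the SEL only after the export succeeded
    local out=$1
    if ! fetch_sel > "$out.tmp" || [ ! -s "$out.tmp" ]; then
        rm -f "$out.tmp"
        echo "SEL export failed; log NOT cleared" >&2
        return 1
    fi
    mv "$out.tmp" "$out"
    clear_sel
}

# Example (not run here):
#   sel_export_and_clear "sel-$(date +%F).json"
```

Writing to a temporary file first means a failed export never leaves a truncated file that looks like a good one.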
8. Virtual Media for Remote Install
# Mount ISO
rfpost "https://$BMC$MANAGER/VirtualMedia/CD/Actions/VirtualMedia.InsertMedia" \
-d '{"Image": "https://iso-repo.internal/images/rhel9.3-boot.iso"}'
# Verify it's mounted
rf "https://$BMC$MANAGER/VirtualMedia/CD" | jq '{Image, Inserted}'
# Set boot to virtual CD and restart
rfpatch "https://$BMC$SYSTEM" \
-d '{"Boot": {"BootSourceOverrideTarget": "Cd", "BootSourceOverrideEnabled": "Once"}}'
rfpost "https://$BMC$SYSTEM/Actions/ComputerSystem.Reset" \
-d '{"ResetType": "ForceRestart"}'
# After install: eject (some BMCs reject an action POST with no body, so send an empty JSON object)
rfpost "https://$BMC$MANAGER/VirtualMedia/CD/Actions/VirtualMedia.EjectMedia" -d '{}'
9. BMC Reset
Remember: A BMC reset does NOT affect the host OS. The BMC is a separate computer on the motherboard with its own CPU, memory, and network stack. Restarting it is safe while the server is running -- but you will lose out-of-band access for 1-3 minutes during the restart.
When the BMC itself is misbehaving (stale sensors, hung web UI, SOL not connecting):
# Graceful BMC reset (does not affect host OS)
rfpost "https://$BMC$MANAGER/Actions/Manager.Reset" \
-d '{"ResetType": "GracefulRestart"}'
# BMC will be unreachable for 1-3 minutes during restart
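Scripts that reset a BMC usually need to wait for it to come back before continuing. A minimal sketch of such a wait loop follows. The names `bmc_is_up` and `wait_for_bmc` and the default timeout values are assumptions for illustration. The probe uses the Redfish service root, which the specification requires to be readable without authentication.

```shell
# Sketch only: function names and default timings are illustrative.

bmc_is_up() {
    # Succeeds once the Redfish service root answers with a non-error status
    curl -skf --max-time 5 -o /dev/null "https://$1/redfish/v1/"
}

wait_for_bmc() {
    # wait_for_bmc BMC [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
    local bmc=$1 remaining=${2:-300} interval=${3:-10}
    while ! bmc_is_up "$bmc"; do
        [ "$remaining" -le 0 ] && return 1
        sleep "$interval"
        remaining=$(( remaining - interval ))
    done
    return 0
}

# Example (not run here):
#   sleep 30    # give the BMC time to actually go down first
#   wait_for_bmc "$BMC" 300 10 || echo "BMC did not come back" >&2
```

The initial sleep in the example matters: a BMC may keep answering for a few seconds after it accepts the reset, so polling immediately can report "up" before the restart has begun.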
10. Event Subscription Setup
Push hardware alerts to your monitoring stack instead of polling.
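# Create webhook subscription
# The BMC POSTs Redfish event JSON to Destination, so the receiver must parse that format.
# Alertmanager's alerts API expects its own payload; a translating webhook in front of it is usually needed.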
rfpost "https://$BMC/redfish/v1/EventService/Subscriptions" \
-d '{
"Destination": "https://alertmanager.internal:9093/api/v1/alerts",
"Protocol": "Redfish",
"EventTypes": ["Alert"],
"Context": "rack42-server05"
}'
# List subscriptions
rf "https://$BMC/redfish/v1/EventService/Subscriptions" \
| jq '.Members[]."@odata.id"'
# Delete a subscription
curl -sk -u "$CREDS" -X DELETE \
"https://$BMC/redfish/v1/EventService/Subscriptions/1"
11. Session Management
Debug clue: If your fleet scripts start getting 503 Service Unavailable from BMCs, you are likely hitting the session limit. Most BMCs allow only 4-6 concurrent sessions. Use session-based auth (below) with cleanup, or use -u basic auth and avoid sessions entirely for one-shot commands.
For scripts making many requests, sessions reduce auth overhead.
# Create session, capture token and session URI
RESP=$(curl -sk -X POST "https://$BMC/redfish/v1/SessionService/Sessions" \
    -H 'Content-Type: application/json' \
    -d '{"UserName": "admin", "Password": "password"}' \
    -D /dev/stdout -o /tmp/session_body.json)
TOKEN=$(echo "$RESP" | grep -i '^X-Auth-Token:' | awk '{print $2}' | tr -d '\r')
SESSION_URI=$(jq -r '."@odata.id"' /tmp/session_body.json)
# Use token for requests
curl -sk -H "X-Auth-Token: $TOKEN" "https://$BMC$SYSTEM" | jq '.PowerState'
# Clean up: always delete the session, or it counts against the limit until it times out
curl -sk -X DELETE -H "X-Auth-Token: $TOKEN" "https://$BMC$SESSION_URI"
rm -f /tmp/session_body.json
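Leaked sessions are the usual cause of the session-limit problem, so it can help to wrap login and logout in functions and register the logout with `trap`. The sketch below is one possible shape, not a definitive implementation. The names `header_value`, `redfish_login`, `redfish_logout`, `RF_TOKEN`, and `RF_SESSION_URL` are hypothetical. It reads the session URI from the `Location` response header, which the Redfish specification defines for session creation, instead of from a temporary body file.

```shell
# Sketch only: all function and variable names are illustrative.

header_value() {
    # Print the value of HTTP header $1 (case-insensitive) from a header dump on stdin
    awk -v want="$1" '
        BEGIN { want = tolower(want) ":" }
        tolower(substr($0, 1, length(want))) == want {
            val = substr($0, length(want) + 1)
            gsub(/^[ \t]+|[ \t\r]+$/, "", val)
            print val
            exit
        }'
}

redfish_login() {
    # redfish_login BMC USER PASS: sets RF_TOKEN and RF_SESSION_URL, fails if either is missing
    local bmc=$1 headers location
    headers=$(curl -sk -o /dev/null -D - -X POST \
        "https://$bmc/redfish/v1/SessionService/Sessions" \
        -H 'Content-Type: application/json' \
        -d "$(jq -n --arg u "$2" --arg p "$3" '{UserName: $u, Password: $p}')")
    RF_TOKEN=$(printf '%s\n' "$headers" | header_value X-Auth-Token)
    location=$(printf '%s\n' "$headers" | header_value Location)
    # Location may be a path or an absolute URL depending on the BMC
    case "$location" in
        /*) RF_SESSION_URL="https://$bmc$location" ;;
        *)  RF_SESSION_URL=$location ;;
    esac
    [ -n "$RF_TOKEN" ] && [ -n "$RF_SESSION_URL" ]
}

redfish_logout() {
    # Delete the session so it stops counting against the BMC's session limit
    [ -n "$RF_TOKEN" ] || return 0
    curl -sk -o /dev/null -X DELETE -H "X-Auth-Token: $RF_TOKEN" "$RF_SESSION_URL"
    RF_TOKEN=""
}

# Example (not run here):
#   redfish_login "$BMC" admin password || exit 1
#   trap redfish_logout EXIT    # runs even if the script aborts midway
#   curl -sk -H "X-Auth-Token: $RF_TOKEN" "https://$BMC$SYSTEM" | jq '.PowerState'
```

With the `trap` in place the session is deleted on any exit path, which is the cleanup that the debug clue above depends on.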
12. Redfish vs ipmitool -- Quick Decision
| Task | Use Redfish | Use ipmitool |
|---|---|---|
| Power control | Yes | Fallback for pre-Redfish hardware |
| Sensor reading | Yes | Yes (still works, simpler for quick checks) |
| SEL read/clear | Yes | Yes |
| Firmware update | Yes (SimpleUpdate) | No |
| BIOS settings | Yes (Bios/Settings) | No |
| Boot override | Yes | Partial (chassis bootdev) |
| User management | Yes | Yes (but clunkier) |
| Virtual media | Yes | No |
| Serial console (SOL) | No standard Redfish equivalent | Yes — this is why ipmitool persists |
| Fleet scripting | Yes (REST + JSON = easy parsing) | Works but output parsing is fragile |
| Legacy hardware | No (not available) | Yes |