Linux Boot Process — Street Ops¶
Real-world operational scenarios for boot problems. These are the situations you'll face when a server won't come back up after a reboot, a kernel update goes sideways, or someone fat-fingers an fstab entry.
Recovering from Failed Boot: Rescue and Single-User Mode¶
Scenario: Server won't boot after a change, you need to fix it¶
Method 1: GRUB menu rescue (most common)
- At the GRUB menu, highlight the default entry and press e
- Find the linux line
- For rescue mode, change the target:
  - Append systemd.unit=rescue.target to the linux line
  - Or replace quiet splash with single
- Press Ctrl+X to boot
# Rescue mode gives you a root shell with filesystems mounted
# You'll be prompted for root password (if set)
# Emergency mode (systemd.unit=emergency.target) is more minimal still:
# only the root filesystem is mounted, read-only, and no services start.
# It normally asks for the root password too; if the password is unknown,
# use Method 2 or Method 3 below
# Filesystems may be read-only — remount:
$ mount -o remount,rw /
Method 2: init=/bin/bash (bypass init entirely)
Append init=/bin/bash to the kernel command line in GRUB. This drops you to a bash shell as PID 1, before any services start:
# Root filesystem is read-only. Remount:
$ mount -o remount,rw /
# Make your fix (edit fstab, fix config, etc.)
$ vim /etc/fstab
# Sync and reboot
$ sync
$ reboot -f # Force reboot (normal reboot won't work without init)
Method 3: rd.break (break into initramfs)
Append rd.break to the kernel line. This stops the boot inside the initramfs, after the real root has been mounted at /sysroot but before control is handed over to it:
# Real root is mounted at /sysroot (read-only)
switch_root:/# mount -o remount,rw /sysroot
switch_root:/# chroot /sysroot
sh-5.1# passwd root # Reset root password
sh-5.1# touch /.autorelabel # If SELinux is enabled (RHEL)
sh-5.1# exit
switch_root:/# exit # Resume the boot; with /.autorelabel present, the system relabels and restarts itself
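This method is the standard way to reset the root password on RHEL/CentOS with SELinux enabled. The passwd run inside the chroot rewrites /etc/shadow without its SELinux label, so without the relabel triggered by /.autorelabel, logins fail after the reboot.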
GRUB Repair¶
Scenario: GRUB is broken or missing — system drops to grub> or grub rescue>¶
If you get the grub> prompt (full GRUB shell):
# List available partitions
grub> ls
(hd0) (hd0,gpt1) (hd0,gpt2) (hd0,gpt3)
# Find which partition has /boot
grub> ls (hd0,gpt2)/
boot/ etc/ home/ ...
# Set root and boot manually
grub> set root=(hd0,gpt2)
grub> linux /boot/vmlinuz-5.15.0-91-generic root=/dev/sda2
grub> initrd /boot/initrd.img-5.15.0-91-generic
grub> boot
If you get the grub rescue> prompt (minimal shell, modules not loaded):
# Find the partition with GRUB modules
grub rescue> ls (hd0,gpt2)/boot/grub/
grub rescue> set prefix=(hd0,gpt2)/boot/grub
grub rescue> set root=(hd0,gpt2)
grub rescue> insmod normal
grub rescue> normal
# This should bring up the GRUB menu
Full GRUB reinstall from live USB:
# Boot from live USB, mount the system
$ sudo mount /dev/sda2 /mnt
$ sudo mount /dev/sda1 /mnt/boot/efi # If UEFI
$ sudo mount --bind /dev /mnt/dev
$ sudo mount --bind /proc /mnt/proc
$ sudo mount --bind /sys /mnt/sys
$ sudo mount --bind /run /mnt/run # Important for UEFI
$ sudo chroot /mnt
# Reinstall GRUB
# For BIOS:
$ grub-install /dev/sda
$ update-grub
# For UEFI:
$ grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=ubuntu
$ update-grub
$ exit
$ sudo umount -R /mnt
$ sudo reboot
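Which grub-install variant applies depends on how the machine boots. A minimal sketch for checking that from the live environment, assuming the kernel exposes /sys/firmware/efi only when booted through UEFI. The function name firmware_mode and its optional argument are hypothetical helpers, not part of any GRUB tooling.

```shell
# Report whether the running system was booted through UEFI or legacy BIOS.
# The optional first argument (an alternate sysfs root) exists only so the
# function can be exercised against a scratch directory.
firmware_mode() {
    if [ -d "${1:-/sys}/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}
```

Run firmware_mode before entering the chroot. The answer describes how the live USB itself booted, so boot the USB in the same mode as the installed system.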
initramfs Regeneration¶
Scenario: Boot fails because initramfs is missing drivers or is corrupt¶
Symptoms: kernel panic with "Unable to mount root fs," "VFS: Cannot open root device," or "no working init found."
Fix from rescue mode or live USB:
# After chrooting into the system (see GRUB repair steps)
# Debian/Ubuntu:
$ update-initramfs -u -k $(uname -r)
# In a live-USB chroot, uname -r reports the live system's kernel, so name the installed version explicitly:
$ ls /lib/modules/
5.15.0-91-generic 5.15.0-92-generic
$ update-initramfs -u -k 5.15.0-92-generic
# RHEL/CentOS (dracut); in a live-USB chroot, replace $(uname -r) with the installed version:
$ dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
# Verbose mode to see what's included:
$ dracut -fv /boot/initramfs-5.15.0-92-generic.img 5.15.0-92-generic 2>&1 | tee /tmp/dracut.log
# Force include specific modules (e.g., if RAID driver is missing):
$ dracut -f --add-drivers "megaraid_sas" /boot/initramfs-$(uname -r).img
Scenario: initramfs was accidentally deleted¶
# If /boot/initrd.img-5.15.0-91-generic is gone:
# From rescue mode or live USB chroot:
$ update-initramfs -c -k 5.15.0-91-generic # Create new (not update)
# On RHEL:
$ dracut /boot/initramfs-5.15.0-91-generic.img 5.15.0-91-generic
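Before rebooting, it can help to confirm that every installed kernel actually has an image. The sketch below assumes the two file naming schemes used in this section; the function name kernels_without_initramfs and its arguments are hypothetical.

```shell
# Print each kernel version that has a module tree but no initramfs image.
# Checks both naming schemes used in this section:
#   Debian/Ubuntu: initrd.img-<version>
#   RHEL/CentOS:   initramfs-<version>.img
# Arguments (defaults in brackets): modules dir [/lib/modules], boot dir [/boot]
# The body runs in a subshell so its variables do not leak into the caller.
kernels_without_initramfs() (
    modules_dir="${1:-/lib/modules}"
    boot_dir="${2:-/boot}"
    for dir in "$modules_dir"/*/; do
        [ -d "$dir" ] || continue
        version=$(basename "$dir")
        if [ ! -e "$boot_dir/initrd.img-$version" ] &&
           [ ! -e "$boot_dir/initramfs-$version.img" ]; then
            echo "$version"
        fi
    done
)
```

Inside the chroot, run it with no arguments. Each version it prints is a candidate for update-initramfs -c -k or dracut as shown above.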
Boot Performance Analysis¶
Scenario: System takes 90 seconds to boot, need to find the bottleneck¶
# Overall boot time breakdown
$ systemd-analyze
Startup finished in 3.2s (firmware) + 1.5s (loader) + 4.8s (kernel) + 82.1s (userspace) = 91.6s
graphical.target reached after 82.1s in userspace.
# Clearly the problem is in userspace. Find the culprits:
$ systemd-analyze blame | head -15
65.234s NetworkManager-wait-online.service
8.123s snapd.service
3.456s plymouth-quit-wait.service
2.345s docker.service
1.234s dev-sda2.device
...
# The critical chain shows dependencies:
$ systemd-analyze critical-chain
multi-user.target @82.1s
└─NetworkManager-wait-online.service @16.8s +65.2s
└─NetworkManager.service @14.5s +2.3s
└─dbus.service @12.1s +0.4s
└─basic.target @12.0s
└─sockets.target @12.0s
# NetworkManager-wait-online is the bottleneck.
# If not needed (server with static IP):
$ sudo systemctl disable NetworkManager-wait-online.service
# Or reduce timeout:
$ sudo mkdir -p /etc/systemd/system/NetworkManager-wait-online.service.d/
$ cat <<EOF | sudo tee /etc/systemd/system/NetworkManager-wait-online.service.d/timeout.conf
[Service]
ExecStart=
ExecStart=/usr/bin/nm-online -s -q --timeout=10
EOF
# Generate SVG boot chart for detailed visualization
$ systemd-analyze plot > /tmp/boot-chart.svg
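On a fleet, eyeballing blame output does not scale. A minimal sketch of a filter that keeps only units above a threshold, assuming the blame time tokens look like "1min 5.234s", "8.123s", or "456ms" (hour tokens are ignored). The name slow_units is hypothetical.

```shell
# Read `systemd-analyze blame` output on stdin and print "<seconds><TAB><unit>"
# for every unit slower than the threshold (in seconds, default 5).
slow_units() {
    awk -v limit="${1:-5}" '
        NF >= 2 {
            secs = 0
            # Every field except the last is a time token; the last is the unit.
            for (i = 1; i < NF; i++) {
                if ($i ~ /ms$/)       secs += $i / 1000
                else if ($i ~ /min$/) secs += $i * 60
                else if ($i ~ /s$/)   secs += $i
            }
            if (secs > limit) printf "%.1f\t%s\n", secs, $NF
        }
    '
}
```

Typical use would be systemd-analyze blame | slow_units 10, which lists only the units that took longer than ten seconds.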
Kernel Panic Troubleshooting¶
Scenario: System shows kernel panic during boot¶
Common kernel panic messages and their causes:
"VFS: Unable to mount root fs on unknown-block(0,0)"
- Root device not found. Wrong root= parameter, missing storage driver in initramfs, or hardware failure.
# Fix: verify root device at GRUB shell
grub> ls (hd0,gpt2)/
# If this works, the partition exists
# Check root= parameter matches
grub> cat (hd0,gpt2)/etc/fstab
# Find the UUID of the root partition
# Update the linux line with correct root=UUID=...
"Kernel panic - not syncing: No working init found"
- The kernel can't find /sbin/init, /etc/init, /bin/init, or /bin/sh
- Usually means initramfs is corrupt or root filesystem is damaged
# Boot with init=/bin/bash to verify filesystem
# If that works, rebuild initramfs
# If filesystem is damaged:
# Boot from live USB
$ sudo fsck -y /dev/sda2
"Kernel panic - not syncing: Attempted to kill init!"
- PID 1 (init/systemd) crashed. Check for corrupt systemd binary or broken shared libraries.
# Boot with init=/bin/bash
# Check systemd binary
$ file /sbin/init # Should be ELF executable or symlink to systemd
$ ldd /lib/systemd/systemd # Check for missing libraries
# Reinstall systemd
# Debian: apt-get install --reinstall systemd
# RHEL: yum reinstall systemd
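The ldd check above is easier to act on when reduced to just the broken entries. A small sketch, relying on ldd marking unresolved libraries with "not found"; the function name unresolved_libs is hypothetical.

```shell
# Read `ldd` output on stdin and print the name of every library the dynamic
# linker could not resolve (ldd marks these lines with "=> not found").
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}
```

For example, ldd /lib/systemd/systemd | unresolved_libs prints nothing on a healthy system. Any name it does print points at the package to reinstall.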
fsck on Boot Failure¶
Scenario: Boot drops to emergency shell because filesystem check failed¶
# Typical message:
# "Give root password for maintenance (or press Control-D to continue)"
# or: "You are in emergency mode"
# Check what failed:
$ journalctl -xb --no-pager | grep -i "fsck\|error\|fail"
# Run fsck manually (filesystem must be UNMOUNTED)
$ umount /dev/sda3 # If it's not root
$ fsck -y /dev/sda3 # Auto-fix errors
# For root filesystem, boot from live USB:
$ sudo fsck -y /dev/sda2
# If XFS:
$ sudo xfs_repair /dev/sda2
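# If xfs_repair refuses to run because the log is dirty and mounting the filesystem to replay it also fails:
$ sudo xfs_repair -L /dev/sda2 # Zero the log (recent metadata updates are lost; data loss possible!)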
# After fixing, reboot
$ reboot
Scenario: fstab entry has wrong UUID and system won't boot¶
# The system drops to emergency mode because a mount failed
# Check what failed
$ systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
mnt-data.mount loaded failed failed Mount /mnt/data
# Check fstab
$ cat /etc/fstab
# There's a UUID that doesn't match any existing device
# Find correct UUIDs
$ blkid
/dev/sda1: UUID="abc123" TYPE="ext4"
/dev/sda2: UUID="def456" TYPE="ext4"
/dev/sdb1: UUID="789012" TYPE="xfs" # This is the correct UUID
# Fix fstab
$ vim /etc/fstab
# Update the UUID
# Test before rebooting!
$ mount -a
# If no errors, it's safe to reboot
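Comparing fstab against blkid by eye is error-prone with long UUIDs. The sketch below automates the comparison, assuming udev has populated /dev/disk/by-uuid and that fstab uses unquoted UUID=... specs. The function name stale_fstab_uuids and its arguments are hypothetical.

```shell
# Print every UUID referenced in an fstab file that has no matching entry in
# the by-uuid directory, i.e. no block device currently carries that UUID.
# Arguments (defaults in brackets): fstab [/etc/fstab], dir [/dev/disk/by-uuid]
# The body runs in a subshell so its variables do not leak into the caller.
stale_fstab_uuids() (
    fstab="${1:-/etc/fstab}"
    by_uuid="${2:-/dev/disk/by-uuid}"
    # Commented-out lines never match because their first field starts with "#".
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        [ -e "$by_uuid/$uuid" ] || echo "$uuid"
    done
)
```

An empty result from stale_fstab_uuids means every UUID in fstab resolves to a device; it does not replace the mount -a test, which also catches wrong filesystem types and options.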
Recovering from a Bad Kernel Update¶
Scenario: New kernel won't boot, need to roll back¶
At the GRUB menu:
- Select "Advanced options for Ubuntu" (or similar)
- Choose the previous kernel version
- System boots with the old kernel
To make the rollback permanent:
# Once booted on the old kernel:
$ uname -r
5.15.0-91-generic # Old (working) kernel
# Set GRUB to default to this kernel
$ grep menuentry /boot/grub/grub.cfg | head -10
# Find the exact menu entry string
# Or set by index (0 = first entry, typically newest)
$ sudo vim /etc/default/grub
GRUB_DEFAULT="1>2" # Submenu index 1, entry index 2 (count from 0)
# Regenerate GRUB config
$ sudo update-grub
# Optionally remove the broken kernel
$ sudo apt-get remove linux-image-5.15.0-92-generic # Debian
$ sudo dnf remove kernel-5.15.0-92.el9 # RHEL
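Counting menu entries by hand to build a value like "1>2" is easy to get wrong. A rough sketch that prints each entry with its index, assuming a Debian/Ubuntu-style grub.cfg where titles are single-quoted, top-level entries start at column 0, and submenu entries are indented. The function name list_grub_entries is hypothetical, and it will not help on systems that keep entries outside grub.cfg.

```shell
# Print "<GRUB_DEFAULT index><TAB><title>" for each entry in a grub.cfg.
# Top-level entries get "0", "1", ...; entries inside a submenu get "N>M".
list_grub_entries() {
    awk -F"'" '
        BEGIN               { top = -1 }
        /^menuentry /       { top++; print top "\t" $2; next }
        /^submenu /         { top++; child = -1; print top "\t" $2 " (submenu)"; next }
        /^[ \t]+menuentry / { child++; print top ">" child "\t" $2 }
    ' "${1:-/boot/grub/grub.cfg}"
}
```

Pick the index printed next to the known-good kernel and use it as the GRUB_DEFAULT value before running update-grub.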
Boot Logging and Forensics¶
Scenario: System rebooted unexpectedly, need to find out why¶
# Check journal from the previous boot
$ journalctl -b -1 --no-pager | tail -50
# Look specifically for the shutdown/crash
$ journalctl -b -1 -p crit
$ journalctl -b -1 --grep="panic|oom|segfault|watchdog"
# Check if it was an OOM kill
$ journalctl -b -1 -k --grep="oom|killed process"
# Check for hardware errors (an all-lowercase --grep pattern matches case-insensitively)
$ journalctl -b -1 -k --grep="hardware error|mce|ghes"
# Check for watchdog timeout
$ journalctl -b -1 -k --grep="watchdog|nmi|lockup"
# See last log entries before the reboot
$ journalctl -b -1 -n 100 --no-pager
# List all boots with their timestamps
$ journalctl --list-boots
-3 abc... Mon 2026-03-16 10:00:00 — Mon 2026-03-16 18:30:00
-2 def... Mon 2026-03-16 18:35:00 — Tue 2026-03-17 02:15:00 # Short uptime
-1 ghi... Tue 2026-03-17 02:20:00 — Wed 2026-03-18 14:00:00
0 jkl... Wed 2026-03-18 14:05:00 — present
# Check if it was a clean shutdown or crash
$ last -x reboot shutdown | head -10
reboot system boot 5.15.0-91-generic Wed Mar 18 14:05 still running
shutdown system down 5.15.0-91-generic Wed Mar 18 14:00 - 14:05 (00:05)
reboot system boot 5.15.0-91-generic Tue Mar 17 02:20 - 14:00 (1+11:40)
crash system down 5.15.0-91-generic Tue Mar 17 02:15 - 02:20 (00:05)
# "crash" indicates unclean shutdown
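The individual searches above can be folded into one pass over a saved kernel log. A minimal sketch, assuming typical kernel message wording; the patterns are a starting point rather than an exhaustive list, and the function name suspect_causes and its labels are hypothetical.

```shell
# Read kernel log lines on stdin and print one label per suspected cause,
# in a fixed order: oom, panic, hardware, lockup.
suspect_causes() {
    awk '
        { line = tolower($0) }
        line ~ /out of memory|oom-kill|killed process/ { seen["oom"] = 1 }
        line ~ /kernel panic/                           { seen["panic"] = 1 }
        line ~ /hardware error|mce:/                    { seen["hardware"] = 1 }
        line ~ /watchdog|soft lockup|hard lockup/       { seen["lockup"] = 1 }
        END {
            n = split("oom panic hardware lockup", order, " ")
            for (i = 1; i <= n; i++) if (order[i] in seen) print order[i]
        }
    '
}
```

Feed it the previous boot with journalctl -b -1 -k --no-pager | suspect_causes. No output means none of the patterns matched, not that the reboot was clean.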
Filling /boot Partition — Emergency Cleanup¶
Scenario: /boot is full and you can't install kernel updates or regenerate initramfs¶
# Check /boot usage
$ df -h /boot
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 477M 470M 0 100% /boot
# List installed kernels
$ ls -la /boot/vmlinuz-*
$ dpkg --list 'linux-image-*' | grep '^ii' # Debian
$ rpm -qa kernel # RHEL
# Find current kernel (DO NOT REMOVE)
$ uname -r
5.15.0-91-generic
# Remove old kernels (keep current + one fallback)
# Debian/Ubuntu:
$ sudo apt-get purge linux-image-5.15.0-{85,86,87,88,89}-generic
$ sudo apt-get autoremove --purge
# RHEL/CentOS:
$ sudo dnf remove kernel-5.15.0-{85,86,87,88,89}.el9
# If apt won't run because /boot is full, manual cleanup:
$ sudo rm /boot/vmlinuz-5.15.0-85-generic
$ sudo rm /boot/initrd.img-5.15.0-85-generic
$ sudo rm /boot/System.map-5.15.0-85-generic
$ sudo rm /boot/config-5.15.0-85-generic
# Then run apt to clean up package state:
$ sudo apt-get -f install
$ sudo apt-get autoremove --purge
# Set kernel retention policy to prevent recurrence
# Debian (in /etc/apt/apt.conf.d/):
$ echo 'Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";' | \
sudo tee /etc/apt/apt.conf.d/52-kernel-cleanup
# RHEL (in /etc/dnf/dnf.conf):
$ sudo grep installonly_limit /etc/dnf/dnf.conf
installonly_limit=3 # Keep only 3 kernel versions
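Typing version numbers into a purge command by hand is where the running kernel gets deleted by mistake. The sketch below derives the list instead, following the "keep current + one fallback" rule above and assuming kernel images are named vmlinuz-<version> and that sort supports -V (GNU coreutils). The function name removable_kernels is hypothetical.

```shell
# Print the kernel versions in the boot directory that are safe to remove:
# everything except the given current version and the newest other version
# (kept as a fallback).
# Arguments: current version (required), boot dir [/boot]
# The body runs in a subshell so its variables do not leak into the caller.
removable_kernels() (
    current="$1"
    boot_dir="${2:-/boot}"
    for image in "$boot_dir"/vmlinuz-*; do
        [ -e "$image" ] || continue
        version="${image##*/vmlinuz-}"
        [ "$version" = "$current" ] || echo "$version"
    done | sort -V | sed '$d'   # sed drops the last (newest) line: the fallback
)
```

Call it as removable_kernels "$(uname -r)" and review the output before passing it to apt-get purge or dnf remove. The fallback it keeps is simply the newest other version, which may not be the one you trust.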