Linux - Foundations and Operations Guide¶
Scope: Modern Linux from boot to production operations - updated for systemd-era hosts, cgroup v2, nftables-era firewalling, and current distro realities.
Topics: Boot process, kernel, systemd, processes and signals, permissions, filesystems and storage, LVM, RAID, LUKS, memory, networking, DNS and NSS, nftables and iptables, SSH, /proc, strace, performance triage, logging, packages, text processing, cgroups and namespaces, hardening, eBPF, distro differences, on-call triage, drills, cheat sheet.
Level: L0-L2 (zero -> foundations -> operations)
What this guide is and is not:
- This is a practical Linux foundations and operations guide.
- It favors accurate mental models and field-useful commands over trivia and vendor marketing.
- It is broad, but it is not magic. Some areas still deserve dedicated deep dives: storage recovery, advanced networking, SELinux policy authoring, kernel internals, and performance analysis at scale.
The Mission¶
A rack server powers on in a datacenter. In under a minute it goes from dead silicon to firmware, bootloader, kernel, initramfs, PID 1, services, sockets, filesystems, and a login prompt. Later you SSH in, restart a service, inspect logs, and fix a production issue. Linux is the stack connecting all of that.
The goal here is not to turn you into a command parrot. The goal is to make the machine legible.
Table of Contents¶
- The Boot Sequence
- The Kernel
- systemd
- Processes and Signals
- Users, Permissions, ACLs, and Capabilities
- The Filesystem
- Storage - Partitions, LVM, RAID, LUKS
- Memory Management
- Networking Fundamentals
- Firewalls - nftables First, iptables Legacy
- SSH
- The /proc Filesystem
- Debugging with strace
- Performance Triage
- Logging
- Package Management
- Text Processing
- cgroups and Namespaces
- Security Hardening
- eBPF
- Linux Distributions
- On-Call Survival Guide
- Real-World Case Studies
- Glossary
- Flashcards
- Drills
- Cheat Sheet
- Self-Assessment
Part 1: The Boot Sequence¶
You press the power button. Here is the practical version of what happens.
Stage 1: Firmware - BIOS or UEFI¶
The power supply stabilizes and emits a Power Good signal. The CPU starts executing from a fixed reset vector. At that instant there is no mounted disk, no userspace, no shell, and no kernel scheduler yet.
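Legacy BIOS path: POST, then the firmware reads the first sector of the boot disk (the MBR) and runs the boot code stored there, which loads the rest of the bootloader.
Modern UEFI path: firmware initialization, then the UEFI boot manager picks a boot entry and loads an EFI binary (usually shim or GRUB) from the EFI System Partition.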
| Feature | BIOS | UEFI |
|---|---|---|
| Partition table | MBR | GPT |
| Practical disk limit | ~2 TB with classic MBR | effectively enormous |
| Boot environment | 16-bit constraints | 32/64-bit firmware environment |
| Secure Boot | No | Yes |
| Bootloader location | MBR + post-MBR tricks | EFI binary on the ESP |
Secure Boot, in the real world:
- Firmware validates shim against keys in firmware.
- shim validates GRUB or MokManager.
- GRUB validates and loads the signed kernel.
- The kernel enforces signature rules for loadable modules.
- Initrd/initramfs images are commonly not part of that same validation chain, so do not imagine Secure Boot as a perfectly sealed steel coffin.
Stage 2: Bootloader - usually GRUB¶
GRUB is a tiny operating system whose job is to locate the kernel, hand it a command line, and usually provide a boot menu.
Useful kernel parameters:
| Parameter | Purpose |
|---|---|
| root=UUID=... | real root filesystem |
| ro | mount root read-only first |
| systemd.unit=rescue.target | rescue target |
| single or 1 | traditional single-user shorthand |
| rd.break | break into initramfs shell |
| init=/bin/bash | bypass normal init entirely |
| console=ttyS0,115200 | serial console |
Do not edit generated GRUB config directly.
- Debian/Ubuntu habit: edit /etc/default/grub and files in /etc/grub.d/, then run update-grub.
- RHEL-family habit: use grub2-mkconfig, grubby, and distro-specific bootloader paths.
- Cross-distro advice that says only update-grub is the answer is Debian-brained provincialism.
Stage 3: Kernel Initialization¶
The compressed kernel image decompresses and then:
1. sets up CPU mode and early memory management
2. builds page tables
3. initializes interrupt handling
4. probes buses and devices
5. initializes built-in drivers
6. mounts the initramfs as the temporary early root
Stage 4: Initramfs - the bridge to the real root¶
The kernel still needs enough tooling to find the real root filesystem. That might require storage drivers, RAID assembly, LUKS unlock, LVM activation, or network boot logic.
initramfs in RAM
├── /init
├── busybox or dracut tools
├── kernel modules
└── scripts to find and mount the real root
Failure here usually looks like: cannot find root device, dropped to emergency shell, or plain panic.
Common reasons:
- wrong UUID on kernel command line
- missing storage driver
- broken RAID/LVM/LUKS setup
- stale initramfs after controller or kernel changes
Rebuild commands vary:
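Which tool rebuilds the image depends on the distro family. A minimal sketch that only prints the likely command for the current host instead of running it. The function name and the detection order are assumptions made for illustration, and the flags are worth checking against the local man page before use:

```shell
# Print (not run) the usual initramfs rebuild command for this host.
# Detection is by which tool is installed; this is a heuristic, not a guarantee.
initramfs_rebuild_cmd() {
    if command -v update-initramfs >/dev/null 2>&1; then
        echo "update-initramfs -u -k all"        # Debian/Ubuntu (initramfs-tools)
    elif command -v dracut >/dev/null 2>&1; then
        echo "dracut --force --regenerate-all"   # RHEL/Fedora/SUSE family
    elif command -v mkinitcpio >/dev/null 2>&1; then
        echo "mkinitcpio -P"                     # Arch family
    else
        echo "unknown: no known initramfs tool found"
    fi
}

initramfs_rebuild_cmd
```

Printing first is deliberate: a rebuilt initramfs is only exercised at the next boot, so it pays to read the command before running it as root.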
Stage 5: PID 1 takes over¶
The kernel executes the configured init binary, almost always systemd now.
PID 1 is special:
- it is the ultimate parent of orphaned processes
- if it exits, the kernel panics
- the kernel only delivers signals for which PID 1 has installed a handler, so a stray kill -9 1 does nothing
Part 2: The Kernel¶
What Linux actually is¶
Linux is the kernel, not the whole operating system.
Everything from bash to nginx to systemd is userspace. The kernel mediates access to CPU, memory, filesystems, devices, and networking.
Key kernel concepts¶
Syscalls are the contract boundary.
- file I/O: open, read, write, close
- process control: fork, execve, wait4
- networking: socket, connect, accept
- memory: mmap, mprotect, brk
Modules are loadable kernel components.
Kernel logs are where hardware truth often leaks out.
sysctl exposes runtime kernel tuning.
Practical rule: do not copy random sysctl snippets from the internet. Many are cargo-cult fossils from 2012, and some break container, VPN, or routing behavior.
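A small sketch of how sysctl keys relate to files under /proc/sys, plus read-only companions for modules and kernel logs. The helper names sysctl_path and sysctl_read are made up for this example. The dot-to-slash mapping is an assumption that holds for ordinary keys but not for keys containing an interface name with a dot in it:

```shell
# Map a sysctl key to its /proc/sys path (dots become slashes).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Read a value straight from /proc/sys; prints nothing if the key is absent.
sysctl_read() {
    p=$(sysctl_path "$1")
    [ -r "$p" ] && cat "$p"
    return 0
}

sysctl_path net.ipv4.ip_forward     # prints /proc/sys/net/ipv4/ip_forward
sysctl_read kernel.pid_max

# Read-only companions, if the tools are installed:
#   lsmod | head          loaded modules
#   modinfo ext4          details about one module
#   dmesg -T | tail -30   recent kernel messages (may need root)
```

Reading a value before changing it is the cheap half of the practical rule above: know the current setting and why it is there.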
Part 3: systemd¶
systemd is the init system and service manager on most modern Linux distributions. It replaced linear shell-script boot with dependency-aware service management, supervision, logging integration, resource control, timers, sockets, and more.
Essential commands¶
systemctl status nginx
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx
systemctl enable nginx
systemctl disable nginx
systemctl enable --now nginx
systemctl list-units --failed
systemctl list-timers
systemctl daemon-reload
Units that matter most¶
| Unit type | Purpose |
|---|---|
| service | long-running daemon or one-shot task |
| socket | socket activation |
| timer | scheduled task |
| mount / automount | filesystem mounts |
| target | grouping / boot milestone |
| path | trigger on file path events |
| slice | cgroup-based resource grouping |
| scope | externally created process group |
A sane service file¶
# /etc/systemd/system/myapp.service
[Unit]
Description=My Application
After=postgresql.service
Wants=postgresql.service
# Only add these if the app is a *client* that truly requires working network before start
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=deploy
Group=deploy
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/bin/server --port 8080 --config /etc/myapp/config.yaml
Restart=on-failure
RestartSec=5
Environment=APP_ENV=production
MemoryMax=512M
CPUQuota=200%
LimitNOFILE=65536
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=read-only
ReadWritePaths=/var/lib/myapp /var/log/myapp
[Install]
WantedBy=multi-user.target
Dependency semantics that trip people¶
| Directive | Meaning |
|---|---|
| After= | ordering only |
| Before= | ordering only |
| Wants= | soft dependency |
| Requires= | hard dependency |
| BindsTo= | hard dependency with stronger lifecycle coupling |
| PartOf= | propagate restart/stop actions |
Big trap: network.target is not “the network is ready.” It mostly means networking stack startup has happened. Use network-online.target only for client software that actually must wait for configured connectivity. Most server daemons do not need it.
Drop-in overrides¶
Prefer overrides instead of editing packaged unit files.
Example:
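A hedged sketch of what a drop-in looks like on disk. The interactive route is systemctl edit myapp, which creates the same kind of file. Here the file is written to a scratch directory (the DROPIN_ROOT variable is an assumption of this example) so the snippet is safe to run; the override values are purely illustrative:

```shell
# Write a drop-in that overrides two settings of myapp.service.
# Interactive equivalent: systemctl edit myapp
# DROPIN_ROOT defaults to a scratch directory; on a real host it would be
# /etc/systemd/system.
DROPIN_ROOT=${DROPIN_ROOT:-$(mktemp -d)}
dropin_dir="$DROPIN_ROOT/myapp.service.d"

mkdir -p "$dropin_dir"
cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
MemoryMax=1G
Environment=APP_ENV=staging
EOF

cat "$dropin_dir/override.conf"

# On a real host, follow up with:
#   systemctl daemon-reload
#   systemctl restart myapp
#   systemctl cat myapp      # shows the packaged unit plus every drop-in
```

Because the packaged unit file is never touched, a package upgrade can replace it without silently discarding the local change.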
Timers¶
Timers replace a lot of old cron use cases and integrate with service management.
# /etc/systemd/system/backup.timer
[Unit]
Description=Nightly backup
[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
RandomizedDelaySec=5m
[Install]
WantedBy=timers.target
journald¶
journalctl -u nginx -f
journalctl -u nginx --since '1 hour ago'
journalctl -p err -b
journalctl -k
journalctl --disk-usage
journalctl --vacuum-size=500M
journalctl -o json-pretty -u nginx -n 1
Useful recovery targets¶
rescue.target tries to give you a usable single-user environment. emergency.target is even more minimal and rude.
Part 4: Processes and Signals¶
Process lifecycle¶
fork() -> child process created
execve() -> process image replaced with new program
exit() -> process terminates
wait() / waitpid() -> parent collects exit state
Every process has:
- PID and PPID
- credentials: UID, GID, groups
- open file descriptors
- memory mappings
- cgroup membership
- namespaces
Process states¶
| State | Meaning |
|---|---|
| R | runnable or running |
| S | interruptible sleep |
| D | uninterruptible sleep, often I/O wait |
| T | stopped |
| Z | zombie |
Zombies use almost no memory but they still consume PID table entries. Enough of them and fork() starts failing.
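Zombies can only be reaped by their parent, so the useful question is which parent owns them. A minimal sketch that groups zombies by parent PID; the function name and the canned demo input are invented for illustration:

```shell
# Count zombies from "STAT PPID" lines on stdin and group them by parent.
zombies_by_parent() {
    awk '$1 ~ /^Z/ { n[$2]++ }
         END { for (p in n) print n[p], "zombies under PPID", p }'
}

# Live usage:
#   ps -eo stat=,ppid= | zombies_by_parent
# Demo with canned input:
printf 'Ss 1\nZ 4242\nZ+ 4242\nR 1\n' | zombies_by_parent
```

The parent PID this prints is the process to fix or restart; sending signals to the zombies themselves achieves nothing.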
Signals¶
| Signal | Purpose |
|---|---|
| SIGHUP | reload by convention |
| SIGINT | interactive interrupt |
| SIGQUIT | quit + core by default |
| SIGTERM | graceful termination |
| SIGKILL | uncatchable kill |
| SIGSTOP | uncatchable stop |
| SIGCONT | continue |
| SIGCHLD | child state changed |
Operator rule:
1. inspect first
2. SIGTERM second
3. SIGKILL only when grace failed or the thing is obviously wedged
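The operator rule above can be sketched as a helper that sends SIGTERM, waits for a grace period, and only then escalates. The names is_running and graceful_stop and the 10-second default are assumptions of this example, not a standard tool:

```shell
# True while the PID exists and is not a zombie (a zombie has already exited).
is_running() {
    [ -r "/proc/$1/status" ] || return 1
    ! grep -q '^State:[[:space:]]*Z' "/proc/$1/status"
}

# Send SIGTERM, wait up to $2 seconds (default 10), then fall back to SIGKILL.
graceful_stop() {
    pid=$1
    grace=${2:-10}
    kill -TERM "$pid" 2>/dev/null || return 0     # already gone
    while [ "$grace" -gt 0 ]; do
        is_running "$pid" || return 0
        sleep 1
        grace=$((grace - 1))
    done
    echo "PID $pid ignored SIGTERM, sending SIGKILL" >&2
    kill -KILL "$pid" 2>/dev/null
    return 0
}

# Usage: graceful_stop 12345 15
```

For services managed by systemd, systemctl stop already does this dance using the unit's stop timeout, so a helper like this is mainly for stray processes.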
Part 5: Permissions¶
The base permission model¶
- file: r read, w write, x execute
- directory: r list names, w create/delete entries, x traverse
Special bits¶
| Bit | Meaning |
|---|---|
| SUID | execute as file owner |
| SGID | execute as file group; on a directory, new entries inherit the directory's group |
| sticky | on a directory, only an entry's owner, the directory's owner, or root can delete or rename that entry |
umask¶
Common values:
- 0022 -> files 644, dirs 755
- 0002 -> files 664, dirs 775
- 0077 -> private by default
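The arithmetic behind those values: the resulting mode is the requested mode with the umask bits cleared. A small sketch, assuming the usual requests of 0666 for new files and 0777 for new directories (the function name is made up for this example):

```shell
# Resulting mode = requested mode AND NOT umask.
mode_after_umask() {
    printf '%03o\n' $(( $1 & ~$2 ))
}

mode_after_umask 0666 0022   # 644
mode_after_umask 0777 0022   # 755
mode_after_umask 0666 0077   # 600
mode_after_umask 0777 0077   # 700
```

This is also why a umask can only remove permissions: it never adds bits the program did not ask for.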
ACLs - when rwx is too blunt¶
Traditional mode bits are coarse. ACLs add per-user and per-group entries.
getfacl file
setfacl -m u:alice:r file
setfacl -m g:ops:rwX /srv/app
setfacl -d -m g:ops:rwX /srv/app
Use ACLs for shared directories and controlled exceptions. Do not turn them into a haunted forest of invisible permissions nobody remembers.
sudo and visudo¶
Do not hand-edit /etc/sudoers like a maniac with a flamethrower.
Prefer small files in /etc/sudoers.d/.
Example:
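A hedged sketch of the stage, validate, install habit. The user, the command path, and the target file name are illustrative assumptions; on some distros systemctl lives at a different path, so check with command -v systemctl first:

```shell
# Stage a narrow sudoers rule in a scratch file and validate it before install.
rule='deploy ALL=(root) NOPASSWD: /usr/bin/systemctl restart myapp'
candidate=$(mktemp)
printf '%s\n' "$rule" > "$candidate"

if command -v visudo >/dev/null 2>&1; then
    # -c checks syntax only, -f selects the file; nothing is installed here.
    visudo -cf "$candidate" || echo "syntax check failed (or needs more privileges)" >&2
fi

cat "$candidate"
# After a clean check, as root:
#   install -m 0440 -o root -g root "$candidate" /etc/sudoers.d/deploy-myapp
```

Files in /etc/sudoers.d whose names contain a dot or end in ~ are skipped, which is why the example name has no extension.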
Linux capabilities¶
Root used to mean almost all power. Capabilities split that power into smaller pieces.
Examples:
- CAP_NET_BIND_SERVICE - bind to ports below 1024
- CAP_NET_ADMIN - network administration operations
- CAP_SYS_TIME - set system clock
- CAP_SYS_ADMIN - the kitchen-sink monster; avoid when possible
Inspect and set file capabilities:
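A minimal sketch, with helper names invented for this example. Reading a process's capability sets is harmless; granting a file capability needs root, so that function is defined but not called:

```shell
# Show the capability sets of a process as the kernel reports them (hex masks).
proc_caps() {
    grep '^Cap' "/proc/${1:-self}/status"
}

# Let a binary bind low ports without running as root.
# Needs root to apply; defined here but not called.
grant_low_port_bind() {
    setcap 'cap_net_bind_service=+ep' "$1" && getcap "$1"
}

proc_caps "$$"
# capsh --decode=<hex mask> turns a mask into capability names, if capsh is installed.
# Example call on a real host: grant_low_port_bind /opt/myapp/bin/server
```

File capabilities are stored in an extended attribute, so they are usually lost when the binary is replaced by a deploy or package upgrade and have to be reapplied.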
Capabilities are great for least privilege. They are also a good way to create weird bugs if you do not understand effective, permitted, inheritable, and ambient sets.
MAC - SELinux and AppArmor¶
DAC says what the file owner and mode bits allow. MAC says what policy allows, regardless of owner intent.
- SELinux is label-based and powerful.
- AppArmor is path-based and usually easier to approach.
Quick checks:
# SELinux
getenforce
restorecon -Rv /var/www
ausearch -m avc -ts recent
# AppArmor
aa-status
apparmor_status
If a service gets EACCES but mode bits look fine, think MAC.
Part 6: The Filesystem¶
Everything is a file-ish thing¶
Regular files, directories, symlinks, block devices, character devices, sockets, pipes, procfs, sysfs - Linux represents a lot of system state through file-like interfaces.
Important paths¶
| Path | Purpose |
|---|---|
| / | root |
| /etc | configuration |
| /var | variable data |
| /home | user homes |
| /root | root home |
| /tmp | temporary files |
| /run | runtime state, often tmpfs |
| /proc | process and kernel state |
| /sys | device and driver state |
| /dev | device nodes |
| /boot | kernel and bootloader assets |
| /opt | optional third-party software |
| /srv | site/service data |
Inodes¶
An inode stores metadata: ownership, mode, timestamps, size, block pointers, and more. Filenames live in directory entries, not inodes.
When df -h says there is space but writes still fail, check:
- df -i for inode exhaustion
- read-only remounts
- quotas
- deleted-open-file leaks
Hard links vs symlinks¶
- hard link -> same inode, same filesystem only
- symlink -> path reference, can cross filesystems, can dangle
VFS¶
The Virtual Filesystem layer lets the same syscalls work across ext4, XFS, tmpfs, NFS, overlayfs, procfs, and friends.
Common filesystem types¶
| Filesystem | Best use |
|---|---|
| ext4 | sane general-purpose default |
| XFS | big filesystems, high throughput, default on many RHEL systems |
| Btrfs | snapshots, checksums, compression, advanced features |
| tmpfs | RAM-backed temporary data |
| overlayfs | container layers |
| NFS | network file sharing |
Part 7: Storage¶
Block devices and partitions¶
LVM - storage virtualization that matters¶
pvcreate /dev/sdb1
vgcreate data /dev/sdb1
lvcreate -L 50G -n app data
mkfs.ext4 /dev/data/app
mount /dev/data/app /srv/app
Growth example:
lvextend -L +20G /dev/data/app
resize2fs /dev/data/app # ext4
xfs_growfs /srv/app # XFS uses mountpoint
Resize caveats worth tattooing on your frontal lobe¶
- ext4 can usually grow online; shrinking requires the filesystem to be unmounted.
- XFS growth is easy; shrinking is generally not a normal operation to rely on.
- Always understand the full stack: partition/LV size, then filesystem size, not just one layer.
- Backups first. Heroic confidence after coffee is not a backup strategy.
RAID levels¶
| RAID | Use |
|---|---|
| RAID 0 | speed, zero redundancy |
| RAID 1 | mirror |
| RAID 5 | one-disk parity, rebuild risk rises with size |
| RAID 6 | two-disk parity |
| RAID 10 | mirror + stripe, great practical default for important write-heavy workloads |
Software RAID basics¶
When an array is degraded:
- performance often drops
- risk during rebuild rises
- do not celebrate because it is “still up”
- watch SMART data and rebuild progress
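For Linux software RAID (md), /proc/mdstat shows each array's member map, where an underscore marks a missing device. A rough sketch that lists degraded arrays; the function name and the sample text are assumptions modelled on typical mdstat output:

```shell
# Print md arrays whose member map (for example [U_]) shows a missing device.
# Reads /proc/mdstat-formatted text on stdin.
degraded_arrays() {
    awk '/^md[0-9]+ :/        { name = $1 }
         /\[[U_]*_[U_]*\]/    { print name }'
}

# Live usage:  degraded_arrays < /proc/mdstat
# Detail:      mdadm --detail /dev/md0
degraded_arrays <<'EOF'
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks super 1.2 [2/2] [UU]
md1 : active raid1 sda2[0]
      2096128 blocks super 1.2 [2/1] [U_]
EOF
```

A check like this belongs in monitoring rather than in someone's memory, since a degraded array keeps serving I/O without complaint.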
Disk health¶
Mount options that matter¶
| Option | Use |
|---|---|
| noexec | block direct binary execution |
| nosuid | ignore SUID/SGID |
| nodev | ignore device nodes |
| ro | read-only |
| noatime | reduce access-time writes in some cases |
LUKS - disk encryption basics¶
LUKS is the common Linux standard for block-device encryption.
cryptsetup luksFormat /dev/sdb1
cryptsetup open /dev/sdb1 secure_data
mkfs.ext4 /dev/mapper/secure_data
Files involved:
- /etc/crypttab - what to unlock at boot
- initramfs - often required for encrypted root
Backup the LUKS header when appropriate. Lose it and your encrypted data may become modern art.
Part 8: Memory Management¶
Big picture¶
Linux tries to use RAM aggressively. File cache is good. Empty RAM is mostly wasted opportunity.
The field that usually matters most is MemAvailable, not MemFree.
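A small sketch that turns that field into a single number. The function name is made up for this example, and the heredoc holds invented sample values rather than real measurements:

```shell
# Report MemAvailable as a percentage of MemTotal from meminfo-formatted stdin.
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }'
}

# Live usage:  mem_available_pct < /proc/meminfo
mem_available_pct <<'EOF'
MemTotal:       16000000 kB
MemFree:          400000 kB
MemAvailable:    8000000 kB
EOF
```

In the sample, MemFree alone would suggest the host is nearly out of memory, while MemAvailable shows about half of RAM can still be handed to applications.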
Memory types¶
| Type | Meaning |
|---|---|
| anonymous | heap, stack, private mappings |
| page cache | cached file data |
| slab | kernel object caches |
| shared/tmpfs | shared pages |
| kernel memory | kernel code and data |
Virtual memory¶
Each process sees a virtual address space. The kernel maps that to physical memory. This gives isolation, lazy allocation, copy-on-write, and mmap-backed files.
Swap¶
Swap is not evil. Blindly disabling swap everywhere is meme-ops. But sustained swapping means pressure exists and you should understand why.
OOM killer¶
dmesg -T | grep -i 'oom\|killed process'
journalctl -k -g 'oom|killed process'
cat /proc/PID/oom_score
cat /proc/PID/oom_score_adj
Useful idea:
- if the kernel is killing things, the argument is already over
- now you are doing forensics, not philosophy
Memory triage¶
Part 9: Networking Fundamentals¶
Interfaces and addresses¶
Prefer ip over old ifconfig and route. The legacy commands still exist in many places, but the iproute2 tools are the modern interface.
DNS, NSS, and why dig is not the whole truth¶
There are multiple layers here:
- /etc/hosts
- /etc/nsswitch.conf
- libc resolver behavior
- systemd-resolved on many systems
- /etc/resolv.conf
- upstream DNS servers
So:
- dig example.com asks DNS directly.
- getent hosts example.com asks the system resolver path configured by NSS.
- those are not the same test.
getent hosts example.com
dig example.com +short
resolvectl status
resolvectl query example.com
cat /etc/nsswitch.conf
ls -l /etc/resolv.conf
If a host resolves with dig but not with getent, the problem may be NSS, search domains, systemd-resolved, or /etc/hosts, not raw DNS reachability.
/etc/resolv.conf realities¶
On systems using systemd-resolved, /etc/resolv.conf may be:
- a symlink to the stub resolver config using 127.0.0.53
- a symlink to a generated file listing upstream resolvers
- a static file managed by something else
Do not assume it is a normal hand-edited file anymore.
Connectivity tests¶
ping host
tracepath host
traceroute host
nc -zv host 443
curl -v telnet://host:443
ss -tlnp
tcpdump -i eth0 port 443
TCP states worth knowing¶
| State | Meaning | Common interpretation |
|---|---|---|
| LISTEN | waiting for inbound connections | normal for servers |
| ESTAB | connection active | normal |
| TIME-WAIT | recently closed | many short-lived connections |
| CLOSE-WAIT | peer closed, local side has not | application bug or leak |
| SYN-SENT | outbound connect in progress | upstream unreachable or filtered |
Many CLOSE-WAIT sockets usually mean your application is failing to close descriptors after the peer has gone away.
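To see whether CLOSE-WAIT is actually piling up, count sockets per state. A minimal sketch; the function name and the sample addresses are invented for illustration:

```shell
# Count sockets per TCP state from "ss -tan" style output on stdin.
tcp_state_counts() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Live usage:
#   ss -tan | tcp_state_counts
#   ss -tanp state close-wait     # which processes hold them (-p needs root for other users)
tcp_state_counts <<'EOF'
State      Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB      0      0      10.0.0.5:22        10.0.0.9:51234
CLOSE-WAIT 1      0      10.0.0.5:8080      10.0.0.7:40112
CLOSE-WAIT 1      0      10.0.0.5:8080      10.0.0.7:40113
EOF
```

Run it twice a few minutes apart: a CLOSE-WAIT count that only ever grows points at a descriptor leak rather than a burst of traffic.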
Bridges, bonds, VLANs - the one-screen version¶
- bridge - software L2 switch joining interfaces into one broadcast domain
- bond/team - combine multiple NICs for redundancy or aggregated bandwidth
- VLAN - isolate traffic at layer 2 using tagged networks
Quick examples:
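A hedged sketch using iproute2. The interface names (eth0, br0) and VLAN ID 100 are assumptions, and the run wrapper is a dry-run helper invented for this example so that the commands are printed rather than executed by default:

```shell
# Print commands by default; set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Bridge: create br0 and attach eth0 to it.
run ip link add name br0 type bridge
run ip link set dev eth0 master br0
run ip link set dev br0 up

# VLAN: tagged sub-interface for VLAN 100 on eth0.
run ip link add link eth0 name eth0.100 type vlan id 100
run ip link set dev eth0.100 up

# Inspect the result.
run ip -d link show type bridge
run bridge vlan show
```

Attaching the interface that carries your SSH session to a bridge usually drops that session, because the address has to move to the bridge. Do this from a console, or through the distro's network configuration so it persists across reboots.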
If you work around virtualization, hypervisors, KVM, Proxmox, libvirt, or container hosts, bridges and VLANs stop being “advanced” and become Tuesday.
Policy routing and multiple tables¶
Sometimes the right route depends on source IP, mark, or interface. That is policy routing, not basic destination lookup.
If VPN, multihoming, or weird asymmetric paths are involved, look here.
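A minimal sketch of source-based policy routing. The subnet, gateway, interface, table number, and rule priority are all illustrative assumptions, and the run wrapper is a dry-run helper invented for this example:

```shell
# Print commands by default; set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Traffic sourced from 10.0.2.0/24 consults routing table 100 before main.
run ip rule add from 10.0.2.0/24 table 100 priority 1000
run ip route add default via 10.0.2.1 dev eth1 table 100

# Inspect what is in effect, then ask the kernel which route a flow would take.
run ip rule show
run ip route show table 100
run ip route get 8.8.8.8 from 10.0.2.50
```

The last command is the useful one during an incident: it answers "which way would this packet go" without sending any traffic.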
Part 10: Firewalls¶
Linux firewalling today is nftables-first conceptually, even when older tools are still in circulation.
nftables mental model¶
- tables hold chains
- chains hold rules
- rules match packets and take actions
- one ruleset can cover IPv4 and IPv6 cleanly
Example host firewall:
Example config:
table inet filter {
    chain input {
        type filter hook input priority 0;
        policy drop;
        ct state established,related accept
        iif lo accept
        tcp dport { 22, 80, 443 } accept
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept
    }
}
Apply safely:
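One cautious sequence, sketched under assumptions: the ruleset path /etc/nftables.conf is the Debian-style default and differs on other families, the backup path is arbitrary, and the run wrapper is a dry-run helper invented for this example:

```shell
# Print commands by default; set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

RULESET=${RULESET:-/etc/nftables.conf}

# 1. Keep a copy of the live ruleset to roll back to.
run sh -c 'nft list ruleset > /root/nftables.backup'
# 2. Parse and validate the new file without touching the kernel.
run nft -c -f "$RULESET"
# 3. Load it; a file is applied as one atomic transaction.
run nft -f "$RULESET"
# 4. Confirm what is actually loaded.
run nft list ruleset
```

If the file does not begin with flush ruleset, loading it adds to whatever is already in the kernel instead of replacing it. Keep a second session or console open until a fresh SSH connection has been confirmed to work.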
iptables still matters¶
You will still see iptables because:
- old docs never die
- Docker, kube-proxy, fail2ban, and assorted tools still expose iptables-shaped behavior
- many distributions ship iptables compatibility frontends backed by nftables underneath
Useful commands:
Prefer conntrack syntax over the older state match when writing new iptables rules:
Firewalld and UFW¶
- firewalld is common on RHEL-family systems
- UFW is common on Ubuntu
- both are frontends, not the kernel firewall engine itself
Rule ordering still matters¶
Whether nftables or iptables, careless DROP rules can lock you out. Keep your current connection safe before you get clever.
Part 11: SSH¶
What happens on connect¶
- TCP connect to port 22
- key exchange
- host key verification
- user authentication
- channel/session setup
Host trust hygiene¶
SSH uses TOFU - trust on first use - unless you pre-seed trust another way.
Important files:
- ~/.ssh/known_hosts
- ~/.ssh/config
- ~/.ssh/id_ed25519
Key types¶
| Type | Advice |
|---|---|
| Ed25519 | preferred general default |
| RSA | legacy compatibility |
| ECDSA | acceptable |
| DSA | dead, leave it buried |
Config file¶
Host prod-*
    User deploy
    IdentityFile ~/.ssh/deploy_ed25519
    ProxyJump bastion.example.com
Host db-primary
    HostName 10.0.2.50
    User postgres
    Port 2222
Tunnels¶
ssh -L 8080:localhost:80 user@remote
ssh -R 8080:localhost:3000 user@remote
ssh -D 1080 user@remote
ssh -J bastion internal-host
Agent forwarding¶
Use sparingly. Root on the intermediate host can potentially abuse your forwarded agent. ProxyJump is often the cleaner answer.
Part 12: The /proc Filesystem¶
/proc is a virtual filesystem exposing kernel and process state.
Per-process inspection¶
tr '\0' ' ' < /proc/$$/cmdline
ls -la /proc/$$/cwd
tr '\0' '\n' < /proc/$$/environ | head
cat /proc/$$/status
ls -la /proc/$$/fd
cat /proc/$$/maps
Secrets warning: environment variables are not a magical safe. Same-user or root access can often inspect them.
System-wide files¶
cat /proc/meminfo
cat /proc/cpuinfo
cat /proc/loadavg
cat /proc/uptime
cat /proc/sys/kernel/pid_max
cat /proc/net/tcp
Deleted open files¶
If a file is deleted but still open, disk space is not reclaimed until the process closes it or dies.
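A self-contained demonstration using only /proc. The helper name is made up for this example; lsof +L1 gives the system-wide view when lsof is installed:

```shell
# List file descriptors of one process that point at deleted files.
deleted_fds() {
    ls -l "/proc/$1/fd" 2>/dev/null | grep '(deleted)'
}

# Demo: a background process holds a file open, then the file is deleted.
tmp=$(mktemp)
sleep 30 > "$tmp" &
holder=$!
sleep 1                  # give the child time to open the file
rm -f "$tmp"
deleted_fds "$holder"    # the stdout descriptor still points at the deleted file
kill "$holder"
```

When the holder cannot be restarted, truncating through the descriptor (for example : > /proc/PID/fd/N as a user allowed to write it) frees the space while the process keeps running. Treat that as an emergency measure, since the process's log content is lost.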
Part 13: Debugging with strace¶
strace shows syscalls. That is often enough to expose what a process is actually waiting on.
strace -p 12345
strace -p 12345 -t -T
strace -f ./deploy.sh
strace -e trace=file ./myapp
strace -e trace=network ./myapp
Patterns¶
Stuck process
Slow startup
Permission denied
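The three patterns above map onto strace invocations roughly as follows. The filter function and the sample trace lines are invented for illustration; real output varies by program and strace version:

```shell
# Keep only failed syscalls that point at a permissions problem.
# Reads strace output on stdin.
perm_failures() {
    grep -E '= -1 (EACCES|EPERM)'
}

# Stuck process: see the syscall it is blocked in right now.
#   strace -p PID
# Slow startup: per-syscall time summary, following children.
#   strace -c -f ./myapp
# Permission denied: trace file syscalls (strace writes to stderr).
#   strace -f -e trace=file ./myapp 2>&1 | perm_failures

perm_failures <<'EOF'
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/myapp/config.yaml", O_RDONLY) = -1 EACCES (Permission denied)
openat(AT_FDCWD, "/opt/myapp/missing", O_RDONLY) = -1 ENOENT (No such file or directory)
EOF
```

ENOENT lines are deliberately filtered out: programs probe many paths that do not exist, and that noise hides the one denial that matters.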
strace is not subtle, but subtle is overrated at 3 AM.
Part 14: Performance Triage¶
USE method¶
For each resource, check:
- Utilization
- Saturation
- Errors
| Resource | Utilization | Saturation | Errors |
|---|---|---|---|
| CPU | top, mpstat | run queue | kernel or hardware complaints |
| Memory | free -h, vmstat | swap, reclaim, OOM | OOM logs |
| Disk | iostat -xz | await, queue depth | I/O errors |
| Network | sar -n DEV, ip -s link | drops, backlog, retransmits | driver/link errors |
Quick triage chain¶
uptime
free -h
df -h
df -i
dmesg -T | tail -30
iostat -xz 1 3
ss -s
ps -eo pid,ppid,%cpu,%mem,stat,cmd --sort=-%cpu | head
ps -eo pid,ppid,%cpu,%mem,stat,cmd --sort=-%mem | head
Load average¶
Load is runnable tasks plus tasks stuck in uninterruptible sleep, usually I/O.
High load with low CPU usage often means I/O pain, not CPU pain.
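Raw load only means something relative to the CPU count. A small sketch that normalizes the 1-minute figure; the function name and sample numbers are assumptions for illustration:

```shell
# Divide the 1-minute load average by the CPU count.
# $1: contents of /proc/loadavg, $2: number of CPUs.
load_per_cpu() {
    echo "$1" | awk -v cpus="$2" '{ printf "%.2f\n", $1 / cpus }'
}

# Live usage:
#   load_per_cpu "$(cat /proc/loadavg)" "$(nproc)"
#   ps -eo stat=,pid=,comm= | awk '$1 ~ /^D/'     # tasks in uninterruptible sleep
load_per_cpu "8.00 4.00 2.00 3/500 1234" 4
```

A value well above 1 per CPU alongside idle CPUs is the cue to list the D-state tasks and look at storage.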
Part 15: Logging¶
Common places¶
/var/log/syslog or /var/log/messages
/var/log/auth.log or /var/log/secure
/var/log/kern.log
application logs under /var/log/<app>/
journald essentials¶
journalctl -u nginx -f
journalctl -u nginx --since '1 hour ago'
journalctl -b -p err
journalctl -k
journalctl --disk-usage
journalctl --vacuum-size=500M
logrotate¶
Example:
/var/log/myapp/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    postrotate
        systemctl reload myapp
    endscript
}
Important modern nuance:
- on some systems log rotation is driven by cron
- on others it is driven by a systemd timer such as logrotate.timer
- do not assume cron is the scheduler without checking
Part 16: Package Management¶
Debian / Ubuntu¶
apt update
apt upgrade
apt install nginx
apt remove nginx
apt purge nginx
apt search nginx
apt-cache policy nginx
dpkg -l | grep nginx
dpkg -L nginx
RHEL / Fedora / Rocky / Alma¶
dnf install nginx
dnf upgrade
dnf info nginx
dnf remove nginx
dnf list installed | grep nginx
rpm -qa | grep nginx
rpm -ql nginx
Package hygiene¶
- prefer vendor packages or known repositories over random curl-pipe installers
- understand what created a file before you edit or delete it
- config drift and package ownership matter
Part 17: Text Processing¶
Pipeline mindset¶
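The idea is small tools, each doing one step, joined by pipes. A minimal sketch of the classic extract, sort, count, rank pipeline; the function name, the log path in the usage comment, and the sample lines are illustrative assumptions:

```shell
# Top talkers: first field of each log line, counted and ranked.
top_clients() {
    awk '{ print $1 }' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Live usage:  top_clients 5 < /var/log/nginx/access.log
top_clients 2 <<'EOF'
10.0.0.7 - - "GET / HTTP/1.1" 200
10.0.0.9 - - "GET /health HTTP/1.1" 200
10.0.0.7 - - "GET /api HTTP/1.1" 500
10.0.0.7 - - "GET /api HTTP/1.1" 200
EOF
```

The first sort is not optional: uniq -c only merges adjacent duplicate lines.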
Core tools¶
grep -r 'TODO' src/
grep -E 'error|warn' file
awk '{print $1}' file
awk -F: '{print $1,$7}' /etc/passwd
sed -n '10,20p' file
sed 's/old/new/g' file
sort -rn
uniq -c
cut -d: -f1 /etc/passwd
tr 'a-z' 'A-Z'
head -20 file
tail -f file
tee file
xargs
One nitpick worth keeping: avoid useless cat file | grep pattern when grep pattern file does the job. It is not a moral issue, just cleaner.
Part 18: cgroups and Namespaces¶
cgroup v2¶
Modern Linux increasingly means cgroup v2: a single unified hierarchy.
Useful checks:
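One quick way to tell which cgroup setup a host runs is the filesystem type mounted at /sys/fs/cgroup. A sketch, assuming a stat that supports -f and -c %T (GNU coreutils does); the function name and the labels it prints are made up for this example:

```shell
# Classify the cgroup setup from the filesystem type at /sys/fs/cgroup.
cgroup_mode() {
    case "$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null)" in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 or hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

cgroup_mode
# Related checks:
#   mount -t cgroup2
#   cat /proc/self/cgroup     # on pure v2 this is a single "0::/..." line
```

Inside containers the answer reflects what the runtime mounted, which is not always the same as what the host uses.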
Common files on cgroup v2 systems:
cat /sys/fs/cgroup/cgroup.controllers
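# memory.current, memory.max, and cpu.max exist only on non-root cgroups, so read them from a child such as system.slice
cat /sys/fs/cgroup/system.slice/memory.current
cat /sys/fs/cgroup/system.slice/memory.max
cat /sys/fs/cgroup/system.slice/cpu.max # present only where the cpu controller is enabled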
Per-service example:
systemctl show nginx -p ControlGroup
CG=$(systemctl show nginx -p ControlGroup --value)
cat /sys/fs/cgroup${CG}/memory.current
Namespaces¶
| Namespace | Isolates |
|---|---|
| PID | process IDs |
| net | network stack |
| mount | mount table |
| UTS | hostname |
| user | UID/GID mappings |
| IPC | shared IPC objects |
| cgroup | cgroup view |
| time | boot and monotonic clock offsets, on supported kernels |
Containers are mostly cgroups + namespaces + filesystem layering + runtime tooling.
Part 19: Security Hardening¶
SSH daemon baseline¶
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
AllowUsers deploy admin
MaxAuthTries 3
Least privilege¶
- run services as dedicated users
- use capabilities when a narrow privilege is enough
- use sudoers.d instead of handing out full root casually
- restrict writable paths in systemd units
MAC controls¶
- SELinux: powerful label-based enforcement
- AppArmor: simpler path-based confinement on many Ubuntu systems
Firewall baseline¶
- default deny inbound unless host role says otherwise
- allow only required services
- document exceptions
- beware container/orchestrator interaction with host firewall rules
Kernel and sysctl hardening¶
Example baseline ideas:
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
Patching and provenance¶
- keep the OS current
- know which repos you trust
- verify what owns a binary and where it came from
- avoid mystery curl scripts unless you have reviewed them
Auditing¶
Part 20: eBPF¶
eBPF lets you run verified sandboxed programs in the kernel for observability, networking, and security uses.
Examples:
bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
execsnoop
opensnoop
biolatency
tcpconnect
It is absurdly powerful. It is also not beginner-friendly when you leave the one-liner lane.
Part 21: Linux Distributions¶
| Family | Examples | Package tools | Common defaults |
|---|---|---|---|
| Debian | Debian, Ubuntu | apt, dpkg | ext4, AppArmor often on Ubuntu |
| Red Hat | RHEL, Rocky, Alma, Fedora | dnf, rpm | XFS common, SELinux strong |
| SUSE | SLES, openSUSE | zypper, rpm | Btrfs common on root |
| Arch | Arch, Endeavour | pacman | rolling release |
| Alpine | Alpine | apk | musl, small footprint |
Core Linux skills transfer. Packaging, defaults, release model, and support policies are where the families diverge.
Part 22: On-Call Survival Guide¶
Disk full¶
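A first-pass sketch for this scenario: find which filesystems are over a threshold for blocks or inodes, then drill down. The function name, the 90 percent default, and the sample df output are assumptions for illustration:

```shell
# Print mount points at or above a usage threshold from "df -P" style input.
over_threshold() {
    awk -v limit="${1:-90}" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit) print $6, $5
    }'
}

# Live usage:
#   df -P  | over_threshold 90      # blocks
#   df -Pi | over_threshold 90      # inodes
#   du -xh --max-depth=1 /var 2>/dev/null | sort -h | tail
#   lsof +L1                        # deleted files still held open
over_threshold 90 <<'EOF'
Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/sda1         41152736 39000000   2152736      95% /
tmpfs              8000000     1000   7999000       1% /run
EOF
```

The -x flag keeps du on one filesystem, which matters when / is full but the large directories under it are separate mounts.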
OOM¶
Service failed¶
systemctl status SERVICE
journalctl -u SERVICE -n 100 --no-pager
ss -tlnp | grep PORT
systemctl cat SERVICE
High load¶
Safe vs dangerous¶
| Usually safe | Usually dangerous |
|---|---|
| read logs and status | kill -9 on business-critical daemons |
| inspect sockets, pids, mounts | deleting unknown files under pressure |
| collect evidence | rebooting before you know what happened |
| journal vacuum with intent | docker system prune in anger |
Part 23: Real-World Case Studies¶
Case 1: OOM kills the app¶
Symptom: app dies, app logs say little.
Investigation: dmesg shows the kernel killed it. Heap or process memory budget assumed the host belonged entirely to one process.
Fix: reduce heap, add memory limits, add monitoring, leave headroom for kernel and cache.
Case 2: Disk “full” but df looks okay¶
Symptom: app still cannot write.
Investigation: df -i shows inode exhaustion or lsof +L1 shows giant deleted-open logs.
Fix: clean tiny-file storm or restart/rotate the offending process correctly.
Case 3: Zombie army¶
Symptom: fork() fails with EAGAIN.
Investigation: parent process is not reaping children. Zombies pile up.
Fix: fix the parent, restart it, or kill it so PID 1 adopts and reaps the zombies.
Case 4: Service flapping under systemd¶
Symptom: service restarts every few seconds and hits start limit.
Investigation: journalctl -u reveals bad config path, bad permissions, or missing dependency.
Fix: use absolute paths, correct WorkingDirectory, fix config, then systemctl reset-failed SERVICE.
Case 5: Logs eat root¶
Symptom: SSH slow, commands fail, temp files cannot be created.
Investigation: giant logs, failed rotation, or runaway debug mode.
Fix: truncate carefully if the file is open, repair rotation scheduling, consider separate /var.
Case 6: High load, low CPU¶
Symptom: load average huge, CPUs not pegged.
Investigation: iostat shows long await; tasks are stuck in I/O wait.
Fix: storage bottleneck, not CPU bottleneck. Different war, different tools.
Glossary¶
| Term | Meaning |
|---|---|
| kernel | core of the operating system |
| syscall | userspace entry into kernel services |
| PID 1 | init process, usually systemd |
| inode | file metadata record |
| file descriptor | numeric handle for open file/socket/pipe |
| page cache | RAM used for file caching |
| OOM killer | kernel logic that kills tasks under extreme memory exhaustion |
| cgroup | resource control grouping |
| namespace | isolation boundary |
| unit | systemd-managed object |
| target | systemd grouping / boot milestone |
| initramfs | temporary early root in RAM |
| ESP | EFI System Partition |
| LVM | Logical Volume Manager |
| LUKS | Linux block-device encryption standard |
| ACL | access control list |
| capability | fine-grained kernel privilege |
| MAC | mandatory access control |
| TOFU | trust on first use |
| NSS | Name Service Switch |
Flashcards¶
Boot and kernel¶
| Q | A |
|---|---|
| Boot chain in order? | firmware -> bootloader -> kernel -> initramfs -> PID 1 |
| What is initramfs for? | early userspace needed to reach the real root filesystem |
| What happens if PID 1 exits? | kernel panic |
| network.target vs network-online.target? | startup marker vs actual wait-for-network target |
Processes and permissions¶
| Q | A |
|---|---|
| SIGTERM vs SIGKILL? | graceful request vs uncatchable kill |
| What is a zombie? | exited process not yet reaped |
| Why use ACLs? | mode bits are too coarse for some sharing needs |
| Why use capabilities? | narrow privileges instead of full root |
Storage and memory¶
| Q | A |
|---|---|
| ext4 shrink? | possible, usually offline |
| XFS shrink? | not something to plan your life around |
| MemFree or MemAvailable? | MemAvailable |
| What does lsof +L1 find? | deleted files still open |
Networking and security¶
| Q | A |
|---|---|
| dig vs getent hosts? | raw DNS query vs system resolver/NSS path |
| Many CLOSE-WAIT sockets mean? | app is not closing connections |
| nftables or iptables first? | nftables first for modern mental model |
| What protects privileged port binding without full root? | CAP_NET_BIND_SERVICE |
Drills¶
Drill 1: Read the local boot and init path¶
Drill 2: Inspect a running process deeply¶
Drill 3: Compare DNS tools¶
Explain why the answers may differ.
Drill 4: Find deleted open files¶
Drill 5: Add a systemd override safely¶
Drill 6: Inspect cgroup v2 data for a service¶
CG=$(systemctl show ssh -p ControlGroup --value)
echo "$CG"
cat /sys/fs/cgroup${CG}/memory.current
cat /sys/fs/cgroup${CG}/cpu.stat
Drill 7: Check ACLs and capabilities¶
Drill 8: One-minute triage drill¶
Collect these with no commentary first:
Then write a three-sentence diagnosis hypothesis.
Cheat Sheet¶
Process and service control¶
ps aux
pstree -p
kill -TERM PID
kill -9 PID
systemctl status SERVICE
journalctl -u SERVICE -n 50 --no-pager
Disk and memory¶
Network and DNS¶
Firewall¶
Storage¶
Quick triage chain¶
Self-Assessment¶
- I can explain the boot chain without hand-waving.
- I understand the difference between the kernel and userspace.
- I know when network-online.target is appropriate and when it is not.
- I can diagnose process states including zombies and D state tasks.
- I understand mode bits, ACLs, sudo, and capabilities.
- I know the practical difference between ext4 and XFS growth/shrink behavior.
- I can investigate DNS using both dig and getent.
- I can inspect a ruleset with nftables and still survive legacy iptables environments.
- I can use /proc and strace to make a stuck process less mysterious.
- I can perform a 60-second triage without immediately reaching for superstition.
Notes on Scope¶
This guide deliberately corrects and modernizes several common Linux-teaching mistakes:
- it treats nftables as the modern firewall model, while still covering legacy iptables
- it treats cgroup v2 as the modern baseline
- it distinguishes network.target from network-online.target
- it separates DNS testing from system name-resolution testing
- it includes ACLs, capabilities, sudo hygiene, AppArmor/SELinux, LUKS, and storage resize caveats that broad “Linux complete guides” often skip
That makes it less flashy than a “one doc explains literally everything forever” claim, but far more trustworthy.
Verification Notes¶
Modernized sections in this revision were checked against current upstream or vendor documentation for these areas:
- systemd service ordering and network-online.target behavior
- cgroup v2 unified hierarchy
- nftables as the modern Netfilter framework and iptables compatibility layers
- distro-specific GRUB regeneration workflows
- Secure Boot chain details, including initrd nuance and module validation
- systemd-resolved, resolvectl, and /etc/resolv.conf modes
- AppArmor, SELinux, ACL, capability, and visudo behavior
- logrotate scheduling via systemd timers on modern systems
That does not make every sentence timeless. Linux changes. But it removes the obvious stale landmines from the prior draft.