DevOps & Linux Phone Interview Cheat Sheet¶
One-page-ish quick reference for phone screens. Skim before the call, keep on screen during.
Linux Essentials¶
Process & Systemd¶
ps aux | grep <name> # Find process
ps -eo pid,%cpu,%mem,cmd --sort=-%cpu | head # Top consumers
kill -15 PID # SIGTERM (graceful)
kill -9 PID # SIGKILL (last resort)
systemctl status/start/stop/restart <svc>
systemctl enable <svc> # Start on boot
journalctl -u <svc> -f # Follow logs
journalctl -p err -b # Errors since boot
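The SIGTERM-then-SIGKILL escalation above can be wrapped in a small helper. This is a minimal sketch: the name `graceful_kill` and the 10-second default grace period are assumptions for illustration, not a standard tool.

```shell
# Hypothetical helper: send SIGTERM, wait up to N seconds, then fall back to SIGKILL.
# Usage: graceful_kill <pid> [grace_seconds]
graceful_kill() {
  pid=$1
  grace=${2:-10}                               # assumed default grace period
  kill -15 "$pid" 2>/dev/null || return 0      # already gone: nothing to do
  i=0
  while [ "$i" -lt "$grace" ]; do
    kill -0 "$pid" 2>/dev/null || return 0     # signal 0 only checks that the PID exists
    sleep 1
    i=$((i + 1))
  done
  kill -9 "$pid" 2>/dev/null                   # last resort: cannot be caught or ignored
}
```

For a systemd-managed service, `systemctl stop <svc>` is usually the better choice, since systemd applies its own stop timeout before escalating.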
Filesystem & Disk¶
df -h # Disk usage
df -i # Inode usage (hidden full-disk cause)
du -sh /var/log/* | sort -rh # Largest files and dirs under /var/log
lsblk # Block devices
mount | column -t # Mounts
find / -xdev -type f -size +100M # Large files
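Since a filesystem can be "full" on either blocks or inodes, it can help to run the same threshold check over both `df` views. A rough sketch, assuming GNU `df`, a hypothetical helper name `flag_full`, and mount points without spaces:

```shell
# Hypothetical helper: read `df -P` (or `df -Pi`) output on stdin and print
# every mount whose usage percentage is at or above the threshold.
# Usage: df -P | flag_full [threshold_percent]
flag_full() {
  awk -v limit="${1:-90}" 'NR > 1 {
    pct = $5; sub(/%/, "", pct)            # column 5 is Capacity / IUse%
    if (pct + 0 >= limit + 0) print $6, pct "%"
  }'
}

# Check block usage and inode usage against the same 90% threshold:
# df -P  | flag_full 90
# df -Pi | flag_full 90
```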
Memory & Performance¶
free -h # Memory overview
vmstat 1 5 # CPU/mem/swap snapshot
uptime # Load averages
iostat -xz 1 3 # Disk I/O
top -bn1 | head -20 # Quick process view
dmesg -T | tail # Kernel messages
USE Method — for each resource check Utilization, Saturation, Errors:
CPU: uptime → mpstat → pidstat
Memory: free → vmstat → slabtop
Disk: iostat → iotop → df -i
Network: sar -n DEV → ss → nstat
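As one concrete saturation check from the CPU row, the 1-minute load average can be compared with the core count. A minimal sketch with a hypothetical helper name `cpu_saturated`. Treat the result as a hint only, because Linux load averages also count tasks in uninterruptible I/O wait.

```shell
# Hypothetical helper: succeed (exit 0) when the load average exceeds the core count.
# Usage: cpu_saturated <load_average> <cores>
cpu_saturated() {
  awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores) }'
}

# On a live system:
# if cpu_saturated "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"; then
#   echo "load is above the core count; look at mpstat and pidstat next"
# fi
```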
Networking¶
ss -tlnp # Listening TCP ports
ip addr show # IP addresses
ip route show # Routing table
dig example.com +short # DNS lookup
curl -v telnet://host:port # Test connectivity
traceroute host # Trace path
iptables -L -n -v # Firewall rules
File Permissions¶
rwxrwxrwx = user / group / other
755 = rwxr-xr-x (dirs, scripts)
644 = rw-r--r-- (config files)
SUID (u+s) — run as file owner
SGID (g+s) — inherit group on dir
Sticky (+t) — only owner can delete
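Each octal digit is a sum of read (4), write (2), and execute (1). The conversion can be sketched in a few lines of shell. `octal_to_rwx` is a hypothetical helper for practice, and it ignores the special bits (SUID, SGID, sticky).

```shell
# Hypothetical helper: convert a three-digit octal mode into its rwx string.
octal_to_rwx() {
  mode=$1
  rwx=""
  for digit in $(printf '%s' "$mode" | sed 's/./& /g'); do
    r=-; w=-; x=-
    [ $((digit & 4)) -ne 0 ] && r=r   # 4 = read
    [ $((digit & 2)) -ne 0 ] && w=w   # 2 = write
    [ $((digit & 1)) -ne 0 ] && x=x   # 1 = execute
    rwx="$rwx$r$w$x"
  done
  printf '%s\n' "$rwx"
}

octal_to_rwx 755   # rwxr-xr-x
octal_to_rwx 644   # rw-r--r--
```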
Users & SSH¶
useradd -m -s /bin/bash user
usermod -aG sudo user
ssh-keygen -t ed25519
ssh -L 8080:localhost:80 user@remote # Local tunnel
ssh -R 8080:localhost:3000 user@remote # Remote tunnel
Package Management¶
# Debian/Ubuntu
apt update && apt upgrade
apt install <pkg>
dpkg -l | grep <pkg>
# RHEL/CentOS
dnf update
dnf install <pkg>
rpm -qa | grep <pkg>
Key Log Locations¶
Git (Quick Hits)¶
git status
git add -p # Stage hunks interactively
git commit -m "message"
git pull --rebase origin main # Linear history
git log --oneline -10
git stash / git stash pop
git revert HEAD # Safe undo on shared branch
git reset --soft HEAD~1 # Undo commit, keep changes
Docker¶
docker run -d --name app -p 8080:80 nginx:1.25
docker ps / docker ps -a
docker logs -f <container>
docker exec -it <container> bash
docker build -t myapp:v1 .
docker images / docker image prune -a
Dockerfile best practices: multi-stage builds, pin versions, non-root user, COPY not ADD, order layers by change frequency, use .dockerignore.
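To make those practices concrete, the sketch below writes out a small multi-stage Dockerfile from the shell. Everything specific in it is an assumption for illustration: a Go application, the `golang:1.22-alpine` and `alpine:3.19` tags, the `app` user with UID 10001, and the helper name `write_dockerfile`.

```shell
# Hypothetical helper: write an example multi-stage Dockerfile to the given path.
write_dockerfile() {
  cat > "${1:-Dockerfile}" <<'EOF'
# Stage 1: build with the full toolchain (pinned tag, not "latest")
FROM golang:1.22-alpine AS build
WORKDIR /src
# Dependency manifests change rarely, so copy them first to keep this layer cached
COPY go.mod go.sum ./
RUN go mod download
# Source changes often, so it comes last
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: minimal runtime image without compiler or source
FROM alpine:3.19
RUN adduser -D -u 10001 app
COPY --from=build /out/app /usr/local/bin/app
USER app
ENTRYPOINT ["/usr/local/bin/app"]
EOF
}

# Usage: write_dockerfile ./Dockerfile && docker build -t myapp:v1 .
```

A `.dockerignore` next to it (for example listing `.git` and local build output) keeps the `COPY . .` step from pulling unneeded files into the build context.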
Kubernetes¶
Top 5 kubectl commands¶
kubectl get pods -A -o wide # What's running
kubectl describe pod <name> # Events & conditions
kubectl logs <pod> --tail=50 -f # Stream logs
kubectl exec -it <pod> -- bash # Shell in
kubectl apply -f manifest.yaml # Declarative apply
Resource chain¶
Probes¶
| Probe | Purpose | On Failure |
|---|---|---|
| liveness | Is the process alive? | Container restarted |
| readiness | Can it serve traffic? | Removed from Service endpoints (no restart) |
| startup | Has it finished booting? | Container restarted; liveness and readiness checks are held off until it succeeds |
Service types¶
| Type | Scope |
|---|---|
| ClusterIP | Internal only (default) |
| NodePort | Expose on node port 30000-32767 |
| LoadBalancer | Cloud LB → NodePort → Pods |
Debugging flow¶
pod not starting? → kubectl describe pod (events)
pod crashing? → kubectl logs <pod> --previous
can't reach svc? → kubectl get endpoints (empty = selector mismatch)
slow/unhealthy? → check readiness probe, resource limits
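The flow above can be strung together as a first-pass triage script. A minimal sketch, assuming a hypothetical function name `k8s_triage` and that namespace, pod, and Service names are passed in by the caller:

```shell
# Hypothetical helper: run the first-pass checks for one pod and its Service.
# Usage: k8s_triage <namespace> <pod> <service>
k8s_triage() {
  ns=$1; pod=$2; svc=$3
  kubectl get pod "$pod" -n "$ns" -o wide                     # phase, restart count, node
  kubectl describe pod "$pod" -n "$ns"                        # events: scheduling, image pulls, probes
  kubectl logs "$pod" -n "$ns" --previous --tail=50 || true   # errors out if nothing has crashed yet
  kubectl get endpoints "$svc" -n "$ns"                       # empty list: selector or readiness problem
}
```

Example call: `k8s_triage default web-0 web` (placeholder names).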
Terraform¶
terraform init # Download providers
terraform plan # Dry run
terraform apply # Apply changes
terraform destroy # Tear down (careful!)
Key concepts: state file is source of truth, modules for reuse, plan -out=plan.tfplan for CI safety. HCL: resource, variable, output, data, module, locals.
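The point of `plan -out` is that CI applies exactly the plan that was reviewed instead of re-planning at apply time. A rough sketch of that sequence, with `tf_plan_and_apply` as a hypothetical wrapper name:

```shell
# Hypothetical CI wrapper: apply exactly the plan that was saved and reviewed.
tf_plan_and_apply() {
  terraform init -input=false || return 1
  # Save the plan so it can be stored as a pipeline artifact
  terraform plan -input=false -out=plan.tfplan || return 1
  # Human-readable copy of the saved plan for the review step
  terraform show -no-color plan.tfplan || return 1
  # Applying a saved plan file does not re-plan or ask for confirmation
  terraform apply -input=false plan.tfplan
}
```

In a real pipeline the plan and apply steps usually run as separate jobs with an approval gate between them.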
CI/CD¶
CI = every commit triggers build + test (catch bugs early). CD = Delivery (every change is deployable) vs Deployment (auto-deployed).
Pipeline stages: lint → test → build → push artifact → deploy staging → deploy prod.
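The key property of those stages is fail-fast ordering: a later stage never runs if an earlier one fails. A minimal sketch of that behavior, where `run_pipeline` is a hypothetical helper and the `make` targets in the usage line are placeholders:

```shell
# Hypothetical runner: execute each stage command in order, stop at the first failure.
run_pipeline() {
  for stage in "$@"; do
    echo "==> $stage"
    sh -c "$stage" || { echo "FAILED: $stage" >&2; return 1; }
  done
  echo "pipeline passed"
}

# Usage:
# run_pipeline "make lint" "make test" "make build" "make push" \
#              "make deploy-staging" "make deploy-prod"
```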
GitHub Actions key syntax:
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test
Observability (The Three Pillars)¶
| Pillar | Tool | What |
|---|---|---|
| Metrics | Prometheus + Grafana | Numeric time-series (CPU, latency, error rate) |
| Logs | Loki / ELK / CloudWatch | Event records, grep-able |
| Traces | Jaeger / Tempo / Datadog | Request path across services |
Golden signals (from Google SRE): latency, traffic, errors, saturation.
SLI/SLO/SLA:
- SLI = measurement (e.g. 99.2% of requests < 200ms)
- SLO = target (e.g. 99.5% of requests < 200ms)
- SLA = contract with consequences (e.g. refund if SLO missed)
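Using the example figures above (99.2% measured against a 99.5% target), the SLI is a ratio of good events to total events and the SLO check is a comparison. A small sketch with hypothetical helper names and made-up request counts:

```shell
# Hypothetical helpers: compute an SLI as a percentage and compare it with an SLO target.
sli_percent() {   # args: good_events total_events
  awk -v good="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * good / total }'
}

slo_met() {       # args: sli_percent slo_percent
  awk -v sli="$1" -v slo="$2" 'BEGIN { exit !(sli >= slo) }'
}

sli=$(sli_percent 9920 10000)   # assumed: 9920 of 10000 requests finished under 200 ms
if slo_met "$sli" 99.5; then
  echo "SLO met ($sli%)"
else
  echo "SLO missed ($sli% < 99.5%)"
fi
```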
Ansible (Quick Hits)¶
ansible-playbook -i inventory playbook.yml
ansible all -m ping # Connectivity check
ansible all -a "uptime" # Ad-hoc command
Networking Fundamentals¶
OSI layers to know: L3 (IP/routing), L4 (TCP/UDP), L7 (HTTP/DNS).
| Protocol | Port | Notes |
|---|---|---|
| SSH | 22 | Encrypted remote access |
| HTTP/HTTPS | 80/443 | Web traffic |
| DNS | 53 | Name resolution |
| SMTP | 25/587 | Mail relay (25), mail submission (587) |
TCP vs UDP: TCP = reliable, ordered, connection-based. UDP = fast, fire-and-forget (DNS, video, gaming).
DNS resolution: client → resolver cache → recursive resolver → root → TLD → authoritative.
Subnetting shorthand: /24 = 256 addresses (254 usable hosts), /16 = 65,536, /8 = about 16.7M. Private ranges (RFC 1918): 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16.
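The sizes follow from one formula: a prefix of length n leaves 32 - n host bits, so the block holds 2^(32 - n) addresses. A quick sketch, with `subnet_size` as a hypothetical helper name:

```shell
# Hypothetical helper: number of addresses in an IPv4 block with the given prefix length.
subnet_size() {
  echo $(( 1 << (32 - $1) ))
}

subnet_size 24   # 256
subnet_size 16   # 65536
subnet_size 8    # 16777216
subnet_size 12   # 1048576 (the size of 172.16.0.0/12)
```

For a conventional subnet, subtract 2 (network and broadcast addresses) to get the usable host count.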
Common Interview Questions — Quick Answers¶
"Walk me through what happens when you type a URL in a browser." DNS lookup → TCP handshake → TLS handshake → HTTP request → server processes → HTTP response → browser renders.
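Most of those phases can be observed with curl's `--write-out` timers, which is a handy way to back the answer with evidence. A minimal sketch, where `url_phases` is a hypothetical helper name. Each timer is cumulative from the start of the request.

```shell
# Hypothetical helper: print how far into the request each phase completed, in seconds.
# dns = name lookup done, tcp = TCP connect done, tls = TLS handshake done,
# first_byte = first response byte received, total = transfer finished.
url_phases() {
  curl -s -o /dev/null \
    -w 'dns=%{time_namelookup} tcp=%{time_connect} tls=%{time_appconnect} first_byte=%{time_starttransfer} total=%{time_total}\n' \
    "$1"
}

# Usage: url_phases https://example.com
```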
"How do you troubleshoot a slow Linux server?" USE method: uptime (CPU), free (memory), iostat (disk I/O), ss (network). Check dmesg, journalctl for errors. Identify top consumers with top/htop.
"Explain the difference between a container and a VM." VM = full OS + hypervisor, heavy, minutes to boot. Container = shares host kernel, isolated via namespaces + cgroups, lightweight, seconds to start.
"What is Infrastructure as Code?" Manage infra through version-controlled config files instead of manual changes. Benefits: repeatability, audit trail, collaboration, disaster recovery. Tools: Terraform, Ansible, Pulumi, CloudFormation.
"Describe a CI/CD pipeline you've built." Push → lint + unit tests → build Docker image → push to registry → deploy to staging → integration tests → promote to prod (canary or blue-green).