Containers Deep Dive - Street-Level Ops

Real debugging workflows for container problems you will actually hit in production.


Debugging Container Startup Failures

The container starts and immediately exits. docker ps shows nothing. This is the most common container problem.

# Step 1: Check if the container existed and exited
docker ps -a | grep <name>
# STATUS: Exited (1) 3 seconds ago

# Step 2: Read the logs
docker logs <container>
docker logs --tail 50 <container>

# Step 3: If logs are empty, the process crashed before writing anything
# Inspect the container config
docker inspect <container> | jq '.[0].Config'
# Check: Entrypoint, Cmd, Env, WorkingDir

# Step 4: Run interactively to see what happens
docker run -it --entrypoint /bin/sh <image>
# Now you're inside the container. Try running the entrypoint manually:
/path/to/entrypoint arg1 arg2
# You'll see the actual error

# Step 5: Check if it's a permissions issue
docker run -it --entrypoint /bin/sh <image>
ls -la /app/entrypoint.sh
# Is it executable? Does the USER have access?

# Step 6: Check if it's a missing dependency
docker run -it --entrypoint /bin/sh <image>
ldd /app/mybinary
# "not found" entries = missing shared libraries

Common startup failure causes

Symptom         Likely cause                       Fix
Exit code 1     Application error                  Read logs, run interactively
Exit code 126   Permission denied on entrypoint    chmod +x, check USER
Exit code 127   Entrypoint binary not found        Wrong path, missing binary, wrong base image
Exit code 137   OOMKilled (SIGKILL from kernel)    Increase memory limit or fix memory leak
Exit code 139   Segfault (SIGSEGV)                 Binary incompatibility, missing libs
Exit code 143   SIGTERM (graceful shutdown)        Something sent SIGTERM immediately
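
The pattern behind this table: any exit code above 128 means the process died from a signal, and the signal number is the code minus 128. A quick check:

```shell
# Exit codes above 128 encode a fatal signal: code = 128 + signal number
code=137
sig=$((code - 128))
echo "killed by signal $sig"   # 137 - 128 = 9, i.e. SIGKILL
```

`kill -l 9` prints the signal name if you don't have the table handy.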

Container Eating All Memory

The container gets OOMKilled. Kubernetes shows OOMKilled in pod status. Docker shows exit code 137.

# Step 1: Confirm the OOM kill
docker inspect <container> | jq '.[0].State'
# Look for "OOMKilled": true

# On Kubernetes
kubectl describe pod <pod> | grep -A5 "Last State"
# Reason: OOMKilled

# Step 2: Check what the memory limit was
docker inspect <container> | jq '.[0].HostConfig.Memory'
# Returns bytes. 0 = unlimited.
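
The raw byte count is easier to sanity-check after converting to MiB:

```shell
# Convert the byte value from docker inspect to MiB
LIMIT=536870912   # example value returned by the command above
echo "$((LIMIT / 1024 / 1024)) MiB"   # 512 MiB
```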

# Step 3: Check what the container was actually using before it died
# (Only works if you catch it before the kill)
docker stats <container> --no-stream
# MEM USAGE / LIMIT

# Step 4: Check cgroup memory stats (paths assume cgroup v2)
# HostConfig.CgroupParent is often empty; with the systemd cgroup driver the
# container's cgroup typically lives at system.slice/docker-<id>.scope
CID=$(docker inspect --format '{{.Id}}' <container>)
cat /sys/fs/cgroup/system.slice/docker-${CID}.scope/memory.current
cat /sys/fs/cgroup/system.slice/docker-${CID}.scope/memory.max
cat /sys/fs/cgroup/system.slice/docker-${CID}.scope/memory.events
# Look for the "oom" and "oom_kill" counters

# Step 5: Check dmesg for kernel OOM messages
dmesg | grep -i "oom\|killed process"
# Shows which process was killed, its RSS, and the cgroup

# Step 6: Profile memory inside the container
docker exec -it <container> sh
# For Java: jcmd <pid> GC.heap_info (jmap -heap <pid> on JDK 8)
# For Python: tracemalloc, memory_profiler
# For Go: go tool pprof http://localhost:<port>/debug/pprof/heap
# Generic: cat /proc/<pid>/smaps_rollup

Setting memory limits correctly

# Docker: set memory limit with swap disabled
docker run --memory=512m --memory-swap=512m myapp
# memory-swap = memory + swap. Setting equal to memory = no swap.

# Docker: set soft limit (reclaim pressure, no kill)
docker run --memory=512m --memory-reservation=256m myapp

# Kubernetes equivalent
resources:
  requests:
    memory: "256Mi"   # scheduling, soft guarantee
  limits:
    memory: "512Mi"   # hard limit, OOM kill above this

Debugging Network Connectivity Between Containers

Container A cannot reach container B. DNS fails, connections time out, or connections are refused.

# Step 1: Are they on the same network?
docker inspect <containerA> | jq '.[0].NetworkSettings.Networks'
docker inspect <containerB> | jq '.[0].NetworkSettings.Networks'
# If different networks, they can't talk (unless you connect them)

# Step 2: Can they resolve each other?
# DNS only works on user-defined networks, NOT the default bridge
docker exec <containerA> nslookup <containerB>
# If this fails: either wrong network or the target name is wrong

# Step 3: Can they ping?
docker exec <containerA> ping -c3 <containerB-ip>
# If ping works but the app doesn't, the service isn't listening

# Step 4: Is the service actually listening?
docker exec <containerB> ss -tlnp
# or: docker exec <containerB> netstat -tlnp
# Check: is the process bound to 0.0.0.0:<port> or 127.0.0.1:<port>?
# If 127.0.0.1, it only accepts local connections — change to 0.0.0.0

# Step 5: Check iptables rules (from host)
iptables -L -n -v | grep <container-ip>
iptables -t nat -L -n | grep <container-ip>

# Step 6: Packet capture
docker exec <containerA> tcpdump -i eth0 host <containerB-ip> -c 20
# Or from the host using nsenter:
PID=$(docker inspect --format '{{.State.Pid}}' <containerA>)
nsenter -t $PID -n tcpdump -i eth0 -c 20

Common network issues

Symptom                 Cause                                       Fix
nslookup fails          Default bridge network (no DNS)             Use a user-defined network
Connection refused      Service not running or wrong port           Check ss -tlnp inside the container
Connection timeout      Wrong network or iptables blocking          Check network membership, iptables
Intermittent failures   DNS cache, container restarting (new IP)    Use service names, not IPs

Finding What's Inside a Running Container

The container has no shell, no tools, no package manager. You need to inspect it anyway.

# Method 1: docker exec (if shell exists)
docker exec -it <container> /bin/sh
docker exec -it <container> /bin/bash

# Method 2: docker exec with specific commands (no shell needed)
docker exec <container> cat /etc/os-release
docker exec <container> ls -la /app/
docker exec <container> env

# Method 3: nsenter from host (works even if container has no tools)
PID=$(docker inspect --format '{{.State.Pid}}' <container>)

# Enter all namespaces
nsenter -t $PID -m -u -i -n -p -- /bin/sh

# Enter just the filesystem namespace (browse container filesystem from host)
nsenter -t $PID -m -- ls -la /app/

# Enter just the network namespace (use host tools to debug container networking)
nsenter -t $PID -n -- ss -tlnp
nsenter -t $PID -n -- ip addr

# Method 4: Copy files out
docker cp <container>:/app/config.yaml ./config.yaml
docker cp <container>:/var/log/ ./container-logs/

# Method 5: Export the entire filesystem
docker export <container> -o container-fs.tar
tar -tf container-fs.tar | head -50

# Method 6: Inspect /proc from host
ls -la /proc/$PID/root/    # container's root filesystem
cat /proc/$PID/environ | tr '\0' '\n'   # environment variables
cat /proc/$PID/cmdline | tr '\0' ' '    # command line
ls -la /proc/$PID/fd/      # open file descriptors
cat /proc/$PID/maps        # memory mappings
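
These /proc files are NUL-delimited, which is why the tr calls above are needed. The format is easy to try on any process you own; /proc/self works as a stand-in (DEMO_VAR is just an example variable):

```shell
export DEMO_VAR=hello
sh -c "tr '\0' '\n' < /proc/self/environ" | grep '^DEMO_VAR='
# DEMO_VAR=hello
```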

Investigating Image Size

Your image is 2GB. CI takes forever. Deploys are slow. Time to figure out where the space went.

# Step 1: Check layer sizes
docker history <image>
# Shows each layer, the command that created it, and its size
# Largest layers are your targets

# Step 2: Use dive for interactive layer exploration
# Install: https://github.com/wagoodman/dive
dive <image>
# Shows: layer contents, wasted space, efficiency score
# Navigate layers and see exactly which files each added

# Step 3: Check for obvious waste
# Wrap in sh -c so the glob and pipe run inside the container, not on the host
docker run --rm <image> sh -c 'du -sh /* 2>/dev/null | sort -rh | head -10'
# Common culprits:
# /usr/lib/  — system libraries from full base image
# /var/cache/ — package manager cache not cleaned
# /root/.cache/ — pip/npm cache left behind
# /app/.git/ — git history copied into image

# Step 4: Analyze with skopeo (without pulling)
skopeo inspect docker://registry.example.com/myimage:latest
# Shows layers with sizes before you pull them

# Step 5: Common fixes
# Switch base image: debian → debian-slim → alpine → distroless → scratch
# Multi-stage build: build in one stage, copy only artifacts to final
# Clean up in same RUN layer: apt-get clean, rm -rf /var/lib/apt/lists/*
# Use .dockerignore: exclude .git, node_modules, __pycache__, tests
# Pin slim variants: python:3.11-slim instead of python:3.11
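
The multi-stage fix sketched as a Dockerfile, assuming a hypothetical Go service (stage names, paths, and the ./cmd/app package are placeholders):

```dockerfile
# Build stage: full toolchain, never shipped
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

# Final stage: only the static binary on a distroless base
FROM gcr.io/distroless/static
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```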

Size comparison of common base images

Image                          Compressed    Uncompressed
scratch                        0 MB          0 MB
gcr.io/distroless/static       ~2 MB         ~5 MB
alpine:3.19                    ~3 MB         ~7 MB
debian:bookworm-slim           ~25 MB        ~75 MB
ubuntu:24.04                   ~28 MB        ~78 MB
python:3.11-slim               ~45 MB        ~120 MB
python:3.11                    ~350 MB       ~900 MB
node:20                        ~360 MB       ~1 GB

Container Security Scanning

Find vulnerabilities before they hit production.

# Trivy — comprehensive scanner (CVEs, secrets, misconfig)
trivy image myapp:latest
trivy image --severity HIGH,CRITICAL myapp:latest
trivy image --ignore-unfixed myapp:latest   # only fixable CVEs
trivy fs /path/to/project                   # scan source code
trivy config .                              # scan IaC files

# Grype — Anchore's vulnerability scanner
grype myapp:latest
grype myapp:latest --only-fixed             # only fixable CVEs
grype myapp:latest -o json                  # JSON output for CI

# Scan in CI pipeline
# Exit code 1 if critical vulns found
trivy image --exit-code 1 --severity CRITICAL myapp:latest

# Scan a running container's filesystem (not just the image)
docker export <container> -o container-fs.tar
mkdir rootfs && tar -xf container-fs.tar -C rootfs
trivy rootfs ./rootfs

# Check for secrets in image layers
trivy image --scanners secret myapp:latest

# Compare scans over time
trivy image -f json -o scan-$(date +%F).json myapp:latest

Interpreting scan results

CVE-2024-1234 (CRITICAL) — openssl 3.0.1 → 3.0.13 (fixed)
  → Action: update base image or pin fixed version

CVE-2024-5678 (HIGH) — libcurl 7.88 (no fix available)
  → Action: evaluate if exploitable in your context. If not, suppress.

CVE-2024-9999 (MEDIUM) — zlib 1.2.11 → 1.2.13 (fixed)
  → Action: schedule for next base image update
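
Suppressions for Trivy go in a `.trivyignore` file at the project root, one CVE ID per line. A hypothetical entry for the unfixable case above:

```
# .trivyignore
# libcurl CVE: not exploitable here (no outbound connections to untrusted hosts)
CVE-2024-5678
```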

Cleaning Up Disk Space

Docker disk usage grows silently. One day the build fails or the host runs out of space.

# Step 1: See what's using space
docker system df
# TYPE            TOTAL   ACTIVE   SIZE      RECLAIMABLE
# Images          45      5        12.5GB    10.2GB (81%)
# Containers      12      3        1.2GB     800MB (66%)
# Local Volumes   8       4        5.3GB     2.1GB (39%)
# Build Cache     -       -        3.8GB     3.8GB

# Step 2: Nuclear option — reclaim everything unused
docker system prune -a --volumes
# Removes: stopped containers, unused networks, dangling AND unused images,
# unused volumes, build cache
# WARNING: this removes ALL images not used by a running container

# Step 3: Targeted cleanup
# Remove dangling images (untagged, not used by any container)
docker image prune

# Remove all unused images (not just dangling)
docker image prune -a

# Remove stopped containers
docker container prune

# Remove unused volumes (WARNING: data loss if volume has useful data)
docker volume prune

# Remove build cache
docker builder prune --all

# Step 4: Find the biggest offenders
docker images --format '{{.Size}}\t{{.Repository}}:{{.Tag}}' | sort -rh | head -20
docker system df -v | head -40

# Step 5: Automated cleanup (cron or systemd timer)
# Clean images older than 24 hours, keep running containers' images
docker image prune -a --filter "until=24h"
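
Dropped into cron, that might look like this (hypothetical schedule and retention window, tune both):

```
# /etc/cron.d/docker-prune: daily cleanup at 03:00
0 3 * * * root docker image prune -a --force --filter "until=24h" > /dev/null 2>&1
```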

# Step 6: Check Docker's data directory size
du -sh /var/lib/docker/
du -sh /var/lib/docker/overlay2/
du -sh /var/lib/docker/volumes/

Debugging Build Cache Misses

Your build is slow because layers keep rebuilding when you think they should be cached.

# Step 1: Build with progress output
docker build --progress=plain -t myapp .
# Look for "CACHED" vs "RUN" in the output
# The first non-cached layer and everything after it rebuilds

# Step 2: Common cache busters
# - COPY . . before RUN pip install (any file change invalidates)
# - ADD with a URL (always re-fetched)
# - An ARG value change (invalidates every instruction that uses it, and everything after)
# - Different build context (even .git changes)
# - BuildKit cache key includes file permissions and ownership

# Step 3: Make sure BuildKit is in use
# BuildKit keys COPY/ADD cache on content hashes and is far more reliable
# than the legacy builder. Ensure it's enabled:
DOCKER_BUILDKIT=1 docker build .

# Step 4: Use cache mounts to preserve package manager state
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
# pip cache persists between builds, even if the layer is invalidated

# Step 5: External cache for CI (registry cache backends require buildx)
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/myapp:cache \
  --cache-to type=registry,ref=registry.example.com/myapp:cache \
  -t myapp .

# Step 6: Debug what changed
# Check how much build context you're sending
docker build --progress=plain --no-cache -t myapp . 2>&1 | grep "transferring context"
# If it's hundreds of MB, you need a better .dockerignore
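
A starting-point `.dockerignore` covering the usual offenders (trim or extend for your project):

```
.git
node_modules
__pycache__
*.pyc
tests/
*.log
```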

Investigating OOMKilled Containers

The container was killed by the kernel's OOM killer. It did not crash on its own.

# Step 1: Confirm OOMKilled
docker inspect <container> | jq '.[0].State.OOMKilled'
# true

# Kubernetes
kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[0].lastState}'
# reason: OOMKilled

# Step 2: Check kernel messages
dmesg | tail -50
# Look for: "Memory cgroup out of memory: Killed process <pid>"
# Shows: the exact process, its RSS, and the cgroup limit

# Step 3: Was the limit too low or is there a leak?
# Check the configured limit vs actual usage pattern
docker stats <container> --no-stream
# If MEM USAGE is steadily climbing toward LIMIT, it's a leak
# If USAGE spikes, it might be a burst workload needing more headroom

# Step 4: Profile memory in the next run
# Java: JAVA_TOOL_OPTIONS is read by the JVM itself, no entrypoint plumbing needed
docker run -e JAVA_TOOL_OPTIONS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp" myapp
# Then grab the dump (docker cp does not expand globs; use the exact filename)
docker cp <container>:/tmp/java_pid<pid>.hprof .

# Go
# Hit /debug/pprof/heap during normal operation to capture baseline
curl http://localhost:6060/debug/pprof/heap > heap.prof
go tool pprof heap.prof

# Python
# Use tracemalloc or memory_profiler
# Add to entrypoint: python -X tracemalloc=10 app.py

# Step 5: Check if it's kernel memory overhead
cat /sys/fs/cgroup/<path>/memory.stat
# Look for: slab, kernel_stack, sock, percpu
# These count against the cgroup limit but aren't visible in the app

Docker Daemon Troubleshooting

The Docker daemon is slow, unresponsive, or won't start.

# Step 1: Check daemon status
systemctl status docker
journalctl -u docker --since "10 minutes ago" --no-pager

# Step 2: Daemon won't start
# Check config syntax
cat /etc/docker/daemon.json | python3 -m json.tool
# Common issue: trailing comma in JSON, invalid option name

# Step 3: Daemon is slow
# Check how many containers exist (including stopped)
docker ps -aq | wc -l
# Hundreds of stopped containers slow everything down

# Check disk I/O
iostat -x 1 3
# If /var/lib/docker is on a slow disk, everything suffers

# Check if overlay2 is fragmented
du -sh /var/lib/docker/overlay2/* | sort -rh | head -5

# Step 4: Daemon is unresponsive (hung)
# Send SIGUSR1 to dump goroutine stacks
kill -USR1 $(pidof dockerd)
# Check journal for the stack dump
journalctl -u docker --since "1 minute ago" | grep -A 100 "goroutine"

# Step 5: Check daemon event log
docker events --since 10m
# Shows: container start/stop/die, image pull, network events

Migrating from Docker to containerd

Kubernetes dropped Docker support in v1.24. If your nodes still run Docker, you need to migrate to containerd (or CRI-O).

# Step 1: Check current runtime
kubectl get nodes -o wide
# CONTAINER-RUNTIME column shows docker:// or containerd://

# On the node:
crictl info
# Shows the runtime endpoint and version

# Step 2: Pre-migration checks
# List all images on the node (you'll need them after migration)
docker images --format '{{.Repository}}:{{.Tag}}'
crictl images

# Step 3: The migration process (per node)
# 1. Cordon the node (prevent new pods)
kubectl cordon <node>

# 2. Drain the node (evict existing pods)
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data

# 3. Stop Docker, configure containerd
systemctl stop docker
# Edit /etc/containerd/config.toml
# Set: [plugins."io.containerd.grpc.v1.cri"] sandbox_image, ...

# 4. Update kubelet config
# Edit /var/lib/kubelet/kubeadm-flags.env or kubelet config
# Change --container-runtime-endpoint to containerd's socket
# unix:///run/containerd/containerd.sock

# 5. Restart containerd and kubelet
systemctl restart containerd
systemctl restart kubelet

# 6. Uncordon
kubectl uncordon <node>

# 7. Verify
kubectl get node <node> -o wide
# CONTAINER-RUNTIME should show containerd://

# Step 4: Post-migration
# Docker CLI no longer works for container management on the node
# Use crictl for node-level container inspection
# Use nerdctl if you want a Docker-like CLI for containerd

crictl vs docker cheat sheet

Docker command             crictl equivalent
docker ps                  crictl ps
docker images              crictl images
docker inspect <id>        crictl inspect <id>
docker logs <id>           crictl logs <id>
docker exec -it <id> sh    crictl exec -it <id> sh
docker pull <image>        crictl pull <image>
docker stop <id>           crictl stop <id>
docker rm <id>             crictl rm <id>
docker stats               crictl stats
docker info                crictl info

Runtime Debugging Patterns

Pattern: Debugging CrashLoopBackOff

When a container keeps crashing, you can't exec into it:

# Method 1: Get previous logs
kubectl logs <pod> -n <ns> --previous

# Method 2: Copy the pod with a different command
kubectl debug <pod> -n <ns> --copy-to=debug-pod \
  --container=<container> -- sleep 3600
kubectl exec -it debug-pod -n <ns> -- sh

# Method 3: Use crictl from the node
kubectl debug node/<node> -it --image=busybox
chroot /host
CONTAINER=$(crictl ps -a --name <container> -q | head -1)
crictl logs $CONTAINER

Pattern: Network Debug from Container Context

# Use netshoot ephemeral container
kubectl debug -it pod/myapp -n <ns> --image=nicolaka/netshoot -- bash

# DNS resolution
nslookup service.namespace.svc.cluster.local
# TCP connectivity
curl -v http://service:8080/health
nc -zv postgres 5432
# Packet capture
tcpdump -i eth0 -n port 8000 -c 20

Emergency: Container Runtime Not Responding

# Check containerd status
systemctl status containerd
journalctl -u containerd --since "10 minutes ago" --no-pager

# Check disk space (containerd needs space for layers)
df -h /var/lib/containerd/

# Clean up unused images
crictl rmi --prune

# Restart containerd (last resort - restarts all containers on node)
systemctl restart containerd

Quick Reference