Container Base Images — Footguns & Pitfalls¶
1. Using :latest Tag in Production¶
# WRONG — "latest" today ≠ "latest" tomorrow
FROM python:latest
# RIGHT — pin the version
FROM python:3.12.4-slim-bookworm
# ALSO WRONG — floating tag
FROM python:3.12-slim
# Might get 3.12.4 today, 3.12.5 tomorrow, with breaking changes
Pin to the most specific tag you can tolerate. Rebuild intentionally.
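For fully reproducible rebuilds you can additionally pin by digest. A sketch (the digest below is a placeholder; resolve the real one with `docker buildx imagetools inspect python:3.12.4-slim-bookworm`):

```dockerfile
# Tag for humans, digest for the resolver (placeholder digest, substitute your own)
FROM python:3.12.4-slim-bookworm@sha256:<digest-from-imagetools-inspect>
```

A digest pin never moves, even if the tag is re-pushed; the tag stays alongside it as documentation.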
2. Alpine + Python C Extensions = Pain¶
# Takes forever and might fail
FROM python:3.12-alpine
RUN pip install pandas numpy cryptography
# These all have C extensions that need compilation on Alpine
# Few pre-built musl wheels exist (most wheels target glibc)
# Build takes 10-30 minutes vs seconds on Debian
# Fix: use slim instead
FROM python:3.12-slim
RUN pip install pandas numpy cryptography # pre-built wheels, seconds
3. Running as Root¶
# WRONG — container runs as root by default
FROM python:3.12-slim
COPY . /app
CMD ["python", "app.py"] # running as root!
# RIGHT — create and switch to non-root user
FROM python:3.12-slim
RUN useradd -r -s /usr/sbin/nologin appuser
COPY --chown=appuser:appuser . /app
USER appuser
CMD ["python", "app.py"]
Running as root in containers means a container escape = host root access.
Kubernetes runAsNonRoot: true will reject root containers.
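On the Kubernetes side, a sketch of enforcing this in the pod spec; the UID 10001 is arbitrary for illustration and must correspond to a non-root user in the image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  securityContext:
    runAsNonRoot: true        # kubelet refuses to start containers running as root
    runAsUser: 10001          # arbitrary non-root UID, shown for illustration
  containers:
    - name: myapp
      image: myapp:1.0.0
      securityContext:
        allowPrivilegeEscalation: false
```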
4. apt-get update and install in Separate Layers¶
# WRONG — update layer gets cached, install uses stale index
RUN apt-get update
RUN apt-get install -y curl # might fail with "package not found"
# RIGHT — same layer
RUN apt-get update && apt-get install -y --no-install-recommends \
curl && rm -rf /var/lib/apt/lists/*
5. Leaving Build Tools in Production Image¶
# WRONG — gcc, make, headers all in production image
FROM python:3.12-slim
RUN apt-get update && apt-get install -y gcc libpq-dev
RUN pip install psycopg2
# gcc and headers are still here — +200MB of attack surface
# RIGHT — multi-stage build
FROM python:3.12-slim AS build
RUN apt-get update && apt-get install -y gcc libpq-dev
RUN pip install --prefix=/install psycopg2
FROM python:3.12-slim
RUN apt-get update && apt-get install -y --no-install-recommends libpq5 \
&& rm -rf /var/lib/apt/lists/*
COPY --from=build /install /usr/local
6. Distroless + Debugging Emergency¶
# Production is crashing, you need to debug, but...
kubectl exec -it pod/myapp -- sh
# Error: OCI runtime exec failed: exec failed: unable to start container
# process: exec: "sh": executable file not found
# There's NO SHELL in distroless. You can't exec in.
# Emergency options:
# 1. Deploy debug sidecar
kubectl debug -it pod/myapp --image=busybox --target=myapp
# 2. Swap to debug variant temporarily
# gcr.io/distroless/base-debian12:debug
# 3. Check logs (always available)
kubectl logs pod/myapp
Plan your debugging story BEFORE going distroless.
7. Ignoring .dockerignore¶
# Without .dockerignore, COPY . . sends everything to the daemon:
# .git/ — hundreds of MB
# node_modules/ — hundreds of MB
# .env — SECRETS!
# tests/ — unnecessary in production
# .venv/ — Python virtual env
# Result: bloated image, slow builds, leaked secrets
Always create a .dockerignore.
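A starting-point .dockerignore for a typical Python project (adjust to your layout):

```
# .dockerignore: keep the build context lean and secret-free
.git
.venv
node_modules
__pycache__
*.pyc
.env*
tests/
docs/
*.md
```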
8. Alpine DNS in Kubernetes¶
Alpine's musl resolver doesn't handle ndots:5 (Kubernetes default) well:
- Queries all search domains for every DNS lookup
- Can cause 10x more DNS queries than Debian-based containers
- Under load, can overwhelm CoreDNS
Workarounds exist (lower ndots via the pod's dnsConfig, or use fully qualified domain names ending in a dot), or just use Debian-slim and avoid the issue entirely.
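If you must stay on Alpine, lowering ndots in the pod spec is a common mitigation. A sketch; tune the value for your cluster:

```yaml
# Pod spec fragment: with ndots=2, names containing 2+ dots
# (e.g. api.example.com) are resolved absolutely first,
# skipping the search-domain expansion storm
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"
```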
9. Timezone Data Missing¶
# Alpine and scratch-based images don't include timezone data
# time.LoadLocation("America/New_York") fails in Go; Python's zoneinfo raises too!
# Alpine fix:
RUN apk add --no-cache tzdata
# Debian-slim fix:
RUN apt-get update && apt-get install -y --no-install-recommends tzdata \
&& rm -rf /var/lib/apt/lists/*
# Set timezone:
ENV TZ=America/New_York
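A quick Python-side sanity check that the IANA database is actually present in the image; this sketch uses only the stdlib zoneinfo module (Python 3.9+):

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

def tz_available(name: str) -> bool:
    """Return True if the IANA timezone entry can be loaded in this image."""
    try:
        ZoneInfo(name)
        return True
    except ZoneInfoNotFoundError:
        return False

# On an image missing tzdata this prints False; install the OS tzdata
# package (or the 'tzdata' wheel from PyPI) to fix it.
print(tz_available("America/New_York"))
```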
10. Assuming scratch = Secure¶
scratch is empty, but that doesn't mean your binary is secure:
- Your Go/Rust binary still has CVEs in its dependencies
- No CA certificates = HTTPS calls fail
- No /tmp directory = apps that need temp files break
- No user database = can't look up UIDs
- No /etc/hosts, /etc/resolv.conf = DNS may fail
FROM scratch
# Minimum you usually need:
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /etc/passwd /etc/passwd
COPY --from=build /tmp /tmp
COPY --from=build /server /server
11. Not Rebuilding After Base Image CVE¶
# Your image was built 3 months ago
# The base image has had 5 security patches since then
# Your image still has the old, vulnerable base
# Fix: rebuild regularly
docker build --pull --no-cache -t myapp:latest .
# --pull re-pulls the base image; --no-cache rebuilds every layer
# Better: automate with Dependabot or Renovate
# Best: weekly CI rebuild pipeline (see street_ops)
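A minimal weekly-rebuild pipeline sketch for GitHub Actions; the names and schedule are illustrative, adapt to your CI:

```yaml
name: weekly-rebuild
on:
  schedule:
    - cron: '0 4 * * 1'    # Mondays 04:00 UTC
  workflow_dispatch: {}    # allow manual runs too
jobs:
  rebuild:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # --pull re-pulls the base image; --no-cache rebuilds every layer
      - run: docker build --pull --no-cache -t myapp:latest .
```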
12. UBI microdnf Pitfalls¶
# UBI minimal uses microdnf, not dnf
# microdnf has fewer features:
# - No dnf history
# - No dnf groups
# - No module streams
# - Fewer options
# WRONG
RUN dnf install -y curl # dnf not installed in ubi-minimal!
# RIGHT
RUN microdnf install -y curl && microdnf clean all
Image Optimization Footguns¶
13. Copying Entire Build Context Before Installing Dependencies¶
# BAD: Any source change invalidates the pip install cache
COPY . .
RUN pip install -r requirements.txt
# GOOD: Copy dependency file first, install, then copy source
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
14. No .dockerignore¶
Without .dockerignore, the build context includes .git/ (often hundreds of MB), node_modules/, __pycache__/, .env files, and test data. CI builds are slow and images contain unnecessary files.
Fix: Create .dockerignore excluding: .git, node_modules, __pycache__, *.pyc, .env*, tests/, docs/, *.md.
15. Secrets Baked Into Build Arguments¶
ARG DB_PASSWORD=secret in the Dockerfile is visible in docker history. Multi-stage builds only help if the arg never reaches the final stage; any build arg used there is recorded in image metadata.
Fix: Use BuildKit secrets: RUN --mount=type=secret,id=dbpass cat /run/secrets/dbpass. Never pass secrets as build args.
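A sketch of the BuildKit flow; the secret id dbpass, the source file, and setup-db.sh are all illustrative:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
COPY . /app
WORKDIR /app
# The secret is mounted only for the duration of this RUN;
# it never lands in a layer or in image metadata
RUN --mount=type=secret,id=dbpass \
    DB_PASSWORD="$(cat /run/secrets/dbpass)" ./setup-db.sh
```

Build with `docker build --secret id=dbpass,src=./dbpass.txt .` so the secret is supplied at build time without ever entering the Dockerfile.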
16. Not Squashing Layers After Large File Add/Delete¶
You COPY large-file.tar.gz ., extract it, then RUN rm large-file.tar.gz. The file still exists in an earlier layer. The image is bloated.
Fix: Download, extract, and clean up in a single RUN. Or use multi-stage builds where the large file exists only in the build stage.
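The single-RUN version looks like this; the URL and paths are placeholders:

```dockerfile
# Fetch, extract, and delete in ONE layer, so nothing lingers in history
RUN curl -fsSL https://example.com/large-file.tar.gz -o /tmp/large-file.tar.gz \
    && tar -xzf /tmp/large-file.tar.gz -C /opt \
    && rm /tmp/large-file.tar.gz
```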
Image Scanning Footguns¶
17. Scanning Only on Push to Main¶
A critical CVE enters through a PR that was never scanned. By the time the main-branch scan catches it, the vulnerable image is already deployed.
Fix: Run scans on every PR, not just main. Use --exit-code 1 --severity CRITICAL,HIGH to block merges.
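A minimal PR-scan job sketch for GitHub Actions using aquasecurity/trivy-action; the image name and tag are illustrative:

```yaml
name: pr-scan
on: pull_request
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t myapp:${{ github.sha }} .
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          exit-code: '1'            # fail the check on findings
          severity: CRITICAL,HIGH
```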
18. Ignoring All Unfixed Vulnerabilities Unconditionally¶
You use --ignore-unfixed globally. CVEs with no packaged fix stay invisible indefinitely, so you never evaluate the workarounds (dropping the affected package, swapping base images) that could mitigate them.
Fix: Use --ignore-unfixed as a CI convenience but run periodic full scans (weekly) without it. Review unfixed findings for available workarounds.
19. No Registry Scanning: Only Point-in-Time CI Scans¶
You scan images at build time. A new CRITICAL CVE is published the next day for a package in your running images. You don't know until someone rebuilds.
Fix: Enable continuous registry scanning (ECR Enhanced Scanning, Harbor with Trivy). Alert on new findings for deployed images.
20. Missing the --scanners secret Flag¶
Trivy can detect secrets (AWS keys, private keys, passwords) baked into image layers, but many teams run trivy image configured for vulnerability scanning only (in older Trivy releases, the secret scanner was not enabled by default under the old --security-checks flag). Secrets pass through undetected.
Fix: trivy image --scanners vuln,secret myapp:latest.