Portal | Level: L2: Operations | Topics: Docker / Containers, Container Image Optimization (alias → container_images) | Domain: DevOps & Tooling
Docker Image Internals¶
Scope¶
This document explains Docker image internals and the related OCI image model. It covers:
- layers
- manifests
- config objects
- storage drivers / snapshotters
- overlayfs mental model
- tagging vs digests
- build cache implications
- runtime writable layer
- common misconceptions and operational consequences
Big picture¶
Conceptually, a Docker image is not a single opaque blob. It is a content-addressed bundle of metadata plus one or more filesystem layers.
Simplified model¶
image reference (name:tag or digest)
-> manifest
-> config object
-> ordered layer blobs
-> local unpack/snapshot storage
-> container runtime mounts layers + writable layer
An image is build-time packaged state. A container is runtime execution state derived from it.
Image identity: tag vs digest¶
Tags¶
Examples:
- nginx:latest
- ubuntu:24.04
Tags are mutable pointers. They are convenient names, not immutable truth.
Digests¶
Example form:
sha256:...
Digests identify content immutably.
Why this matters¶
If you deploy by tag:
- the same text may point to different content later
If you deploy by digest:
- you know exactly which bytes you meant
For reproducible infrastructure, digests are king and tags are rumors.
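Content addressing can be seen without any registry at all: a digest is just a hash of the bytes. A minimal sketch with `sha256sum` (hypothetical file names standing in for layer blobs):

```shell
# Digests are content hashes: identical bytes always produce the same
# digest, so a digest pins exact content while a tag is only a name.
printf 'layer bytes' > blob-a
printf 'layer bytes' > blob-b
printf 'other bytes' > blob-c

sha256sum blob-a blob-b blob-c
# blob-a and blob-b share one digest; blob-c gets a different one.
```

With a real registry the same property is what `docker pull nginx@sha256:<digest>` relies on: the bytes either match the digest or the pull fails.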
OCI image structure¶
Modern Docker images align with OCI image concepts.
Core pieces:
- manifest
- config object
- layer blobs
Manifest¶
The manifest describes:
- which config object belongs to the image
- which layers, in order, compose it
- media types and digests
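A trimmed OCI image manifest looks roughly like this (digests shortened and sizes invented for readability; not a real image):

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:aaaa…",
    "size": 7023
  },
  "layers": [
    { "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip", "digest": "sha256:bbbb…", "size": 32654 },
    { "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip", "digest": "sha256:cccc…", "size": 16724 }
  ]
}
```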
Config object¶
Contains metadata such as:
- environment defaults
- command / entrypoint
- working directory
- labels
- history
- root filesystem diff IDs
- architecture/OS info
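A trimmed config object might look like this (illustrative values). Note that `rootfs.diff_ids` are digests of the *uncompressed* layer tars, which is why they differ from the compressed blob digests in the manifest:

```json
{
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "Env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
    "Entrypoint": ["/docker-entrypoint.sh"],
    "Cmd": ["nginx", "-g", "daemon off;"],
    "WorkingDir": "/"
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": ["sha256:dddd…", "sha256:eeee…"]
  },
  "history": [
    { "created_by": "ADD rootfs.tar /" },
    { "created_by": "CMD [\"nginx\"]", "empty_layer": true }
  ]
}
```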
Layer blobs¶
Compressed or otherwise packaged filesystem diffs representing changes introduced at build steps.
Layers¶
Each image layer is a filesystem delta, not usually a full filesystem copy.
A layer can represent:
- added files
- modified files
- deleted files (represented via whiteout semantics in union filesystems)
Ordered stacking¶
Layers are ordered. Later layers can override or mask earlier content.
This is the core reason container images are efficient to distribute and cache:
- common base layers are reused
- only changed layers need transfer/storage
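The stacking rule can be sketched without any container tooling: apply each layer's diff in order, and the last writer of a path wins (directory names here are hypothetical):

```shell
# Toy model of ordered layers: each directory is one layer's diff;
# applying them in order builds the merged view, later layers winning.
mkdir -p base-layer app-layer merged-view
echo "v1 default config" > base-layer/app.conf
echo "shared library"    > base-layer/lib.so
echo "v2 tuned config"   > app-layer/app.conf    # same path, later layer

for layer in base-layer app-layer; do
  cp -r "$layer/." merged-view/
done

cat merged-view/app.conf    # the later layer's version is presented
```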
Build steps and layers¶
In a Dockerfile-style mental model, instructions often produce new layers or new image metadata.
Common consequences:
- combining operations affects layer count and cache behavior
- deleting files in a later layer does not remove them from earlier layers, so the image does not shrink the way people naively expect
- order matters for cache efficiency
Example¶
If you install huge packages in one layer and "delete" them later, the lower-layer content may still exist in image history/storage; you only masked it from the merged view.
That is why careless image construction creates obese images.
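A hedged Dockerfile sketch of the trap (hypothetical package names and paths): the first variant commits the build tools into a layer and only masks them later; the second never lets them reach a committed layer.

```dockerfile
# Bad: each RUN commits a layer, so the later cleanup only masks the
# packages in the merged view; the lower layer still stores the bytes.
RUN apt-get update && apt-get install -y build-essential
RUN make -C /src install
RUN apt-get purge -y build-essential && rm -rf /var/lib/apt/lists/*

# Better: install, build, and clean up inside ONE instruction, so the
# single committed layer never contains the deleted files.
RUN apt-get update && apt-get install -y build-essential \
 && make -C /src install \
 && apt-get purge -y build-essential && rm -rf /var/lib/apt/lists/*
```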
Overlayfs / overlay2 mental model¶
A common Docker storage mechanism on Linux uses OverlayFS.
Runtime view¶
lowerdirs = image layers (read-only)
upperdir = container writable layer
workdir = overlay bookkeeping
merged = presented root filesystem
Why overlay matters¶
- image layers stay read-only and shared
- each container gets its own writable layer
- reads can come from lower layers
- writes may trigger copy-up from lower layer into upper layer
Copy-up implication¶
If a container modifies a file that exists in a lower layer, OverlayFS may copy it into the writable layer first. That can be surprisingly expensive for some workloads.
Local storage representation¶
A runtime stores image content locally using content-addressed blobs and snapshot metadata. Historically Docker emphasized storage drivers; newer stacks increasingly involve containerd snapshotters and content stores.
Important idea:
- the registry representation
- the local unpacked representation
- the mounted runtime rootfs
are related but not identical views.
Writable layer vs volumes¶
A running container has a writable layer, but that is not the ideal home for durable data.
Writable layer¶
- tied to container lifecycle
- can be slower/awkward for some heavy write workloads
- disappears with container deletion unless preserved via container commit/image tricks, which is usually not how sane systems manage state
Volumes / bind mounts¶
- externalize persistent or host-coupled data
- survive container recreation as configured
- often better for databases, application data, caches that should persist, etc.
Whiteouts and deletions¶
Union filesystem semantics use whiteout markers to represent deletions of files from lower layers.
Why this matters¶
The merged rootfs says "this file is gone," but the bytes in an older layer may still exist in the image history/storage graph.
That is the image-layer version of sweeping dirt under a rug and then declaring the house immaculate.
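A rough simulation of whiteout semantics using the `.wh.` marker convention from layer tarballs (OverlayFS itself uses character devices for this; directory names here are hypothetical):

```shell
# "Deleting" a lower-layer file is recorded as a whiteout marker in the
# upper layer; the lower layer's bytes are untouched.
mkdir -p wh-lower wh-upper wh-merged
echo "supposedly deleted" > wh-lower/secret.txt
echo "still here"         > wh-lower/keep.txt
touch wh-upper/.wh.secret.txt            # whiteout marker for secret.txt

cp -r wh-lower/. wh-merged/
for marker in wh-upper/.wh.*; do
  rm -f "wh-merged/${marker##*/.wh.}"    # honor the whiteout in the merge
done

ls wh-merged/    # keep.txt only: secret.txt is hidden from the merged view
ls wh-lower/     # both files: the lower layer still stores the bytes
```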
Multi-arch images¶
An image tag may refer not to one image manifest but to a manifest list / image index containing entries for multiple platforms.
Example platforms:
- linux/amd64
- linux/arm64
The runtime chooses the appropriate platform-specific image based on host/requested platform.
Why you care¶
- same tag may resolve differently on different architectures
- buildx and multi-platform publishing make this common
- debugging "works on my machine" image issues sometimes comes down to architecture mismatch
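A trimmed OCI image index: the tag resolves to this, and the client then picks the manifest whose platform entry matches (illustrative digests):

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:ffff…",
      "platform": { "architecture": "amd64", "os": "linux" }
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:9999…",
      "platform": { "architecture": "arm64", "os": "linux" }
    }
  ]
}
```

`docker buildx imagetools inspect <ref>` shows this resolution for a real multi-platform tag.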
Pull flow¶
Simplified pull sequence:
client/runtime resolves reference
-> authenticate if needed
-> fetch manifest/index
-> determine platform
-> fetch config and missing layers by digest
-> verify digests
-> store blobs locally
-> unpack/snapshot for runtime use as needed
This is content-addressed, which is why shared layers and digest verification work.
Build cache¶
Image builds often reuse prior layers if instructions and inputs are unchanged enough.
Why cache matters¶
- dramatic speedup
- predictable rebuild behavior when designed well
- poor Dockerfile ordering destroys cache efficiency
Good general pattern¶
Put more stable steps earlier:
- base image
- package manager metadata/install patterns
- dependency manifests
- source copy later if source changes frequently
Bad pattern¶
Copy the entire repo first, then install dependencies. A tiny code change then invalidates the expensive dependency layers.
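Both patterns in Dockerfile form, assuming a hypothetical Node project (the same shape applies to pip, Maven, Cargo, and friends):

```dockerfile
# Bad ordering: any source change invalidates the COPY layer and
# everything after it, including the expensive dependency install.
COPY . /app
RUN npm ci --prefix /app

# Good ordering: dependency manifests change rarely, so the install
# layer stays cached across ordinary code changes.
COPY package.json package-lock.json /app/
RUN npm ci --prefix /app
COPY . /app
```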
Container image security implications¶
What image internals tell you operationally¶
- mutable tags are risky
- layer history can preserve data you thought you deleted
- base image lineage matters
- too many packages increase attack surface
- secrets copied during build may end up in layers/history
- image scans operate on filesystem/package content but are only part of the story
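For build-time secrets, BuildKit's secret mounts avoid the layer-history trap: the secret is exposed only for the duration of one RUN and never committed to a layer (secret id and command here are hypothetical):

```dockerfile
# syntax=docker/dockerfile:1
# The secret is mounted at /run/secrets/<id> for this step only;
# no committed layer ever contains it.
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci
```

Built with something like `docker build --secret id=npm_token,src=./token.txt .`, whereas a plain `COPY token.txt` or `ENV NPM_TOKEN=...` would bake the secret into image history.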
Common mistakes¶
- baking secrets into layers
- giant general-purpose base images
- relying on latest
- assuming that deleting a secret in a later build step removes it from image history safely
Runtime startup from image¶
When starting a container from an image:
- image reference resolves to content
- local snapshot/mount structure is prepared
- writable upper layer is added
- OCI runtime spec says what process to run
- process executes in that merged filesystem
So the image supplies filesystem and metadata defaults; the runtime supplies the process environment and isolation context.
Debugging image issues¶
Symptom: image huge¶
Likely causes:
- poor layer ordering
- package caches retained
- unnecessary tooling in runtime image
- copied build artifacts / source / test data
- deleting in later layers instead of avoiding inclusion earlier
Symptom: rebuild slow¶
Likely causes:
- cache invalidation too early in Dockerfile
- mutable base image changes
- dependency install step not isolated
- registry/cache not reused
Symptom: file "deleted" but image still large¶
Whiteout/layer-history problem.
Symptom: different results on different hosts¶
Possible causes:
- multi-arch tag resolves differently
- mutable tag drift
- different runtime unpack/storage behavior
- different build context or ignored files
Interview angles¶
Questions hidden here:
- difference between image and container
- what a layer is
- why tags are mutable and digests are preferable
- how overlayfs presents image + writable layer
- why deleting files in later layers may not shrink image as expected
- what multi-arch images are
- why Dockerfile instruction order affects caching
Strong answers tie image structure to runtime consequences.
Mental model to keep¶
A Docker image is:
- a manifest and config
- plus ordered content-addressed filesystem layers
A running container is:
- that read-only layered filesystem
- plus a writable upper layer
- plus a process launched by the runtime
If you separate build-time image identity from runtime container state, most confusion disappears.
References¶
- Docker storage drivers
- Select a storage driver
- OverlayFS storage driver
- containerd image store with Docker Engine
Wiki Navigation¶
Prerequisites¶
- Linux Ops (Topic Pack, L0)
Related Content¶
- Container Images (Topic Pack, L1) — Container Image Optimization (alias → container_images), Docker / Containers
- AWS ECS (Topic Pack, L2) — Docker / Containers
- Case Study: CI Pipeline Fails — Docker Layer Cache Corruption (Case Study, L2) — Docker / Containers
- Case Study: Container Vuln Scanner False Positive Blocks Deploy (Case Study, L2) — Docker / Containers
- Case Study: ImagePullBackOff Registry Auth (Case Study, L1) — Docker / Containers
- Containers Deep Dive (Topic Pack, L1) — Docker / Containers
- Deep Dive: Containers How They Really Work (deep_dive, L2) — Docker / Containers
- Docker (Topic Pack, L1) — Docker / Containers
- Docker Basics Flashcards (CLI) (flashcard_deck, L1) — Docker / Containers
- Docker Drills (Drill, L1) — Docker / Containers
Pages that link here¶
- Container Base Images — Primer
- Container Base Images — Street Ops
- Container Images
- Containers - How They Really Work
- Containers Deep Dive
- Containers Deep Dive - Primer
- Docker
- Docker - Skill Check
- Docker / Containers - Primer
- Docker Drills
- Scenario: Docker Container Won't Start in Production
- Symptoms
- Symptoms: CI Pipeline Fails, Docker Layer Cache Corruption, Fix Is Registry GC
- Symptoms: Container Image Vuln Scanner False Positive, Blocks Deploy Pipeline
- Track: Containers