Kubernetes Storage — Trivia & Interesting Facts¶
Surprising, historical, and little-known facts about Kubernetes storage.
Kubernetes storage was "terrible" for its first three years¶
Early Kubernetes (pre-1.6) shipped only in-tree volume plugins compiled into the core binaries. Adding support for a new storage system meant modifying Kubernetes source code, building new binaries, and waiting for a release. The Container Storage Interface (CSI), whose Kubernetes support reached GA in 1.13 (December 2018), decoupled storage from the core entirely — storage vendors could now ship their own plugins as containers.
PersistentVolumeClaims were inspired by the requesting model in cloud computing¶
The PV/PVC separation was an intentional design choice: cluster admins provision PersistentVolumes (like allocating disk in a data center), and developers request storage via PersistentVolumeClaims (like requesting resources from IT). StorageClasses added in 1.4 automated the provisioning step — a PVC referencing a StorageClass triggers automatic volume creation, eliminating the admin bottleneck.
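As a sketch of the dynamic-provisioning flow (the provisioner name and parameters depend on which CSI driver your cluster runs; these assume the AWS EBS CSI driver, and all object names are illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com   # assumption: AWS EBS CSI driver is installed
parameters:
  type: gp3
---
# A PVC referencing the class triggers automatic volume creation —
# no admin pre-provisioning required.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 20Gi
```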
emptyDir volumes are more dangerous than they appear¶
An emptyDir volume is created when a pod starts and deleted when the pod terminates. Simple enough, but emptyDir uses the node's filesystem by default, so a pod writing to an emptyDir can fill the node's disk and trigger eviction of every pod on that node. The sizeLimit field (added in 1.8) mitigates this, but enforcement differs by medium: for a filesystem-backed emptyDir the kubelet only evicts the pod after its periodic usage check detects the overrun (not a hard cap), while a memory-backed emptyDir (medium: Memory) gets a tmpfs mount whose size is capped at sizeLimit.
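A minimal sketch of a memory-backed emptyDir with a hard size cap (pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir:
      medium: Memory      # tmpfs — sizeLimit becomes a hard cap
      sizeLimit: 256Mi    # also counts against the pod's memory usage
```

Note that memory-backed emptyDir contents count against the container's memory accounting, which is another way an "empty directory" can get a pod killed.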
hostPath volumes are the #1 container escape vector¶
Mounting a hostPath volume gives a container direct access to the host filesystem. Mounting / or /etc as hostPath gives the container root-equivalent access to the node. Pod Security Standards (and the earlier PodSecurityPolicies) restrict hostPath for this reason. Despite the risk, hostPath is legitimately needed for node-level agents (log collectors, monitoring daemons), making it a perennial security tension.
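A sketch of the legitimate pattern — a node agent mounting the host's log directory read-only to limit the blast radius (pod name and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: log-agent
spec:
  containers:
  - name: collector
    image: example.com/log-collector:latest  # hypothetical image
    volumeMounts:
    - name: varlog
      mountPath: /var/log
      readOnly: true        # read-only, narrow path: the safer hostPath shape
  volumes:
  - name: varlog
    hostPath:
      path: /var/log
      type: Directory       # fail if the path doesn't exist on the node
```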
ReadWriteOnce does not mean "one pod" — it means "one node"¶
The ReadWriteOnce (RWO) access mode means the volume can be mounted read-write by pods on a single node. Multiple pods on the same node can mount an RWO volume simultaneously. This surprises teams who expect RWO to enforce single-pod access. ReadWriteOncePod (RWOP), introduced in 1.22 and GA since Kubernetes 1.29, finally provides true single-pod exclusive access, which is critical for applications that cannot tolerate concurrent writes.
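The only change from a normal claim is the access mode (names are illustrative; the cluster's CSI driver must support RWOP):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: exclusive-data
spec:
  accessModes: ["ReadWriteOncePod"]  # one pod, cluster-wide — not one node
  resources:
    requests:
      storage: 10Gi
```

A second pod referencing this claim will fail scheduling rather than silently sharing the mount.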
Volume snapshots enable point-in-time backup without downtime¶
VolumeSnapshots (GA in Kubernetes 1.20) create a point-in-time copy of a PersistentVolume through the CSI driver. The snapshot is copy-on-write, so it is nearly instantaneous regardless of volume size. You can then create a new PVC from the snapshot for testing, backup, or migration. Before VolumeSnapshots, backup required either stopping the application or using application-level tools — Kubernetes had no native volume backup mechanism.
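A sketch of snapshot-and-restore, assuming a VolumeSnapshotClass named csi-snapclass and a StorageClass named fast-ssd exist (both hypothetical, as are the PVC names):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: db-snap
spec:
  volumeSnapshotClassName: csi-snapclass   # assumption: class exists
  source:
    persistentVolumeClaimName: db-data     # the live PVC to snapshot
---
# Restore: a new PVC whose dataSource points at the snapshot.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data-restore
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast-ssd
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: db-snap
  resources:
    requests:
      storage: 20Gi   # must be at least the snapshot's source size
```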
The Retain reclaim policy saves data but creates orphaned volumes¶
PersistentVolumes have three reclaim policies: Retain, Delete, and Recycle (deprecated). Retain keeps the volume and its data after the PVC is deleted, but the PV enters a Released state and cannot be bound to a new PVC without manual intervention (clearing the claimRef). Teams using Retain for safety often discover hundreds of orphaned volumes consuming expensive cloud storage months later.
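A sketch of a retained PV and the manual step that rebinds it (driver, volume handle, and names are all hypothetical):

```yaml
# After its PVC is deleted, a Retain PV shows status Released and keeps a
# stale spec.claimRef pointing at the old claim, blocking any new binding.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: retained-pv
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: ebs.csi.aws.com      # assumption: EBS CSI driver
    volumeHandle: vol-0123abcd   # hypothetical backing volume
  # To make the PV Available again, remove the stale claimRef, e.g.:
  #   kubectl patch pv retained-pv --type=json \
  #     -p='[{"op": "remove", "path": "/spec/claimRef"}]'
```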
CSI drivers run as privileged pods — by necessity¶
CSI node plugins must mount and unmount volumes on the host filesystem, which requires host-level access. They run as DaemonSets with privileged: true, hostNetwork: true, and direct access to /dev, /sys, and the kubelet directory. This makes CSI drivers one of the most privileged workloads in any cluster — a compromised CSI driver has near-complete control over every node.
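An abbreviated, illustrative DaemonSet fragment showing the typical privilege footprint (image and names are hypothetical; real drivers also run registration sidecars and mount additional host paths):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: csi-node
spec:
  selector:
    matchLabels: {app: csi-node}
  template:
    metadata:
      labels: {app: csi-node}
    spec:
      hostNetwork: true
      containers:
      - name: csi-driver
        image: example.com/csi-driver:v1     # hypothetical image
        securityContext:
          privileged: true                   # needed for mount/unmount syscalls
        volumeMounts:
        - name: kubelet-dir
          mountPath: /var/lib/kubelet
          mountPropagation: Bidirectional    # so mounts are visible to the kubelet
        - name: dev
          mountPath: /dev                    # raw block device access
      volumes:
      - name: kubelet-dir
        hostPath: {path: /var/lib/kubelet, type: Directory}
      - name: dev
        hostPath: {path: /dev, type: Directory}
```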
Local PersistentVolumes trade portability for performance¶
Local PVs (GA in Kubernetes 1.14) expose a node's local disk as a PersistentVolume. They provide bare-metal disk performance (no network hop) but permanently bind the pod to a specific node. If that node fails, the data is inaccessible until the node recovers. Local PVs are popular for databases (Cassandra, Kafka, Elasticsearch) where replication is handled at the application level and raw I/O performance matters more than portability.
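A sketch of a local PV pinned to one node (disk path, node name, and StorageClass are hypothetical; local PVs are usually paired with a volumeBindingMode: WaitForFirstConsumer StorageClass so binding waits for scheduling):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node1
spec:
  capacity:
    storage: 500Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd0       # hypothetical disk mount on the node
  nodeAffinity:                 # required — pins the PV (and its pods) to one node
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values: ["node-1"]
```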
Generic ephemeral volumes solved the "temporary storage" gap¶
Generic ephemeral volumes (GA in 1.23) let you create CSI-backed volumes that are automatically created and deleted with the pod — like emptyDir but with the full capabilities of a CSI driver (encryption, performance tiers, snapshots during the pod's lifetime). This was critical for workloads needing temporary scratch space with specific performance characteristics, like ML training jobs requiring NVMe-speed temporary storage.
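A sketch of a pod requesting an ephemeral CSI-backed scratch volume (image and StorageClass name are hypothetical); the PVC is created with the pod and garbage-collected with it:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-job
spec:
  containers:
  - name: trainer
    image: example.com/trainer:latest   # hypothetical image
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    ephemeral:                          # lifecycle tied to the pod
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: fast-nvme   # hypothetical NVMe-backed class
          resources:
            requests:
              storage: 100Gi
```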
Storage capacity tracking prevents scheduling pods to nodes that cannot provision volumes¶
Before CSI Storage Capacity (GA in 1.24), the scheduler could assign a pod to a node where the local CSI driver had no remaining disk capacity. The pod would be scheduled, the volume creation would fail, and the pod would be stuck Pending. Storage capacity tracking lets CSI drivers report available capacity per node, allowing the scheduler to make informed decisions. This was essential for local storage and distributed storage systems.
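A sketch of the two objects involved, with hypothetical driver, class, and node names (CSIStorageCapacity objects are normally published by the driver's external-provisioner, not written by hand):

```yaml
# The driver opts in to capacity-aware scheduling in its CSIDriver object.
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: local.csi.example.com   # hypothetical driver name
spec:
  storageCapacity: true         # scheduler consults CSIStorageCapacity objects
---
# Per-node capacity report the scheduler reads before placing pods.
apiVersion: storage.k8s.io/v1
kind: CSIStorageCapacity
metadata:
  name: node-1-fast-local       # hypothetical
storageClassName: fast-local
nodeTopology:
  matchLabels:
    kubernetes.io/hostname: node-1
capacity: 750Gi
```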