
Pattern: PVC Reclaim Policy Delete

ID: FP-038
Family: Configuration Landmine
Frequency: Common
Blast Radius: Single Service
Detection Difficulty: Obvious (but irreversible)

The Shape

A Kubernetes PersistentVolume's reclaim policy (Retain or Delete) controls what happens to the underlying storage when its PersistentVolumeClaim (PVC) is deleted. The Delete policy (the default for dynamically provisioned volumes unless the StorageClass specifies otherwise) automatically deletes the underlying storage resource (EBS volume, GCP persistent disk, etc.) along with the PVC. Deleting a PVC for maintenance, debugging, or cleanup therefore permanently destroys the data, with no confirmation prompt and no undo.
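Before any PVC maintenance, it is worth auditing which PVs carry the Delete policy. A minimal sketch, assuming kubectl access to the cluster (the column names are arbitrary labels):

```shell
# List every PV with its reclaim policy and the claim it is bound to.
kubectl get pv -o custom-columns=\
NAME:.metadata.name,\
POLICY:.spec.persistentVolumeReclaimPolicy,\
CLAIM:.spec.claimRef.name
```

Any row showing POLICY Delete is a volume whose data disappears the moment its PVC does.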

How You'll See It

In Kubernetes

$ kubectl delete pvc mysql-data    # Routine cleanup operation
persistentvolumeclaim "mysql-data" deleted
# The EBS volume behind it is now deleted. Data: gone.

$ kubectl get pv pvc-abc123
Error from server (NotFound): persistentvolumes "pvc-abc123" not found
# The PV (reclaimPolicy: Delete) was automatically deleted with the PVC.

The PVC deletion is instantaneous, and the underlying cloud volume deletion is triggered immediately. No confirmation. No warning. No undo.

In CI/CD

A CI pipeline's "cleanup" step deletes all resources in a namespace: kubectl delete all --all -n myapp. This deletes Pods, Services, Deployments — but NOT PVCs (they're not in all). A second cleanup step explicitly deletes PVCs. All data in the namespace is gone.
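A pre-flight guard in the cleanup job can refuse to delete a PVC whose backing volume would be destroyed. A hypothetical sketch; in real use the policy string would come from kubectl get pv "$pv" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}':

```shell
# Hypothetical CI guard: refuse PVC deletion when the bound PV's
# reclaim policy is Delete, because the cloud volume would go with it.
guard_pvc_delete() {
  policy="$1"   # reclaim policy of the bound PV, e.g. "Retain" or "Delete"
  if [ "$policy" = "Delete" ]; then
    echo "REFUSING: reclaimPolicy=Delete would destroy the volume" >&2
    return 1
  fi
}
```

With Retain the guard passes and the PVC can be deleted safely; the PV and the cloud volume survive for explicit manual cleanup.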

The Tell

The PVC was deleted. The StorageClass has reclaimPolicy: Delete. kubectl get pv shows the PV is gone entirely (with a Retain policy it would remain, with status Released). The underlying cloud volume no longer exists.
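To confirm the failure mode rather than misdiagnose it, check both layers. A sketch assuming AWS and example resource IDs:

```shell
# Kubernetes layer: the PV object is gone entirely (NotFound),
# not merely Released as it would be under a Retain policy.
kubectl get pv pvc-abc123

# Cloud layer: the backing EBS volume no longer exists either.
aws ec2 describe-volumes --volume-ids vol-0abc123def456
# An InvalidVolume.NotFound error confirms the storage is gone.
```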

Common Misdiagnosis

Looks Like                                    | But Actually                                       | How to Tell the Difference
PVC deletion was safe (Retain policy assumed) | Delete policy triggered immediate storage deletion | Check the StorageClass reclaimPolicy; check whether the PV still exists
Data on a different volume                    | Volume deleted with PVC                            | Cloud console shows the volume is gone
Application misconfiguration                  | Data was deleted                                   | New pod can't find the data because it's genuinely gone, not misconfigured

The Fix (Generic)

  1. Immediate: Check the cloud provider's volume deletion logs (AWS CloudTrail, GCP audit logs) to confirm the deletion; attempt snapshot restoration if one is available.
  2. Short-term: For critical data, use a StorageClass with reclaimPolicy: Retain; manually delete the underlying storage only after confirming the data is backed up.
  3. Long-term: Use reclaimPolicy: Retain for all production StatefulSets; add "verify the PV reclaim policy" to the checklist for deleting any PVC; restrict PVC deletion in production namespaces via RBAC.
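The long-term fix can be sketched as two operations, assuming an AWS EBS CSI cluster (the PV and StorageClass names are examples):

```shell
# 1) Flip an existing PV to Retain so a later PVC deletion
#    no longer destroys the cloud volume:
kubectl patch pv pvc-abc123 \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# 2) Provision future volumes from a Retain-policy StorageClass:
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-retain            # hypothetical name
provisioner: ebs.csi.aws.com  # AWS EBS CSI driver
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
EOF
```

The trade-off: with Retain, deleting a PVC leaves the PV in Released, so reclaiming space becomes a deliberate two-step (delete the PV, then delete the cloud volume) instead of an accident.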

Real-World Examples

  • Example 1: Database admin ran kubectl delete pvc postgres-data intending to resize the PVC (requires delete + recreate). StorageClass had Delete policy. PostgreSQL data (3 months of production data) deleted in 2 seconds. No backup (see FP-025).
  • Example 2: Staging namespace cleanup: kubectl delete pvc --all -n staging. Staging PVCs used the same StorageClass as production (same Delete policy). All staging data gone. Staging database had to be re-seeded from scratch.
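For the resize scenario in Example 1, modern clusters can usually avoid the delete entirely: if the StorageClass sets allowVolumeExpansion: true, the PVC can be grown in place. A sketch with example names (verify that your CSI driver supports expansion):

```shell
# Grow the PVC in place instead of delete-and-recreate.
kubectl patch pvc postgres-data \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
```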

War Story

"I just need to resize the PVC." Our cloud provider required delete-and-recreate to resize (before in-place resize was supported). I deleted the PVC. Heard the Slack channel go quiet. Checked: 3 months of ML training data, gone. The StorageClass had reclaimPolicy: Delete. We'd never checked. The backup (FP-025) was 6 weeks old. We restored it and accepted 6 weeks of data loss. Changed all production StorageClasses to reclaimPolicy: Retain that afternoon. PVC deletion now requires a second step: manually deleting the PV after confirming the data is safe.

Cross-References

  • Topic Packs: k8s-ops, disk-and-storage-ops
  • Footguns: k8s-ops/footguns.md — "PVC reclaim policy Delete (default) on databases"
  • Related Patterns: FP-025 (untested backup — the only recovery path after this), FP-048 (device name confusion — same "irreversible deletion" risk)