# Thinking Out Loud: Kubernetes Storage
A senior SRE's internal monologue while working through a real Kubernetes storage issue. This isn't a tutorial — it's a window into how experienced engineers actually think.
## The Situation
The Elasticsearch cluster in production is throwing repeated ReadOnlyFileSystemException errors. The logging pipeline is dropping events because ES can't write to its data volumes. The PVCs are backed by AWS EBS gp3 volumes.
## The Monologue
Elasticsearch going read-only. There are two common causes: disk watermark (ES thinks the disk is too full and protects itself) or the actual underlying filesystem going read-only (usually an EBS issue). Let me figure out which one.
Three data nodes, all Running. Let me check ES cluster health first.
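Concretely, the first checks look something like this (the `logging` namespace, pod names, and `app=elasticsearch` label are taken from the commands used later in this incident):

```shell
# All three data pods actually Running?
kubectl get pods -n logging -l app=elasticsearch

# Cluster-level health from inside one node
kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s localhost:9200/_cluster/health?pretty
```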
Status: yellow. Unassigned shards. That's consistent with a node refusing writes. Let me check the disk allocation.
Node elasticsearch-data-2 shows disk usage at 89%. ES's default watermarks are 85% (low), 90% (high), and 95% (flood stage). At 89% it's past the low watermark, which means new shard allocation to this node is blocked, and it's brushing up against the high watermark, where ES starts relocating shards away. But the ReadOnlyFileSystemException is something else: that's flood-stage behavior, or an actual FS issue.
Let me check the node-specific settings.
```shell
kubectl exec -it elasticsearch-data-2 -n logging -- \
  curl -s localhost:9200/_nodes/elasticsearch-data-2/stats/fs?pretty
```
Hmm, available: 11%. That's from ES's perspective. But let me check from the OS level inside the container.
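A plain `df` inside the container gives the OS-level view (the data path below is the Elasticsearch image default, an assumption here):

```shell
kubectl exec -it elasticsearch-data-2 -n logging -- \
  df -h /usr/share/elasticsearch/data
```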
93% used, 7% available. Wait, that's different from what ES reported. Oh, right: ES's fs stats include the blocks ext4 reserves for root (5% by default), while df's Avail column excludes them. So the raw free space on the device is around 11%, but a non-root process like Elasticsearch can actually use only about 7%. The volume is closer to full than ES's own number suggests.
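To confirm the reserved-block theory directly, tune2fs on the node shows the reservation (the device name below is a placeholder; find the real one with `lsblk` or `df`):

```shell
# Run on the node or from a privileged debug pod; /dev/nvme1n1 is a placeholder
tune2fs -l /dev/nvme1n1 | grep -i 'reserved block count'
```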
### Mental Model: The Three Layers of "Disk Full"
There are three different "full" states for a Kubernetes volume: the PV/PVC capacity limit, the filesystem available space (minus reserved blocks), and the application-level watermark (like ES flood stage). Each has different symptoms and different fixes. Always check all three layers.
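As a concrete checklist, the three layers map to three different commands (names reused from this cluster):

```shell
# Layer 1: PVC/PV capacity as Kubernetes sees it
kubectl get pvc -n logging

# Layer 2: filesystem availability, minus ext4 reserved blocks
kubectl exec -it elasticsearch-data-2 -n logging -- df -h

# Layer 3: the application's own thresholds (ES disk watermark settings)
kubectl exec -it elasticsearch-data-2 -n logging -- \
  curl -s 'localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty' \
  | grep watermark
```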
Let me check if ES has actually tripped its own read-only block.
```shell
kubectl exec -it elasticsearch-data-2 -n logging -- \
  curl -s localhost:9200/_cluster/settings?pretty | grep -A 5 read_only
```
"cluster.blocks.read_only_allow_delete": "true". There it is. ES set the cluster-level read-only block. This happens at the flood stage watermark. Even though df shows 7% free, the reserved blocks mean the actual usable space crossed ES's threshold.
I need to do two things: free up space, and then clear the read-only block. Let me not just clear the block first — if the disk is actually that full, ES will just set it again immediately.
First, let me delete old indices to free space.
```shell
kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s 'localhost:9200/_cat/indices?v&s=creation.date' | head -20
```
We have indices going back 45 days. Our retention policy is 30 days. So there are 15 days of indices that should have been cleaned up. The ILM policy must not be running.
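A quick way to confirm: the ILM status API reports the operation mode directly.

```shell
kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s localhost:9200/_ilm/status?pretty
# Expect "operation_mode": "RUNNING"; "STOPPED" means someone halted it
```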
ILM is stopped. Someone stopped it and never restarted it. Probably during the last maintenance window. Classic.
Let me delete the old indices first to free up space, then restart ILM.
```shell
kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s -XDELETE 'localhost:9200/logs-2026.02.*'
```
Let me check disk usage again.
Down to 71% used. Good. Now let me clear the read-only block and restart ILM.
```shell
kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s -XPUT localhost:9200/_cluster/settings \
  -H 'Content-Type: application/json' \
  -d '{"persistent":{"cluster.blocks.read_only_allow_delete":null}}'

kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s -XPOST localhost:9200/_ilm/start
```
Let me verify the cluster is recovering.
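The verification, roughly:

```shell
kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s 'localhost:9200/_cat/health?v'

# Any shards still unassigned?
kubectl exec -it elasticsearch-data-0 -n logging -- \
  curl -s 'localhost:9200/_cat/shards?v' | grep UNASSIGNED || echo "no unassigned shards"
```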
Status: green, no unassigned shards. Cluster is healthy. But I need to prevent this from happening again.
### Mental Model: Fix the Process, Not Just the Symptom
Deleting old indices fixed today's problem. Restarting ILM prevents it from recurring. But the real fix is adding monitoring: an alert on ILM status and an alert on disk watermark approaching thresholds. If you fix a symptom without fixing the detection, you'll be back here in a month.
Let me also check if we should resize the PVCs. If 30 days of data uses 71% of the volume after cleanup, we're going to hit this again as log volume grows.
Each PVC is 100Gi. 71% used for 30 days means about 2.4Gi per day. We have roughly 29Gi headroom, which gives us about 12 days of growth buffer. That's too tight for comfort.
Let me check if the StorageClass supports expansion.
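One way to check, without assuming the class name:

```shell
kubectl get storageclass \
  -o custom-columns='NAME:.metadata.name,EXPANSION:.allowVolumeExpansion'
```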
True. Good. Let me expand the PVCs to 200Gi. With EBS and Kubernetes, I can do this online — no downtime required.
```shell
kubectl patch pvc elasticsearch-data-elasticsearch-data-0 -n logging \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
kubectl patch pvc elasticsearch-data-elasticsearch-data-1 -n logging \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
kubectl patch pvc elasticsearch-data-elasticsearch-data-2 -n logging \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
```
The EBS volume will resize in the background. Let me check the status.
```shell
kubectl get pvc -n logging -l app=elasticsearch \
  -o custom-columns='NAME:.metadata.name,STATUS:.status.phase,CAPACITY:.status.capacity.storage,CONDITIONS:.status.conditions[*].type'
```
PVCs showing FileSystemResizePending. The filesystem resize happens when the kubelet picks it up. For EBS, this can take a few minutes. I'll check back.
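Rather than polling the PVC, the events stream shows the node-side resize completing:

```shell
# Watch for the filesystem resize events on the PVCs
kubectl get events -n logging \
  --field-selector involvedObject.kind=PersistentVolumeClaim \
  --sort-by=.lastTimestamp | tail -20
```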
Meanwhile, let me add the monitoring gap. We need an alert for when ILM is stopped and when disk usage exceeds 75% on ES nodes.
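A sketch of what the disk alert could look like as a Prometheus rule, assuming the prometheus-community elasticsearch_exporter metric names (verify them against your exporter's /metrics output; an ILM-stopped alert likely needs a small probe against /_ilm/status, since not every exporter exposes it):

```yaml
groups:
  - name: elasticsearch-storage
    rules:
      - alert: ElasticsearchDiskHighUsage
        # Metric names assume the prometheus-community elasticsearch_exporter
        expr: |
          1 - (elasticsearch_filesystem_data_available_bytes
               / elasticsearch_filesystem_data_size_bytes) > 0.75
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "ES data node {{ $labels.name }} above 75% disk usage"
```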
## What Made This Senior-Level
| Junior Would... | Senior Does... | Why |
|---|---|---|
| Clear the ES read-only block immediately | Free disk space first, then clear the block | Clearing the block without freeing space means ES instantly re-triggers it |
| Only check `df -h` for disk usage | Check all three layers: PVC capacity, filesystem available (with reserved blocks), and application watermarks | Each layer has different thresholds and different symptoms |
| Delete old indices and call it done | Discover that ILM was stopped, restart it, AND resize the PVCs for growth headroom | Fixing today's symptom without fixing the automation gap and capacity trajectory means guaranteed recurrence |
| Not check if online PVC expansion is supported | Verify StorageClass allows expansion before attempting it | Some StorageClasses don't support expansion, and some require pod restart |
## Key Heuristics Used
- Three Layers of Disk Full: In Kubernetes, always check PVC capacity, filesystem available space (including reserved blocks), and application-level watermarks independently.
- Fix the Automation, Not the Data: When a maintenance process (ILM, retention, cleanup) is broken, restarting it matters more than manually cleaning up its backlog.
- Capacity Trajectory: After resolving a capacity issue, calculate the growth rate and verify you have sufficient runway before the next occurrence.
## Cross-References
- Primer — PV/PVC lifecycle, StorageClasses, and volume expansion mechanics
- Street Ops — EBS volume operations, online resize, and filesystem inspection
- Footguns — Ext4 reserved blocks causing earlier-than-expected "disk full" and ILM getting silently stopped