Diagnostic Questions
Before revealing the investigation path:
- CI builds fail with `failed to compute cache key` but `--no-cache` builds succeed. Where is the cache stored, and how would you verify the cache is intact?
- The cache manifest exists in the registry but the referenced blob does not. What could cause a manifest to reference a non-existent blob? What operation could leave the registry in an inconsistent state?
- The registry's persistent volume is at 96% capacity and garbage collection has been failing. How does a full filesystem lead to cache corruption, and why does GC failure make it worse?
- The fix involves resizing a PVC, running GC, and rebuilding the cache. Why are these Kubernetes operations (Domain C) rather than CI pipeline configuration changes (Domain A) or Linux disk management (Domain B)?
- How would you prevent a full registry volume from breaking CI pipelines in the future? What monitoring, retention policies, or architectural changes would help?
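One way to make the "manifest references a non-existent blob" question concrete is to walk a manifest and probe each digest it names. The sketch below is illustrative only. It assumes the cache manifest uses the standard OCI image-manifest fields (`config`, `layers`) and that the registry exposes the usual `HEAD /v2/<name>/blobs/<digest>` endpoint without authentication. The helper names (`referenced_digests`, `missing_blobs`, `registry_blob_exists`) are hypothetical and not part of any tool named in this exercise.

```python
import urllib.error
import urllib.request
from typing import Callable


def referenced_digests(manifest: dict) -> list[str]:
    """Collect the blob digests an image manifest points at.

    Assumption: the manifest has the OCI image-manifest shape
    (an optional "config" descriptor plus a "layers" list).
    """
    digests = []
    config = manifest.get("config")
    if config and "digest" in config:
        digests.append(config["digest"])
    for layer in manifest.get("layers", []):
        digests.append(layer["digest"])
    return digests


def missing_blobs(manifest: dict, blob_exists: Callable[[str], bool]) -> list[str]:
    """Return the referenced digests for which blob_exists() reports False."""
    return [d for d in referenced_digests(manifest) if not blob_exists(d)]


def registry_blob_exists(base_url: str, repo: str) -> Callable[[str], bool]:
    """Build a checker that probes a registry with HEAD requests.

    Assumption: no authentication is required; a real registry may need
    a bearer token, which this sketch does not handle.
    """
    def exists(digest: str) -> bool:
        req = urllib.request.Request(
            f"{base_url}/v2/{repo}/blobs/{digest}", method="HEAD"
        )
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.status == 200
        except urllib.error.HTTPError as err:
            if err.code == 404:
                return False  # manifest points at a blob the registry lacks
            raise  # other errors (401, 500, ...) need a human to look
    return exists
```

Keeping the existence check as an injected callable means the manifest-walking logic can be exercised against a fake blob set before it is pointed at a live registry through `registry_blob_exists`.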