Symptoms: CI Pipeline Fails, Docker Layer Cache Corruption, Fix Is Registry GC
Domains: devops_tooling | linux_ops | kubernetes_ops
Level: L2
Estimated time: 30-45 min
Initial Alert
GitHub Actions notification at 10:22 UTC:
:x: CI Build Failed — main branch
Job: build-and-push
Step: docker build
Error: "failed to solve: failed to compute cache key: failed to get digest sha256:a8f2b..."
Observable Symptoms
- All CI builds on the `main` branch are failing at the `docker build` step.
- The error is `failed to compute cache key`, referencing a specific layer digest.
- Feature branch builds that do not use the cache (`--no-cache`) succeed.
- Local Docker builds on developer machines succeed.
- The CI pipeline uses BuildKit with an inline cache from the registry: `--cache-from=type=registry,ref=registry.internal:5000/app:cache`.
- The registry (Harbor) UI shows the `app:cache` tag exists and was last pushed 3 days ago.
- Retrying the CI job fails with the same error every time.
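
The symptoms above name one digest and one cache reference, which is enough to ask the registry directly whether the layer still exists. The following is a minimal sketch, not the pipeline's own tooling: it extracts the digest from the error text and builds the Registry HTTP API v2 blob URL for it. The helper names and the `https` scheme are assumptions for illustration, and the digest in the alert is truncated, so the full value from the raw CI log is needed before a real request can be made.

```python
import re
from typing import Optional, Tuple

# Error text as shown in the alert; the digest is truncated there ("a8f2b...").
ERROR = "failed to solve: failed to compute cache key: failed to get digest sha256:a8f2b..."
CACHE_REF = "registry.internal:5000/app:cache"


def extract_digest(error: str) -> Optional[str]:
    """Return the (possibly truncated) sha256 digest named in the error, if any."""
    match = re.search(r"sha256:[0-9a-f]+", error)
    return match.group(0) if match else None


def split_ref(ref: str) -> Tuple[str, str, str]:
    """Split 'host[:port]/repository[:tag]' into (host, repository, tag)."""
    host, _, remainder = ref.partition("/")
    if ":" in remainder:
        repository, _, tag = remainder.rpartition(":")
    else:
        repository, tag = remainder, "latest"
    return host, repository, tag


def blob_url(ref: str, digest: str, scheme: str = "https") -> str:
    """Build the blob URL for a digest. The scheme is an assumption."""
    host, repository, _ = split_ref(ref)
    return f"{scheme}://{host}/v2/{repository}/blobs/{digest}"


digest = extract_digest(ERROR)
if digest is not None:
    print(blob_url(CACHE_REF, digest))
```

With the full digest substituted, a `HEAD` request to that URL that returns 404 while the `app:cache` tag still resolves would suggest the manifest references a blob the registry no longer holds.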
The Misleading Signal
A CI pipeline failure with a Docker build error looks like a CI/CD configuration problem: a broken Dockerfile, a missing build argument, a GitHub Actions runner issue, or a Docker version mismatch. The `failed to compute cache key` error points specifically at the Docker cache, which leads engineers to investigate Dockerfile layer ordering, ARG usage, or COPY statements. The fact that `--no-cache` works reinforces the idea that the Dockerfile is fine and the cache is the problem, but the cache lives in the registry, not in the CI environment.
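
One way to make the location of the cache explicit is to read it straight from the build flag. This is a small sketch with a hypothetical helper, `parse_cache_from`, that splits the `--cache-from` value from the pipeline into its attributes. It assumes the simple comma-separated `key=value` form shown in the symptoms.

```python
from typing import Dict

# Flag as used by the CI pipeline in this incident.
FLAG = "--cache-from=type=registry,ref=registry.internal:5000/app:cache"


def parse_cache_from(flag: str) -> Dict[str, str]:
    """Parse '--cache-from=key=value,key=value' into a dict of attributes."""
    _, _, spec = flag.partition("=")
    return dict(item.split("=", 1) for item in spec.split(","))


attrs = parse_cache_from(FLAG)
# The host part of the ref is where the cached layers are actually stored.
cache_host = attrs["ref"].split("/", 1)[0]
print(f"cache type: {attrs['type']}, stored on: {cache_host}")
```

Here the cache type is `registry` and the host is `registry.internal:5000`, so the cached layers are stored on the registry host rather than on the CI runner or in the Dockerfile.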