Portal | Level: L2: Operations | Topics: Security Scanning, CI/CD | Domain: Security
Scenario: CI Failed Due to Vulnerability Scan
The Prompt
"Our CI pipeline started failing this morning. The Trivy scan step reports 3 CRITICAL vulnerabilities in our container image. These weren't there yesterday. Nothing in our code changed. What do you do?"
Initial Report
CI notification: "Pipeline #1847 FAILED at stage 'security-scan'. Trivy found 3 CRITICAL vulnerabilities in grokdevops:latest. This is blocking the release of the billing fix that customers are waiting for."
Constraints
- Time pressure: You have 15 minutes before the next escalation. A customer-facing bugfix is blocked by this scan failure.
- Limited access: You can modify the Dockerfile and `.trivyignore` but cannot disable the scan gate. Base image changes require a rebuild that takes ~5 minutes.
Observable Evidence
- CI output: Trivy scan shows 3 CRITICAL CVEs in `libcurl`, `openssl`, and `zlib` — all from the base image, not application code.
- CVE details: All three have `FixedVersion` populated, indicating patches are available in a newer base image tag.
- Logs: The previous pipeline run (yesterday) passed cleanly with the same Dockerfile.
Expected Investigation Path
```shell
# 1. Reproduce locally
docker build -t grokdevops:test .
trivy image --severity CRITICAL,HIGH grokdevops:test

# 2. Check what changed — likely a CVE database update, not the code
trivy image --severity CRITICAL grokdevops:test --format json \
  | jq '.Results[].Vulnerabilities[] | {VulnerabilityID, PkgName, InstalledVersion, FixedVersion}'

# 3. Check if fixes are available
# Look at "FixedVersion" — if populated, an update is available

# 4. If the base image has an update
# Update the FROM line in the Dockerfile to the latest patch, rebuild, rescan
```
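Step 4 above can be sketched as a small script. Everything here is illustrative: the file name, the `python:3.12-slim` base tag, and the `3.12.4-slim` target tag are assumptions standing in for whatever your real Dockerfile pins; the `docker`/`trivy` steps are shown as comments rather than executed.

```shell
# Minimal sketch of the "bump base image, rebuild, rescan" fix.
# File name and image tags are hypothetical examples.
printf 'FROM python:3.12-slim\nCOPY . /app\n' > Dockerfile.example

# Suppose the patched release is python:3.12.4-slim (hypothetical tag):
sed -i.bak 's|^FROM python:3.12-slim.*|FROM python:3.12.4-slim|' Dockerfile.example
head -n1 Dockerfile.example   # FROM python:3.12.4-slim

# Then, against the real Dockerfile, rebuild and rerun the same gate:
#   docker build -t grokdevops:test .
#   trivy image --exit-code 1 --severity CRITICAL grokdevops:test
```

`--exit-code 1` makes `trivy` fail the shell (and thus a CI step) when findings remain at the selected severity, which mirrors how the pipeline gate blocks the release.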
Strong Answer
"This is almost certainly a CVE database update, not a code change. Trivy's vulnerability database updates daily, so new CVEs get flagged against existing packages. I'd first reproduce locally with `trivy image` and check which packages are affected. For each CRITICAL finding, I'd check if a `FixedVersion` exists — if yes, we need to update the base image or the specific package. The fastest fix is usually bumping the base image to the latest patch release (e.g., `python:3.12-slim` to the latest digest). If no fix exists yet, we have options: add a `.trivyignore` file for known/accepted risks with a review date, or switch to a different base image. The key is having a process: scan results shouldn't block deploys forever, but they should be triaged within a defined SLA."
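The "accepted risk with a review date" idea from the answer above can be made concrete in a `.trivyignore` file, which Trivy reads as one CVE ID per line with `#` comments. The CVE ID, owner, and date below are illustrative, not from this pipeline:

```
# Accepted: no FixedVersion in upstream base image yet.
# Owner: platform team. Review by: 2025-09-01 (ticket link here).
CVE-2024-12345
```

Recording the owner and review date in the comment is what turns an ignore entry into a risk acceptance rather than a silent suppression.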
Common Traps
- Assuming someone changed the code — CVE databases update independently
- Ignoring CRITICAL vulns — "it wasn't there yesterday" doesn't mean it's not real
- Not knowing about `.trivyignore` — it's a legitimate risk acceptance mechanism
- Not mentioning base image management — shows lack of container security awareness
Practice and Links
- Lab: `training/interactive/runtime-labs/lab-runtime-06-trivy-fail-to-green/`
Wiki Navigation
Related Content
- Adversarial Interview Gauntlet (30 sequences) (Scenario, L2) — CI/CD
- CI Pipeline Documentation (Reference, L1) — CI/CD
- CI/CD Drills (Drill, L1) — CI/CD
- CI/CD Flashcards (CLI) (flashcard_deck, L1) — CI/CD
- CI/CD Pipelines & Patterns (Topic Pack, L1) — CI/CD
- CircleCI Flashcards (CLI) (flashcard_deck, L1) — CI/CD
- Dagger / CI as Code (Topic Pack, L2) — CI/CD
- Deep Dive: CI/CD Pipeline Architecture (deep_dive, L2) — CI/CD
- GitHub Actions (Topic Pack, L1) — CI/CD
- Jenkins Flashcards (CLI) (flashcard_deck, L1) — CI/CD
Pages that link here
- CI Pipeline
- CI/CD - Skill Check
- CI/CD Drills
- CI/CD Pipeline Architecture
- CI/CD Pipelines - Primer
- Dagger
- Dagger / CI as Code - Primer
- GitHub Actions - Primer
- Interview Gauntlet: CI/CD for a Monorepo
- Interview Gauntlet: Container Image Build and Distribution Pipeline
- Interview Gauntlet: Flaky CI Build
- Interview Gauntlet: Improving Team Development Workflow
- Interview Gauntlet: When Automation Went Wrong
- Interview Scenarios
- Level 5: SRE & Incident Response