
Portal | Level: L2: Operations | Topics: GitOps | Domain: DevOps & Tooling

Scenario: Config Drift Detected in Production

The Prompt

"Our GitOps tool (ArgoCD) shows the production deployment as 'OutOfSync'. Someone apparently ran kubectl commands directly against production. The app is working fine but the declared state doesn't match. How do you handle this?"

Initial Report

ArgoCD alert: "Application grokdevops is OutOfSync. Live state differs from Git. Detected manual changes to Deployment spec. Last synced 2 hours ago."

Constraints

  • Time pressure: You have 15 minutes before the next escalation. The drift may mask a larger issue or be an active hotfix.
  • Limited access: You have read access to the cluster and Git repo. Force-syncing ArgoCD requires approval from the tech lead. You cannot identify who made the change without audit logs.
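Even without audit logs, read access gives one attribution hint: the Deployment's managedFields record which client ("field manager") last wrote each field, and a manual `kubectl edit` or `kubectl scale` shows up under a different manager than the GitOps controller. A minimal sketch, using the app and namespace names from the alert:

```shell
# Read-only attribution hint: list the field managers recorded on the
# Deployment. A manager like "kubectl-edit" or "kubectl-client-side-apply"
# points to a manual change rather than the GitOps controller.
kubectl get deploy grokdevops -n grokdevops \
  --show-managed-fields \
  -o jsonpath='{range .metadata.managedFields[*]}{.manager}{"\t"}{.time}{"\n"}{end}'
```

This narrows the change to a tool and a timestamp, not a person, but that is often enough to ask the right team.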

Observable Evidence

  • ArgoCD UI: The Deployment resource shows a yellow "OutOfSync" badge. Diff view shows spec.replicas changed from 3 to 5 and an extra environment variable DEBUG=true added.
  • Dashboard: Application health is green — the app is functioning normally despite the drift.
  • Logs: No recent CI/CD pipeline runs; the change was made via direct kubectl access.
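If the argocd CLI is available with the same read access, the diff the UI shows can be captured from a terminal as well (a sketch; the app name is taken from the alert):

```shell
# Print the declared-vs-live diff for the application. The command exits
# non-zero when the app is OutOfSync, so guard it when used in scripts.
argocd app diff grokdevops || true
```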

Expected Investigation Path

# 1. Identify what drifted
kubectl get deploy grokdevops -n grokdevops -o yaml > /tmp/live-state.yaml
helm get manifest grokdevops -n grokdevops > /tmp/declared-state.yaml
diff /tmp/declared-state.yaml /tmp/live-state.yaml

# 2. Check for common drift patterns
kubectl get deploy grokdevops -n grokdevops -o jsonpath='{.spec.replicas}'  # Manual scaling?
kubectl get deploy grokdevops -n grokdevops -o json | jq '.spec.template.spec.containers[0].env'  # Injected env vars?

# 3. Decide: keep the drift or revert?
# If intentional (hotfix): codify it in values, commit, let GitOps sync
# If accidental: revert via sync

# 4. Reconcile through the GitOps tool (sync requires tech lead approval)
argocd app sync grokdevops
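The diff in step 1 can be illustrated without a cluster. The two files below are hypothetical stand-ins for the `helm get manifest` and `kubectl get` outputs, reduced to the replica field the ArgoCD diff flagged:

```shell
# Stand-in for the declared state (what Git/Helm says): 3 replicas.
cat > /tmp/declared-state.yaml <<'EOF'
spec:
  replicas: 3
EOF

# Stand-in for the live state after the manual kubectl change: 5 replicas.
cat > /tmp/live-state.yaml <<'EOF'
spec:
  replicas: 5
EOF

# diff exits 1 when the files differ, so append || true in scripts.
diff /tmp/declared-state.yaml /tmp/live-state.yaml || true
```

In the real investigation the same `<`/`>` lines pinpoint exactly which fields were touched, which is what drives the keep-or-revert decision in step 3.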

Strong Answer

"First: don't panic. The app works, so this isn't an outage. Config drift means the live cluster state diverged from what's declared in Git. I'd diff the live state against the Helm manifest to identify exactly what changed. Common drift: someone manually scaled replicas, added env vars for debugging, or patched a resource directly. Once identified, there are two paths: if the change was an intentional hotfix, I'd codify it in the values file, commit to Git, and let the GitOps pipeline sync it properly. If it was accidental, I'd just trigger a sync to revert to declared state. Either way, I'd then review team processes — we should have guardrails (RBAC, admission webhooks, or policy agents) to prevent direct kubectl changes in production. GitOps only works if Git is the single source of truth."
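The guardrail point in the answer can be spot-checked immediately: `kubectl auth can-i` reports whether a given subject could repeat the direct change. A sketch (the `dev-user` name is a hypothetical example, and `--as` needs impersonation permission, so this may have to run from an admin context):

```shell
# Check whether a hypothetical non-admin user could patch the Deployment
# or its scale subresource directly; "no" means RBAC already blocks this
# class of drift, "yes" means the process review has a concrete finding.
kubectl auth can-i patch deployments -n grokdevops --as=dev-user
kubectl auth can-i update deployments/scale -n grokdevops --as=dev-user
```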

Common Traps

  • Force-syncing without understanding the drift — the drift might be an active hotfix keeping prod running
  • Not investigating who/why — drift is a process failure signal
  • Ignoring the meta-problem — this should trigger an RBAC/process review
  • Not knowing the diff tools: comparing helm get manifest output against the live state

Resources

  • Lab: training/interactive/runtime-labs/lab-runtime-07-gitops-sync-and-drift/
  • Doc: training/library/guides/gitops-example.md
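Once the team agrees on process, ArgoCD itself can revert out-of-band changes automatically. A sketch of enabling automated sync with self-heal via the CLI (app name from the scenario; this changes production behavior, so it belongs behind the same approval as a force-sync):

```shell
# Enable automated sync plus self-heal: ArgoCD re-applies the Git state
# whenever live state drifts, turning "OutOfSync" into a self-correcting
# event rather than an alert that waits on a human.
argocd app set grokdevops --sync-policy automated --self-heal
```

The trade-off: self-heal would have silently reverted the hotfix in this scenario, which is exactly why the keep-or-revert decision has to become a Git commit first.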
