
Helm Ops


12 cards — 🟢 4 easy | 🟡 4 medium | 🔴 4 hard

🟢 Easy (4)

1. You changed a template but helm install gives a YAML parse error referencing a line number that does not match your template. Why?

Answer: The error line number refers to the rendered output, not the template source. Use helm template --debug to see the fully rendered YAML with line numbers and locate the actual breakage (usually a wrong nindent value or unquoted value injection).
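A minimal debugging workflow, assuming a hypothetical local chart at ./mychart and a release named my-release:

```shell
# Render locally without installing; error line numbers in install output
# correspond to line numbers in this rendered file, not the template source
helm template my-release ./mychart --debug > rendered.yaml

# Jump to the vicinity of the reported line (e.g. the error said line 42)
sed -n '38,46p' rendered.yaml
```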

2. You run --set replicaCount=true intending the string "true" but the template receives a boolean. How do you force a string?

Answer: Use --set-string replicaCount=true instead of --set. Helm's --set infers YAML types automatically (true becomes boolean, 123 becomes integer). The --set-string flag forces the value to remain a string regardless of content.
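A sketch of the difference, with a hypothetical chart at ./mychart:

```shell
# --set infers YAML types: true -> boolean, 123 -> integer
helm install demo ./mychart --set replicaCount=true

# --set-string keeps the literal string "true" regardless of content
helm install demo ./mychart --set-string replicaCount=true
```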

3. Your CI/CD pipeline runs helm install on every deploy and fails on the second run. What single command fixes this?

Answer: Use helm upgrade --install (the idempotent form). It installs the release if it does not exist, or upgrades it if it does. Combine with --atomic and --timeout for safe CI/CD deploys.
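A sketch of the idempotent deploy step; release name, chart path, and namespace are placeholders:

```shell
# Safe to run on every pipeline execution: installs on first run,
# upgrades afterwards, and rolls back automatically on failure
helm upgrade --install my-release ./mychart \
  --namespace my-namespace --create-namespace \
  --atomic --timeout 5m
```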

4. How do you retrieve the exact Kubernetes manifests that Helm applied for a specific release revision?

Answer: Run helm get manifest &lt;release-name&gt; --revision &lt;n&gt;. This outputs the rendered YAML manifests as they were applied for that revision. To see the values used, run helm get values &lt;release-name&gt; --revision &lt;n&gt;.
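Example invocations, assuming a release named my-release in namespace my-namespace and revision 3 (all placeholders):

```shell
# Manifests exactly as applied for revision 3
helm get manifest my-release --revision 3 -n my-namespace

# User-supplied values for that revision; add --all to include chart defaults
helm get values my-release --revision 3 -n my-namespace
helm get values my-release --revision 3 --all -n my-namespace
```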

🟡 Medium (4)

1. Your pre-upgrade hook Job fails on retry because the previous Job resource still exists. What annotation fixes this?

Answer: Add helm.sh/hook-delete-policy: before-hook-creation to the Job metadata. This tells Helm to delete the previous hook resource before creating a new one, preventing name-collision failures on retry.
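A sketch of the annotation in context; the Job name, image, and command are illustrative only:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pre-upgrade-task
  annotations:
    "helm.sh/hook": pre-upgrade
    # Delete the previous hook Job before creating a new one,
    # so a retried upgrade does not hit a name collision
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: task
          image: busybox
          command: ["sh", "-c", "echo running pre-upgrade task"]
```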

2. Explain the difference between --wait and --atomic on helm upgrade. When does --atomic add value over --wait alone?

Answer: --wait makes Helm wait for resources to become ready before marking success, but a failed deploy stays in the failed state. --atomic implies --wait and additionally auto-rolls back to the previous revision on failure or timeout. --atomic prevents the cluster from being left in a half-upgraded state.
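Side by side, with hypothetical release and chart names:

```shell
# --wait alone: blocks until resources are ready, but a failure
# leaves the release (and cluster) in the failed, half-upgraded state
helm upgrade my-release ./mychart --wait --timeout 5m

# --atomic: implies --wait, and rolls back to the previous revision
# automatically on failure or timeout
helm upgrade my-release ./mychart --atomic --timeout 5m
```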

3. An SRE manually scaled a Deployment to 5 replicas via kubectl. The Helm chart says 3 replicas. On next helm upgrade (chart unchanged), what happens to the replica count?

Answer: It stays at 5. Helm 3 uses a three-way merge comparing old manifest, new manifest, and live state. Since the old and new chart manifests both say 3 (no change in that field), Helm does not patch it. But if the chart changes replicas to 4, Helm patches it to 4, overwriting the manual edit.
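The scenario can be sketched as follows (names are hypothetical; the final count reflects the three-way-merge behavior described above):

```shell
# Manual drift: an SRE scales outside Helm
kubectl scale deployment/my-app --replicas=5

# Chart still says replicas: 3; since the field is unchanged between the
# old and new manifests, Helm does not patch it back
helm upgrade my-release ./mychart

# Inspect the live value: still 5 per the three-way merge
kubectl get deployment my-app -o jsonpath='{.spec.replicas}'
```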

4. A chart hardcodes namespace: monitoring in a ServiceMonitor template instead of using .Release.Namespace. What operational problem does this cause?

Answer: Helm tracks resources in the release namespace, but the ServiceMonitor is created in the monitoring namespace. When you run helm uninstall, Helm does not delete the ServiceMonitor because it only cleans up resources tracked in its release secret. The resource becomes orphaned. Always use {{ .Release.Namespace }} in templates.
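A sketch of the fix in a template; the resource name and selector are illustrative:

```yaml
# templates/servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-metrics
  # Follow the release namespace instead of hardcoding "monitoring",
  # so helm uninstall can track and delete the resource
  namespace: {{ .Release.Namespace }}
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
```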

🔴 Hard (4)

1. A release is stuck in pending-upgrade after a deploy crashed. helm rollback also fails. What is the recovery procedure?

Answer: Check helm history to identify the broken revision. As a last resort, delete the Helm release secret storing the broken state: kubectl delete secret sh.helm.release.v1.&lt;release-name&gt;.v&lt;N&gt; -n &lt;namespace&gt; (where N is the broken revision number). Then helm rollback to the last good revision, or helm upgrade --install to redeploy. This is destructive to Helm's state tracking, so use it only when rollback fails.
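The recovery steps above, sketched with placeholder names and revision numbers:

```shell
# 1. Identify the stuck revision (say revision 7 is pending-upgrade)
helm history my-release -n my-namespace

# 2. Last resort: delete the release secret tracking the broken revision
kubectl delete secret sh.helm.release.v1.my-release.v7 -n my-namespace

# 3. Roll back to the last good revision (say 6), or redeploy
helm rollback my-release 6 -n my-namespace
```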

2. You have a pre-upgrade hook Job running database migrations that is not idempotent. The Job fails partway through, and the SRE retries the upgrade. What goes wrong and how should the hook be redesigned?

Answer: The migration runs again from the start, potentially re-applying already-completed steps and corrupting data. Redesign: make migrations idempotent (use IF NOT EXISTS, migration versioning tables), set backoffLimit: 0 or 1 to prevent Kubernetes-level retries of the broken Job, add hook-delete-policy: before-hook-creation for clean retry, and set a hook-weight if ordering among multiple hooks matters.
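A sketch of a hook Job combining these safeguards; the image and args are illustrative, not a recommendation of a specific migration tool:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    "helm.sh/hook": pre-upgrade
    # Clean retry: remove the previous Job before creating a new one
    "helm.sh/hook-delete-policy": before-hook-creation
    # Explicit ordering if multiple hooks exist
    "helm.sh/hook-weight": "0"
spec:
  backoffLimit: 0            # no Kubernetes-level retries of a broken run
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example/db-migrate:1.0   # hypothetical image
          # Versioned, idempotent migrations: the tool tracks applied
          # steps in a schema-version table and skips completed ones
          args: ["up", "--path", "/migrations"]
```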

3. How does the helm-diff plugin help catch problems before a helm upgrade, and what is its key limitation regarding out-of-band changes?

Answer: helm diff upgrade shows a colored diff of what would change between the current deployed release and the proposed upgrade, without applying anything. This catches unintended changes from upstream chart updates, value drift, or template logic changes. Key limitation: it compares rendered manifests (old release vs new template), not live cluster state, so it may miss resources that were modified out-of-band via kubectl.
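Typical usage; the chart path and values file are placeholders:

```shell
# One-time plugin install
helm plugin install https://github.com/databus23/helm-diff

# Preview what the upgrade would change, without applying anything.
# Note: this diffs the stored release manifest against the new render,
# not live cluster state
helm diff upgrade my-release ./mychart -f values.yaml
```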

4. Your chart has a dependency on redis version 17.x. A colleague runs helm dependency update and the Chart.lock changes from 17.3.2 to 17.5.0. The deploy fails in staging. How should you manage dependency versions to prevent this?

Answer: Use helm dependency build instead of helm dependency update in CI/CD. dependency build uses the pinned versions in Chart.lock (which should be committed to git), while dependency update resolves fresh versions and rewrites the lock file. Pin exact versions in Chart.yaml when stability matters (version: "17.3.2" instead of "17.x"). Treat Chart.lock updates as deliberate changes that go through code review.
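A sketch of the CI-safe workflow, with a hypothetical chart directory:

```shell
# In Chart.yaml, pin exact versions when stability matters:
#   dependencies:
#     - name: redis
#       version: "17.3.2"    # not "17.x"

# CI/CD: fetch exactly what Chart.lock pins (commit Chart.lock to git)
helm dependency build ./mychart

# Developers only, as a deliberate change that goes through code review:
# resolves fresh versions and rewrites Chart.lock
helm dependency update ./mychart
```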