
Decision Tree: Container Running as Root

Category: Security Response
Starting Question: "I found a container running as root — what's the risk and what do I do?"
Estimated traversal: 2-5 minutes
Domains: security, kubernetes, containers, devops


The Tree

I found a container running as root → what's the risk and what do I do?
│
├── Is this in production, staging, or development?
│   │
│   ├── Development / local / CI runner only
│   │   └── Lower urgency — but still fix it so bad habits don't reach prod
│   │       Document the exception + schedule the fix; skip to "Fix with Image Change"
│   │
│   └── Production or staging → proceed with full triage below
│
├── Is the container privileged? (`--privileged` flag or `privileged: true` in securityContext)
│   │
│   │  ```bash
│   │  # Check all pods for privileged containers
│   │  kubectl get pods --all-namespaces -o json | \
│   │    jq -r '.items[] | select(.spec.containers[].securityContext.privileged==true) |
│   │           "\(.metadata.namespace)/\(.metadata.name)"'
│   │  # Check a specific pod
│   │  kubectl get pod <pod> -o jsonpath='{.spec.containers[*].securityContext.privileged}'
│   │  ```
│   │
│   ├── YES → container is privileged
│   │   │
│   │   ├── Does it also mount the host filesystem or Docker socket?
│   │   │   │
│   │   │   │  ```bash
│   │   │   │  kubectl get pod <pod> -o json | jq '.spec.volumes[] | select(.hostPath)'
│   │   │   │  kubectl get pod <pod> -o json | \
│   │   │   │    jq '.spec.volumes[] | select(.hostPath.path=="/var/run/docker.sock")'
│   │   │   │  ```
│   │   │   │
│   │   │   ├── YES → privileged + host mount
│   │   │   │   └── ACTION: CRITICAL — Escalate + Contain
│   │   │   │       This is full host escape capability.
│   │   │   │       An attacker with a shell in this container owns the node.
│   │   │   │
│   │   │   └── NO → privileged but no host mount
│   │   │       └── Still critical → privileged containers can use many kernel exploits
│   │   │           └── ACTION: Escalate + Fix (High Priority)
│   │   │               Remove the privileged flag; find out why it was needed
│   │   │
│   │   └── Why does the container need privileged mode?
│   │       (Network tools? sysctl changes? Direct hardware access? Legacy reason?)
│   │       │
│   │       ├── No clear reason → someone added it for convenience
│   │       │   └── Remove immediately; it was never needed
│   │       │
│   │       └── Legitimate reason (e.g., node agent, CNI plugin, eBPF tool)
│   │           └── Document the exception formally with security team sign-off
│   │               Add an admission controller rule to limit it to known namespaces
│   │
│   └── NO → container is not privileged (just running as UID 0)
│       │
│       ├── Does it mount the host filesystem?
│       │   │
│       │   │  ```bash
│       │   │  kubectl get pod <pod> -o json | \
│       │   │    jq '.spec.volumes[] | select(.hostPath) | .hostPath.path'
│       │   │  ```
│       │   │
│       │   ├── YES → hostPath mount
│       │   │   │
│       │   │   ├── Is the mount path a critical system path?
│       │   │   │   (/, /etc, /var/run, /proc, /sys, /root, /home, /usr)
│       │   │   │   │
│       │   │   │   ├── YES → mounts a critical host path
│       │   │   │   │   └── ACTION: Escalate + Fix (High Priority)
│       │   │   │   │       Root in container + write access to a host path = host escape
│       │   │   │   │
│       │   │   │   └── NO → mounts a non-critical path (e.g., /tmp/hostdir, /data/logs)
│       │   │   │       └── Moderate risk → fix the root user issue anyway
│       │   │   │           ACTION: Fix Now → add runAsNonRoot
│       │   │   │
│       │   │   └── Is the mount read-only?
│       │   │       `kubectl get pod <pod> -o json | jq '.spec.containers[].volumeMounts[] | select(.readOnly)'`
│       │   │       │
│       │   │       ├── readOnly: true → risk significantly reduced
│       │   │       │   └── ACTION: Fix the root user issue; the mount risk is mitigated
│       │   │       │
│       │   │       └── Not read-only → writable host mount
│       │   │           └── ACTION: Fix Now → both the root user and the mount policy
│       │   │
│       │   └── NO → no host mounts
│       │       └── Non-privileged root in a container with no host access
│       │           Lower blast radius, but still needs fixing
│       │
│       ├── Does the container need root for a legitimate reason?
│       │   (Binding a port < 1024? Volume ownership on root-owned paths?
│       │    Running a legacy init system? Network namespace manipulation?)
│       │   │
│       │   ├── YES → legitimate need for root
│       │   │   │
│       │   │   ├── Can the need be addressed without root?
│       │   │   │   - Port < 1024: use the `CAP_NET_BIND_SERVICE` capability only
│       │   │   │   - File ownership: use an `initContainer` to chown, then drop to non-root
│       │   │   │   - Init system: use tini or dumb-init as non-root
│       │   │   │   │
│       │   │   │   ├── YES → workaround exists
│       │   │   │   │   └── ACTION: Fix with Capability Drop + Non-Root User
│       │   │   │   │
│       │   │   │   └── NO → genuinely requires root (rare)
│       │   │   │       └── ACTION: Document Exception Formally
│       │   │   │           Apply compensating controls (read-only root FS,
│       │   │   │           seccomp profile, AppArmor, NetworkPolicy)
│       │   │   │
│       │   │   └── Is there a working non-root alternative?
│       │   │       (Official non-root image variant? Distroless? UBI minimal?)
│       │   │       │
│       │   │       ├── YES → ACTION: Switch to the non-root image variant
│       │   │       └── NO → modify the Dockerfile to add a non-root user
│       │   │
│       │   └── NO → no legitimate reason for root
│       │       │
│       │       ├── Is it a simple fix (image already supports non-root)?
│       │       │   `docker inspect <image> | jq '.[].Config.User'`
│       │       │   │
│       │       │   ├── User already set in the image (e.g., "1000" or "appuser")
│       │       │   │   └── ACTION: Fix Now → add securityContext to the manifest
│       │       │   │       Just add runAsNonRoot + runAsUser to the pod spec
│       │       │   │
│       │       │   └── User not set in the image → image runs as root by default
│       │       │       └── ACTION: Fix with Image Change
│       │       │           Modify the Dockerfile to add a non-root user
│       │       │
│       │       └── Is a PSA/OPA/Kyverno policy supposed to block this?
│       │           │
│       │           │  ```bash
│       │           │  # Pod Security Admission (PSA) — replaced PSP, which was removed in K8s 1.25
│       │           │  kubectl label --dry-run=server --overwrite ns <namespace> \
│       │           │    pod-security.kubernetes.io/enforce=restricted
│       │           │  kubectl get ns <namespace> -o jsonpath='{.metadata.labels}' | grep pod-security
│       │           │  kubectl get clusterpolicy 2>/dev/null       # Kyverno (cluster-scoped)
│       │           │  kubectl get constrainttemplate 2>/dev/null  # OPA Gatekeeper
│       │           │  ```
│       │           │
│       │           ├── Policy exists but this container got through
│       │           │   └── ACTION: Fix Policy Gap → check the enforcement mode
│       │           │       Warn vs Enforce; namespace exemptions?
│       │           │
│       │           └── No policy in place → add one
│       │               └── ACTION: Add Admission Controller Policy
│
└── What is the blast radius if this container is compromised?
    │
    ├── High blast radius (data store, payment processing, secret access, internet-facing)
    │   └── Treat as P1 → fix within hours, not days
    │
    └── Low blast radius (internal utility, read-only job, isolated namespace)
        └── Fix within the sprint → document, assign, track

Node Details

Check 1: Is the container privileged?

Command/method:

# Find all privileged containers across the cluster
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | . as $pod |
    .spec.containers[] | select(.securityContext.privileged==true) |
    "\($pod.metadata.namespace)/\($pod.metadata.name)/\(.name)"'

# Check initContainers too
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | . as $pod |
    .spec.initContainers[]? | select(.securityContext.privileged==true) |
    "INIT: \($pod.metadata.namespace)/\($pod.metadata.name)/\(.name)"'
What you're looking for: The privileged: true flag gives the container nearly full access to the host kernel — all devices, all capabilities, no seccomp filtering. It is equivalent to running as root on the host.

Common pitfall: The privileged flag is set per container (spec.containers[].securityContext); there is no pod-level equivalent. Check every container in the pod — including initContainers and ephemeral containers — not just the first one.
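One subtlety worth illustrating: runAsUser and runAsNonRoot can be set at both the pod and container level (the container-level value wins), while privileged exists only per container. The sketch below uses hypothetical names to show a pod whose top-level spec looks safe while one container still runs as privileged root:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: override-demo          # hypothetical
spec:
  securityContext:
    runAsUser: 1000            # pod-level default: looks non-root at a glance
  containers:
    - name: app
      image: example.com/app:1.0   # hypothetical
      securityContext:
        runAsUser: 0           # container-level value overrides the pod default
        privileged: true       # only settable here; never appears at pod level
```

This is why the jq queries above iterate over `.spec.containers[]` rather than inspecting `.spec.securityContext`.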

Check 2: Does it mount the Docker socket or host filesystem?

Command/method:

# Check for Docker socket mount (full Docker daemon access = host escape)
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(.spec.volumes[]?.hostPath.path=="/var/run/docker.sock") |
          "\(.metadata.namespace)/\(.metadata.name)"'

# Check for any hostPath mounts
kubectl get pod <pod> -n <namespace> -o json | \
  jq '.spec.volumes[] | select(.hostPath) | {name: .name, path: .hostPath.path}'

# Check which containers mount which volumes
kubectl get pod <pod> -n <namespace> -o json | \
  jq '.spec.containers[] | {container: .name, mounts: .volumeMounts}'
What you're looking for: /var/run/docker.sock is the worst case — any process that can talk to the Docker daemon can escape to the host. Any hostPath mount to a sensitive directory is also dangerous; even read-only mounts to /etc expose sensitive configuration.

Common pitfall: Volumes are defined at the pod level but mounted at the container level. A pod may define a hostPath volume but only mount it in one of its containers. Check both the volume definitions and each container's volumeMounts.

Check 3: What does the container need root for?

Command/method:

# Check what capabilities the process actually uses
# Run the container and inspect
# Use --entrypoint="" so the command below runs instead of the image's entrypoint
docker run --rm --entrypoint="" <image> id
docker run --rm --entrypoint="" <image> cat /proc/1/status | grep Cap

# Decode capability hex to human-readable
capsh --decode=<hex>

# Check if port binding is the reason (ports < 1024 require root OR CAP_NET_BIND_SERVICE)
kubectl get pod <pod> -o json | jq '.spec.containers[].ports[]'

# Check if init system is the reason
kubectl get pod <pod> -o json | jq '.spec.containers[].command'
What you're looking for: The specific kernel capability or system operation that requires root. In most cases the real need is one narrow capability (like NET_BIND_SERVICE), not full root. Root is rarely actually required.

Common pitfall: "It's always been this way" is not a legitimate reason for root. The original author may have added root out of convenience. Actually test whether the container works as non-root before assuming it cannot.
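When capsh is not installed in a minimal image, the CapEff bitmap from /proc/1/status can be tested for a single capability with plain shell arithmetic. A minimal sketch — the hex value is hard-coded for illustration; in a live container you would read it with `grep CapEff /proc/1/status`:

```shell
#!/usr/bin/env bash
# Check whether a capability bitmap includes CAP_NET_BIND_SERVICE (bit 10).
cap_hex="0000000000000400"     # example bitmap, hard-coded for illustration
cap_dec=$((16#$cap_hex))       # hex string -> integer
if (( (cap_dec >> 10) & 1 )); then
  echo "CAP_NET_BIND_SERVICE present"
else
  echo "CAP_NET_BIND_SERVICE absent"
fi
```

The same shift-and-mask works for any capability number listed in capability.h (e.g. bit 21 for CAP_SYS_ADMIN).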

Check 4: Is there a policy that should have blocked this?

Command/method:

# Kyverno — check for require-run-as-non-root policy
kubectl get clusterpolicy require-run-as-non-root 2>/dev/null -o yaml | grep action

# OPA Gatekeeper — list constraint templates
kubectl get constrainttemplate k8srequiredusers 2>/dev/null

# Pod Security Standards (Kubernetes 1.23+)
kubectl get ns <namespace> -o json | jq '.metadata.labels | to_entries[] | select(.key | contains("pod-security"))'

# PSP was removed in K8s 1.25 — use Pod Security Admission (PSA) instead
# kubectl get psp restricted -o yaml  # no longer available on 1.25+

# Check if namespace is exempt from any policy
kubectl get ns <namespace> --show-labels
What you're looking for: Whether there is a policy in place and whether it is in warn or enforce mode. A policy in warn mode will log violations but not block them — this is how containers running as root slip through despite "having a policy."

Common pitfall: Namespaces created before the policy was applied may be exempt. Check that the admission controller applies to the namespace in question, not just cluster-wide in theory.


Terminal Actions

✅ Action: Fix Now — Add securityContext.runAsNonRoot: true

Do:

1. Verify the image has a non-root user defined:

   docker run --rm --entrypoint="" <image> id
   # Should show a non-root UID (not 0)

2. Edit the deployment:

   kubectl edit deployment <name> -n <namespace>

   Add to the container spec:

   securityContext:
     runAsNonRoot: true
     runAsUser: 1000
     allowPrivilegeEscalation: false
     readOnlyRootFilesystem: true
     capabilities:
       drop: ["ALL"]

3. Apply and verify the rollout:

   kubectl rollout status deployment/<name>
   kubectl get pods -l app=<name> -o jsonpath='{.items[*].spec.containers[*].securityContext}'

4. Run a health check: curl -s http://<service>/health

Verify: kubectl get pod <pod> -o json | jq '.spec.containers[].securityContext' shows runAsNonRoot: true. Pod is Running and healthy. Application logs show no permission errors.
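For clusters managed through automation, the interactive kubectl edit can be replaced with a patch file kept in version control and applied with `kubectl patch deployment <name> --patch-file securitycontext-patch.yaml`. A minimal sketch — the container name "app" is hypothetical and must match the existing container; with a strategic-merge patch the containers list is merged by the name key, so only that container is changed:

```yaml
# securitycontext-patch.yaml — strategic-merge patch (hypothetical container name)
spec:
  template:
    spec:
      containers:
        - name: app                       # must match the container in the deployment
          securityContext:
            runAsNonRoot: true
            runAsUser: 1000
            allowPrivilegeEscalation: false
```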

✅ Action: Fix with Image Change — Modify Dockerfile

Do:

1. Add a non-root user to the Dockerfile:

   # At the end of the build stage, before the final CMD:
   RUN groupadd --gid 1000 appgroup && \
       useradd --uid 1000 --gid appgroup --shell /bin/bash --create-home appuser

   # Fix ownership of any files the app needs to write
   RUN chown -R appuser:appgroup /app /tmp

   USER appuser

2. Build and test locally:

   docker build -t <image>:nonroot-test .
   docker run --rm --user 1000 <image>:nonroot-test id
   docker run --rm --user 1000 <image>:nonroot-test <entrypoint-command>

3. Push and deploy; monitor for file permission errors in logs

Verify: docker run --rm <image>:new id shows uid=1000. Container starts and passes the health check. No "permission denied" in application logs.
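The groupadd/useradd commands above assume a Debian/Ubuntu-family base image. Alpine images ship the BusyBox tools, which use different names and flags; a minimal sketch of the equivalent (the base image tag and /app path are illustrative):

```dockerfile
# Alpine/BusyBox equivalent of the Debian-style user creation above.
FROM alpine:3.19
RUN addgroup -g 1000 appgroup && \
    adduser -u 1000 -G appgroup -s /bin/sh -D appuser && \
    mkdir -p /app && chown -R appuser:appgroup /app
USER appuser
WORKDIR /app
```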

✅ Action: Escalate + Contain (Privileged + Host Mount)

Do:

1. Immediately notify the security team — this is a critical exposure
2. Assess whether the container has been compromised:

   # Check for unexpected processes
   kubectl exec <pod> -- ps aux
   # Check for outbound connections
   kubectl exec <pod> -- ss -tnp
   # Check for recently written files in sensitive locations
   kubectl exec <pod> -- find /host -newer /tmp -type f 2>/dev/null | head -20

3. If no evidence of compromise: apply a NetworkPolicy to restrict egress immediately
4. Deploy a replacement pod with a correct security context; drain the vulnerable pod
5. Remove the privileged flag and host mount from the manifest; redeploy
6. Add an admission controller rule to block privileged + hostPath combinations

Verify: New pod runs without the privileged flag. No host mounts. Admission controller blocks future equivalent deployments.
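The egress restriction in step 3 can be sketched as follows. The namespace, policy name, and pod label are hypothetical; the single DNS rule keeps name resolution working while blocking everything else — drop it too if the workload needs no egress at all:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine-egress        # hypothetical
  namespace: prod                # hypothetical
spec:
  podSelector:
    matchLabels:
      app: suspect-workload      # hypothetical label on the affected pod
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53               # allow DNS only; remove this rule to block all egress
```

Note that NetworkPolicy is enforced by the CNI plugin; on a cluster whose CNI does not implement it, the policy is silently ignored.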

✅ Action: Fix with Capability Drop + Non-Root User

Do:

1. Identify the required capability (example: binding port 443 needs NET_BIND_SERVICE)
2. Drop all capabilities and add back only the needed one:

   securityContext:
     runAsNonRoot: true
     runAsUser: 1000
     allowPrivilegeEscalation: false
     capabilities:
       drop: ["ALL"]
       add: ["NET_BIND_SERVICE"]  # Only if required for port binding

3. Alternatively, for port binding: use a Service to expose port 443 externally while the container listens on 8443
4. Test that the application starts and functions correctly

Verify: Pod running as UID 1000. Application serving correctly. kubectl exec <pod> -- id shows a non-root UID.
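The alternative in step 3 — listening on an unprivileged port behind a Service — needs no capability at all. A minimal sketch with hypothetical names:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp              # hypothetical
spec:
  selector:
    app: myapp             # hypothetical pod label
  ports:
    - name: https
      port: 443            # what clients connect to
      targetPort: 8443     # what the non-root container actually listens on
```

This is usually the cleaner design: the privileged-port requirement disappears entirely instead of being worked around with a capability grant.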

✅ Action: Add Admission Controller Policy

Do:

1. Install Kyverno if not present:

   helm install kyverno kyverno/kyverno -n kyverno --create-namespace

2. Apply a require-non-root policy:

   apiVersion: kyverno.io/v1
   kind: ClusterPolicy
   metadata:
     name: require-run-as-non-root
   spec:
     validationFailureAction: enforce
     rules:
       - name: check-runAsNonRoot
         match:
           resources:
             kinds: [Pod]
         validate:
           message: "Containers must not run as root. Set securityContext.runAsNonRoot=true."
           pattern:
             spec:
               containers:
                 - securityContext:
                     runAsNonRoot: true

3. Start in audit mode (change enforce to audit) and review violations before switching to enforce

Verify: kubectl apply of a root-running pod is blocked with the policy message. Existing pods are reported as violations in audit mode.

⚠️ Escalation: Privileged Container with Host Mount

When: Container is running privileged AND has a host filesystem or Docker socket mount — especially in production.

Who: Security team lead immediately; CISO if there is any evidence of active exploitation.

Include in page: Pod name, namespace, node name, mounted volumes and their paths, whether the container is internet-facing, any evidence of exploitation (unusual processes, outbound connections), and the proposed containment plan.


Edge Cases

  • The container is a DaemonSet for a legitimate node agent (Datadog, Falco, Fluent Bit, CNI plugin): These legitimately need host access. Document the exception in your security posture, apply the most restrictive securityContext that still allows function, and ensure the workload is from a trusted image with a pinned digest.
  • The container is running as root but readOnlyRootFilesystem: true is set: This reduces but does not eliminate risk. A read-only root filesystem still allows exploiting kernel vulnerabilities. Fix the root user issue separately.
  • The Dockerfile is not in your control (third-party image): Use a runAsUser override in the pod spec: securityContext.runAsUser: 1000. This may fail if the image has files owned only by root — test carefully and check for permission errors.
  • The container has been running as root for years with no incidents: This is not evidence of safety; it is evidence of being lucky. Apply the fix — the blast radius does not decrease over time.
  • PSP is deprecated in your Kubernetes version: PodSecurityPolicy was removed in Kubernetes 1.25. Migrate to Pod Security Standards (namespace labels) or a third-party admission controller (Kyverno, OPA Gatekeeper).
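The third-party-image override mentioned above can be done entirely from the pod spec. A minimal sketch — the UID/GID values and container name are illustrative; pick IDs the image's file permissions tolerate. fsGroup is the key addition: it makes mounted volumes group-owned by the chosen GID, which covers the common "root-owned volume" failure mode:

```yaml
# Pod-spec-only override for an image you cannot rebuild.
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000        # illustrative UID
    runAsGroup: 1000
    fsGroup: 1000          # volumes get this GID, so the app can still write to them
  containers:
    - name: thirdparty     # hypothetical
      image: vendor/image:tag   # placeholder
```

fsGroup does not help with files baked into the image itself; if those are root-owned and must be writable, the image has to be rebuilt or an initContainer used.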

Cross-References