Portal | Level: L2: Operations | Topics: Kubernetes Networking | Domain: Kubernetes

Scenario: Ingress Returns 404 Intermittently

The Prompt

"Users report getting 404 errors intermittently when accessing our app through the ingress. Some requests work, some don't. The app pods are all running. What's going on?"

Initial Report

Customer support escalation: "Multiple customers are reporting that the app loads sometimes but gives a 'Not Found' page other times. It seems random. Started about 30 minutes ago."

Constraints

Time pressure: You have 15 minutes before the next escalation. Customer-facing errors are actively occurring.
Limited access: You have read access to the ingress and application namespace. Modifying the ingress controller (Traefik in kube-system) requires platform team involvement.

Observable Evidence

Dashboard: HTTP 404 rate is at ~30% of total requests. 200 responses still account for ~70%. Error rate correlates with specific backend pod IPs.
Ingress controller logs: Traefik logs show 404 page not found responses originating from one specific upstream pod IP.
Endpoints: kubectl get endpoints grokdevops -n grokdevops shows 3 pod IPs. One of those pods was recently restarted and may be serving stale routes.
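
To confirm that the 404s really do cluster on one upstream, it can help to tally status codes per pod IP. The following is a minimal sketch: tally_404_by_upstream is a hypothetical helper, and it assumes the access-log entries have already been reduced to "upstream_ip status_code" pairs. How to extract those two fields depends on the Traefik access-log format in use, which this scenario does not specify.

```shell
# Hypothetical helper (not part of kubectl or Traefik).
# Reads "upstream_ip status_code" pairs on stdin and prints, per upstream,
# the total request count, the 404 count, and the 404 share in percent.
tally_404_by_upstream() {
  awk '
    NF >= 2 {
      total[$1]++
      if ($2 == 404) notfound[$1]++
    }
    END {
      for (ip in total)
        printf("%s total=%d 404=%d pct=%d\n",
               ip, total[ip], notfound[ip] + 0, (notfound[ip] * 100) / total[ip])
    }
  ' | sort
}
```

If one IP shows a 404 share near 100% while the others sit near 0%, that matches the dashboard's ~30% overall error rate across three pods and points at a single misbehaving backend.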

Expected Investigation Path

# 1. Check ingress config
kubectl get ingress -n grokdevops -o yaml

# 2. Check endpoints — are all pods in the endpoint list?
kubectl get endpoints grokdevops -n grokdevops
kubectl get pods -n grokdevops -o wide

# 3. Check if some pods are not Ready
kubectl get pods -n grokdevops -o custom-columns='NAME:.metadata.name,READY:.status.containerStatuses[0].ready,IP:.status.podIP'

# 4. Check ingress controller logs
kubectl logs -n kube-system -l app.kubernetes.io/name=traefik --tail=50

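# 5. Test individual pods
# Note: kubectl run needs permission to create pods in the namespace.
for ip in $(kubectl get endpoints grokdevops -n grokdevops -o jsonpath='{.subsets[*].addresses[*].ip}'); do
  echo "Testing $ip:"
  # Print only the HTTP status code so a 404 from one pod stands out.
  # If /health returns 200 everywhere, repeat with a path users report as failing.
  kubectl run "curl-test-$RANDOM" -n grokdevops --rm -i --restart=Never --image=curlimages/curl -- curl -s -o /dev/null -w '%{http_code}\n' "http://$ip:8000/health"
done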
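
Step 2 asks whether every pod appears in the endpoint list. A rough way to automate that comparison is sketched below. ips_missing_from_endpoints is a hypothetical helper, and it assumes both lists are space-separated, which is how kubectl's jsonpath output prints them.

```shell
# Hypothetical helper: prints each pod IP that is absent from the endpoint list.
# $1 = space-separated pod IPs, $2 = space-separated endpoint IPs.
ips_missing_from_endpoints() {
  for ip in $1; do
    case " $2 " in
      *" $ip "*) ;;                 # IP is present in the endpoint list
      *) printf '%s\n' "$ip" ;;     # IP is missing from the endpoint list
    esac
  done
}

# Possible usage against the cluster (assumes every pod in the namespace
# is meant to back the grokdevops Service):
#   pod_ips=$(kubectl get pods -n grokdevops -o jsonpath='{.items[*].status.podIP}')
#   ep_ips=$(kubectl get endpoints grokdevops -n grokdevops -o jsonpath='{.subsets[*].addresses[*].ip}')
#   ips_missing_from_endpoints "$pod_ips" "$ep_ips"
```

An empty result means all pods are in the endpoint list, so the ingress can reach every one of them, including any pod that is Ready but serving bad routes.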

Strong Answer

"Intermittent 404s with the ingress suggest the ingress controller is load-balancing across pods, and some of them are returning 404. This could mean: (1) some pods are running a different version with different routes — check if a rolling update is in progress; (2) the readiness probe is too lenient — pods are marked Ready before routes are registered; (3) one pod has a corrupted or misconfigured state. I'd check the endpoint list to see which pod IPs are included, then test each pod directly. If only some pods return 404, I'd compare their logs and configs. If all pods work individually, the issue might be in the ingress path configuration — specifically, pathType: Prefix vs Exact can cause unexpected behavior, and trailing slashes can matter depending on the ingress controller."
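
Hypothesis (1), mixed versions during a rolling update, can be checked by counting how many pods run each image. The sketch below uses a hypothetical helper, summarize_images, and assumes its input is "pod_name image" pairs, one per line.

```shell
# Hypothetical helper: reads "pod_name image" pairs on stdin and prints
# one line per distinct image as "<pod count> <image>", most common first.
summarize_images() {
  awk '
    NF >= 2 { count[$2]++ }
    END { for (img in count) printf("%d %s\n", count[img], img) }
  ' | sort -rn
}

# Possible usage against the cluster (only the first container's image is inspected):
#   kubectl get pods -n grokdevops --no-headers \
#     -o custom-columns='NAME:.metadata.name,IMAGE:.spec.containers[0].image' \
#     | summarize_images
```

More than one output line means old and new versions are serving traffic side by side, which would explain why only a fraction of requests fail.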

Common Traps

Blaming DNS — DNS issues cause connection failures, not 404s
Not testing individual pods — you need to isolate which pod(s) are misbehaving
Ignoring rolling updates — during an update, old and new pods coexist
Not checking pathType — Prefix vs Exact is a subtle but common source of 404s
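
The Prefix versus Exact distinction is easier to reason about with a small model. The function below is a sketch of the matching rules described in the Kubernetes Ingress documentation: Exact is a case-sensitive full match, and Prefix matches element by element on the path split by "/". ingress_path_matches is a hypothetical name, and individual controllers, Traefik included, may deviate in edge cases, so treat it as an aid for thinking rather than a statement of what the controller does.

```shell
# Hypothetical model of Ingress path matching.
# Usage: ingress_path_matches <Exact|Prefix> <rule_path> <request_path>
# Returns 0 on match, 1 on no match, 2 on an unknown path type.
ingress_path_matches() {
  ipm_type=$1
  ipm_rule=$2
  ipm_req=$3
  case $ipm_type in
    Exact)
      [ "$ipm_req" = "$ipm_rule" ]
      ;;
    Prefix)
      ipm_rule=${ipm_rule%/}               # a trailing slash on the rule is ignored
      [ -z "$ipm_rule" ] && return 0       # the rule "/" matches every path
      case $ipm_req in
        "$ipm_rule"|"$ipm_rule"/*) return 0 ;;  # same path, or a sub-path element
        *) return 1 ;;                          # e.g. /foobar does not match /foo
      esac
      ;;
    *)
      return 2
      ;;
  esac
}
```

For example, a rule of /foo with pathType Exact rejects a request for /foo/, while the same rule with pathType Prefix accepts it. A path mismatch like this would normally fail every request for that URL, though, so it fits the intermittent pattern only if clients vary the URLs they request.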

Practice and Links

Runbook: training/library/runbooks/kubernetes/ingress_404.md
Drills: training/library/drills/kubectl_drills.md — Drill 7 (endpoints), Drill 23 (ingress rules)
Quest: training/interactive/exercises/levels/level-24/k8s-ingress/

API Gateways & Ingress (Topic Pack, L2) — Kubernetes Networking
Case Study: CNI Broken After Restart (Case Study, L2) — Kubernetes Networking
Case Study: Canary Deploy Routing to Wrong Backend — Ingress Misconfigured (Case Study, L2) — Kubernetes Networking
Case Study: CoreDNS Timeout Pod DNS (Case Study, L2) — Kubernetes Networking
Case Study: Grafana Dashboard Empty — Prometheus Blocked by NetworkPolicy (Case Study, L2) — Kubernetes Networking
Case Study: Service Mesh 503s — Envoy Misconfigured, RBAC Policy (Case Study, L2) — Kubernetes Networking
Case Study: Service No Endpoints (Case Study, L1) — Kubernetes Networking
Cilium & eBPF Networking (Topic Pack, L2) — Kubernetes Networking
Deep Dive: Kubernetes Networking (deep_dive, L2) — Kubernetes Networking
Docker Networking Flashcards (CLI) (flashcard_deck, L1) — Kubernetes Networking
