Pattern: Port-Forward as Permanent Fix
ID: FP-049 | Family: Human Error Amplifier | Frequency: Common | Blast Radius: Single Service | Detection Difficulty: Obvious (when it breaks again)
The Shape
During an incident, an engineer uses kubectl port-forward to bypass a broken Ingress,
Service, or load balancer. Traffic flows; the immediate crisis is resolved. The engineer
closes their laptop for the night. The port-forward process (which lives in the engineer's
terminal session) terminates. The service becomes unreachable again. The fix was ephemeral
but was treated as permanent, creating a false sense of resolution and a repeat incident
hours later.
How You'll See It
In Kubernetes
```bash
# "Fix" during 3am incident:
kubectl port-forward svc/payment-service 8080:80 &
# Payment service accessible. Monitoring turns green. Incident resolved?

# 4 hours later (engineer's session ended):
# Connection refused. Incident re-opens.
```
In Linux/Infrastructure
SSH tunnel (ssh -L 5432:db-server:5432 jump-host) used to connect an application to
a database when the direct connection was broken. The tunnel worked; the team moved on.
The SSH connection dropped (idle timeout, network issue). The application lost database
connectivity again.
In CI/CD
A CI step uses kubectl port-forward to reach a service under test. The port-forward
runs as a background process. If it exits before the test finishes (for example, because
the pod it is attached to restarts or an idle connection times out), the test fails
intermittently.
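One way to turn the intermittent CI failure into a clear one is to wait until the forward accepts connections, run the tests, and then confirm the forward is still alive. The following is a minimal sketch, not a definitive implementation: it assumes bash (it relies on bash's /dev/tcp redirection), and `wait_for_port` and `run_with_forward` are hypothetical helper names.

```bash
#!/usr/bin/env bash

# Poll until something accepts TCP connections on host:port, or give up.
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30} i=0
  while [ "$i" -lt "$tries" ]; do
    # bash-only: the redirect succeeds only if the TCP connect succeeds.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 0.2
    i=$((i + 1))
  done
  return 1
}

# Start a port-forward, run the given command, and always clean up.
run_with_forward() {
  local target=$1 local_port=$2 remote_port=$3 status=0
  shift 3
  kubectl port-forward "$target" "${local_port}:${remote_port}" >/dev/null 2>&1 &
  local pf_pid=$!
  if wait_for_port 127.0.0.1 "$local_port" 50; then
    "$@" || status=$?
  else
    echo "port-forward to ${target} never became ready" >&2
    status=1
  fi
  # Report a dead forward explicitly instead of leaving a confusing test failure.
  if ! kill -0 "$pf_pid" 2>/dev/null; then
    echo "port-forward exited before the tests finished" >&2
    status=1
  fi
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}
```

A CI step could then call something like `run_with_forward svc/payment-service 8080 80 ./run-integration-tests.sh`, where the test script name is a placeholder.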
The Tell
The incident was "resolved" but the same incident recurred hours later. No changes were made to Ingress, Service, or NetworkPolicy resources. A port-forward process was running during the "resolved" window. The recurrence happened exactly when the engineer's session or terminal was closed.
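To check whether a bypass is still carrying traffic, it can help to list long-running port-forward or tunnel processes on the workstation or bastion involved. A rough sketch follows; `find_bypasses` is a hypothetical helper name, and the `ps` invocation in the comment assumes a `ps` that supports `-eo`.

```bash
#!/usr/bin/env bash

# Filter a process listing (one command line per input line) down to entries
# that look like manual bypasses: kubectl port-forward or ssh -L tunnels.
find_bypasses() {
  grep -E 'kubectl( .*)? port-forward|ssh( .*)? -L ' || true
}

# Typical use, showing how long each bypass has been running:
#   ps -eo pid,user,etime,args | find_bypasses
```

If this prints anything while the incident is marked resolved, the "resolution" may depend on that process staying alive.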
Common Misdiagnosis
| Looks Like | But Actually | How to Tell the Difference |
|---|---|---|
| Intermittent network issue | Port-forward died | Correlates with terminal session ending, not with network events |
| Service instability | Unstable fix (port-forward) | Service itself is stable; the bypass mechanism is what's intermittent |
| Resolved incident re-opening | Was never fixed; bypass expired | Root cause (Ingress/Service issue) is still present in the config |
The Fix (Generic)
- Immediate: Fix the actual Ingress/Service/NetworkPolicy/DNS issue; use port-forward only for diagnosis, never as a fix.
- Short-term: Before closing an incident, validate that traffic flows through the actual production path (not via port-forward); run an end-to-end test that exercises the real path.
- Long-term: Add to incident runbooks: "port-forward is a diagnostic tool, not a fix; the incident is not resolved until traffic flows through the actual production path (Ingress, Service, load balancer)."
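The short-term check above can be sketched as a small guard that refuses to count a localhost URL as proof of recovery. This is an illustration under assumptions: `url_host`, `is_local_url`, and `verify_real_path` are hypothetical helper names, `https://payments.example.com/health` is a placeholder URL, and `curl` is assumed to be installed.

```bash
#!/usr/bin/env bash

# Extract the host part of a URL (scheme, path, and port removed).
url_host() {
  local rest=${1#*://}
  rest=${rest%%/*}
  printf '%s\n' "${rest%%:*}"
}

# Return 0 if the URL points at the local machine, where a port-forward listens.
is_local_url() {
  case "$1" in *"://[::1]"*) return 0 ;; esac
  case "$(url_host "$1")" in
    localhost|127.*) return 0 ;;
  esac
  return 1
}

# Succeed only if a non-local URL answers without an HTTP error status.
verify_real_path() {
  local url=$1
  if is_local_url "$url"; then
    echo "refusing to verify against ${url}: looks like a port-forward" >&2
    return 2
  fi
  curl --fail --silent --show-error --max-time 10 --output /dev/null "$url"
}

# Example, with a placeholder URL:
#   verify_real_path https://payments.example.com/health && echo "real path OK"
```

Returning a distinct exit code for the localhost case makes it easy for a runbook script to tell "checked the wrong thing" apart from "the real path is still down".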
Real-World Examples
- Example 1: Ingress controller misconfigured after a certificate renewal. Engineer port-forwarded directly to the pod. "Incident resolved." 6 hours later, engineer's laptop closed, port-forward died, Ingress still broken. Incident re-opened.
- Example 2: Service selector label mismatch (pod labels changed but the Service selector wasn't updated). Port-forwarding straight to the pod bypassed the selector issue. The "fix" lasted 8 hours, until the next deployment restarted the pod and ended the engineer's port-forward session.
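A selector mismatch like the one in Example 2 can be confirmed by comparing the Service's selector with the pod's labels. The sketch below assumes both are available as comma-separated `key=value` strings, which is the format `kubectl get svc -o wide` (SELECTOR column) and `kubectl get pods --show-labels` (LABELS column) print; `selector_matches` is a hypothetical helper name and the label values are made up.

```bash
#!/usr/bin/env bash

# Return 0 if every key=value pair in the selector appears in the pod's labels.
selector_matches() {
  local selector=$1 labels=$2 pair
  local IFS=,
  for pair in $selector; do
    case ",${labels}," in
      *",${pair},"*) ;;   # this pair is present, keep checking
      *) return 1 ;;      # missing pair: the Service will not select this pod
    esac
  done
  return 0
}

# Example with made-up values:
#   selector_matches 'app=payment,version=v2' 'app=payment,version=v1' \
#     || echo "Service selector does not match pod labels"
```

An empty ENDPOINTS column in `kubectl get endpoints payment-service` points at the same problem from the cluster's side.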
War Story
My worst 3am. Fixed a payment outage at 3am with port-forward. Wrote "resolved" in the incident. Went to sleep. 7am: same outage, fresh engineers, same confusion. They couldn't figure out "why it was working at 3am and broken now." My port-forward had been the bridge. I had left the terminal open, gone to sleep, laptop hibernated. Port-forward died. The Ingress was still broken. I hadn't touched the Ingress at all; I'd just bypassed it. New rule for me: I don't close an incident until I can curl through the actual service URL (not localhost:8080). Port-forward is for diagnosing, not fixing.
Cross-References
- Topic Packs: k8s-ops, incident-command
- Footguns: k8s-ops/footguns.md — "Port-forward as a 'fix'"
- Related Patterns: FP-051 (missing escalation — same "incident prematurely closed" pattern), FP-024 (health check lying — another "appears fixed but isn't")