# Kubernetes Services & Ingress Footguns

> [!WARNING]
> These will bite you in production. Every item here has caused real incidents.

## 1. Selector Mismatch Between Service and Deployment
The service has `selector: {app: api-server}`. The deployment has `labels: {app: api}`. One character difference. The service creates no endpoints. Traffic goes nowhere. `kubectl get endpoints` shows `<none>`, but you don't think to check because the service and deployment both exist and look fine.
This is the single most common Kubernetes networking bug. It's always the labels.
Fix: Always verify endpoints after creating a service:
```shell
kubectl get endpoints api-server -n production
# If <none> → selectors don't match
kubectl get svc api-server -n production -o jsonpath='{.spec.selector}'
kubectl get pods -n production --show-labels
```
Use `kubectl expose deployment api-server --port=80 --target-port=8000` to auto-generate a service with matching selectors.
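The fix is making the two label sets byte-for-byte identical. A minimal sketch (names, ports, and image are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  selector:
    app: api-server      # must equal the pod template labels below
  ports:
  - port: 80
    targetPort: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server  # one character off here and endpoints go empty
    spec:
      containers:
      - name: api
        image: api-server:latest   # illustrative image
```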
## 2. `externalTrafficPolicy: Local` with Uneven Pod Distribution
You set `externalTrafficPolicy: Local` to preserve client source IPs. You have 3 nodes, but all 5 replicas land on node-1 and node-2. Node-3 has zero pods. The cloud load balancer sends traffic to all three nodes equally. Every request to node-3 is dropped — there's no local pod to receive it.
Even worse: with `externalTrafficPolicy: Local`, kube-proxy serves a dedicated health check endpoint on the service's `healthCheckNodePort`, and nodes with no local pods report unhealthy there. If your LB's health check interval is long, it takes minutes to stop sending traffic to empty nodes.
Fix: Combine `externalTrafficPolicy: Local` with topology spread constraints or pod anti-affinity to keep pods spread evenly, so every node the LB targets has at least one local pod:
```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: api-server
```
## 3. Ingress Without TLS (Plaintext)
You create an Ingress without a TLS section. Traffic flows over HTTP in cleartext. Credentials, tokens, PII — all visible to anyone on the network path. You don't notice because the app works fine.
Fix: Always configure TLS. Use cert-manager for automatic Let's Encrypt certificates:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - api.example.com
    secretName: api-tls
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-server
            port:
              number: 80
```
Also force an HTTPS redirect, so stray plain-HTTP requests get upgraded instead of served in cleartext (the mechanism is controller-specific).
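With the NGINX Ingress controller, the redirect is a pair of annotations (a sketch; note that `ssl-redirect` already defaults to `true` once the Ingress has a TLS section):

```yaml
metadata:
  annotations:
    # Redirect HTTP requests to HTTPS
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # Redirect even when TLS is terminated before the controller
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
```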
## 4. Network Policy Default Deny Without Allowing DNS
You apply a default-deny network policy to lock down the namespace. Every pod in the namespace immediately loses DNS resolution. Services can't be discovered by name. HTTP requests to `api-server:80` fail because the pod can't resolve the DNS name to a ClusterIP.
This is the most common network policy mistake and it's invisible until you try to make an outbound connection.
Fix: Always pair default-deny with a DNS egress rule:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```
Apply this before or simultaneously with the default-deny policy.
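For reference, the default-deny policy this pairs with is just a policy that selects every pod and lists no rules (a sketch):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}    # selects every pod in the namespace
  policyTypes:
  - Ingress
  - Egress           # no ingress/egress rules listed → all traffic denied
```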
## 5. Headless Service with Stateful Apps (No Load Balancing)
You create a headless service (`clusterIP: None`) for your database. Application code connects to `db.production.svc.cluster.local`. DNS returns all pod IPs. Most DNS clients pick the first one and cache it. All traffic goes to a single database pod. The other replicas sit idle.
Fix: Headless services are for StatefulSets where you need to address individual pods (e.g., `postgres-0.db-headless`). For load-balanced access, use a regular ClusterIP service. If you need client-side load balancing across a headless service, your application must implement it (round-robin across all returned IPs, or use a client library that supports it).
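The difference is a single field. A sketch of the two service shapes (names and port are illustrative):

```yaml
# Headless: DNS returns every pod IP; no kube-proxy load balancing.
apiVersion: v1
kind: Service
metadata:
  name: db-headless
spec:
  clusterIP: None        # headless
  selector:
    app: db
  ports:
  - port: 5432
---
# Regular ClusterIP: one virtual IP, connections spread across pods.
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  selector:
    app: db
  ports:
  - port: 5432
```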
## 6. NodePort Range Conflict
You hardcode `nodePort: 30080` in your service spec. Another team does the same thing. The second service creation fails with "provided port is already allocated." Nobody realizes the conflict until deployment day.
Or worse: you use a NodePort that conflicts with a port already in use on the host for something else (monitoring agent, host-level service).
Fix: Let Kubernetes auto-assign NodePorts. If you must pin them, maintain a registry of allocated ports. Better yet, use an Ingress or LoadBalancer service instead of relying on NodePorts.
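Auto-assignment just means omitting the field. A sketch (names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  type: NodePort
  selector:
    app: api-server
  ports:
  - port: 80
    targetPort: 8000
    # no nodePort field → Kubernetes picks a free port
    # from the default 30000-32767 range
```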
## 7. Service Type LoadBalancer Cost (One LB per Service)
Each LoadBalancer service provisions a separate cloud load balancer. At $20-50/month per LB (depending on cloud provider), 20 services = $400-1000/month just for load balancers. Plus each LB is a separate public IP with its own DNS entry.
Fix: Use a single ingress controller (one LoadBalancer service) and route traffic to multiple backend services via Ingress or HTTPRoute rules. This gives you one LB, one IP, and host/path-based routing:
```yaml
# One ingress, many backends
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-server
            port:
              number: 80
  - host: admin.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: admin-app
            port:
              number: 80
```
## 8. `ndots:5` Causing Excessive DNS Lookups
The default `ndots:5` means any hostname with fewer than 5 dots is first expanded through the pod's search domains. Resolving `api.example.com` (2 dots) tries each cluster search domain (`<namespace>.svc.cluster.local`, `svc.cluster.local`, `cluster.local`) and fails before the bare name is queried: at least 4 lookups instead of 1, doubled again when A and AAAA records are requested in parallel.
At scale — thousands of pods making frequent external DNS lookups — this 4x amplification hammers CoreDNS. CoreDNS gets CPU-throttled, DNS latency spikes, and every service in the cluster slows down because service discovery depends on DNS.
Fix: For pods that primarily talk to external services, lower ndots:
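A per-pod `dnsConfig` can override the resolver options (a sketch; the value `2` is illustrative):

```yaml
spec:
  dnsPolicy: ClusterFirst
  dnsConfig:
    options:
    - name: ndots
      value: "2"   # names with 2+ dots are tried as absolute first
```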
Or use fully qualified domain names with a trailing dot: `api.example.com.` bypasses search domain expansion entirely.
For CoreDNS, add the `autopath` plugin to short-circuit the search chain server-side.
## 9. Ingress Path Type Differences
You set `pathType: Prefix` with `path: /api` and expect it to match `/api` and `/api/users`. On some controllers it also matches `/api-docs`, `/api2`, and `/api-anything-else`, because they implement Prefix as a plain string prefix rather than the element-wise, path-segment match the Ingress spec describes.
The exact behavior depends on the ingress controller implementation:
| Controller | Prefix /api matches /api-docs? |
|---|---|
| NGINX Ingress | Yes (character prefix) |
| Traefik | No (path segment boundary) |
| AWS ALB | Depends on configuration |
Fix: Use `pathType: Exact` when you need exact matches. For prefix matching that respects path segments, add a trailing slash (`/api/`) or use regex-based matching (controller-specific). With Gateway API, `PathPrefix` is segment-aware by specification.
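With Gateway API, the same intent is expressed as a segment-aware `PathPrefix` match (a sketch; the route and gateway names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
spec:
  parentRefs:
  - name: example-gateway     # assumed Gateway resource
  rules:
  - matches:
    - path:
        type: PathPrefix      # matches /api and /api/users, not /api-docs
        value: /api
    backendRefs:
    - name: api-server
      port: 80
```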
## 10. Forgetting to Create IngressClass
You apply an Ingress resource with `ingressClassName: nginx`. But no IngressClass named `nginx` exists in the cluster. The ingress controller ignores the resource. No error, no warning — the ingress just silently does nothing.
Or: you have two ingress controllers (nginx and traefik) and no IngressClass is marked as default. Ingresses without `ingressClassName` get processed by neither controller.
Fix: Always verify IngressClasses exist before creating Ingress resources:
```shell
kubectl get ingressclass
# Mark one as default
kubectl annotate ingressclass nginx \
  ingressclass.kubernetes.io/is-default-class=true
```
## 11. Session Affinity with Connection Pooling
You enable `sessionAffinity: ClientIP` to pin users to the same backend. But your reverse proxy or service mesh maintains a connection pool with keep-alive connections. All requests from the proxy come from the same IP. Every user gets pinned to the same pod. Your 10-replica deployment has 1 overloaded pod and 9 idle ones.
Fix: Kubernetes service-level session affinity is source-IP based and doesn't work well behind proxies. If you need session affinity, implement it at the ingress layer with cookie-based affinity:
```yaml
# NGINX Ingress cookie-based affinity
annotations:
  nginx.ingress.kubernetes.io/affinity: "cookie"
  nginx.ingress.kubernetes.io/session-cookie-name: "route"
  nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"
```
This sets a cookie on the client that pins them to a specific backend regardless of source IP.