Solution

Triage

  1. Check the Service and its selector:
    kubectl describe svc user-service -n prod
    
  2. Check the endpoints:
    kubectl get endpoints user-service -n prod
    
  3. List pods with their labels:
    kubectl get pods -n prod -l app=user-service --show-labels
    kubectl get pods -n prod --show-labels | grep user
    
  4. Compare the selector label key/value with the actual pod labels.
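Step 4 can be mechanized. The sketch below is illustrative (not part of the incident tooling) and shows how Kubernetes equality-based selector matching decides a match: every selector key/value pair must equal the pod's label exactly.

```python
def diff_selector(selector: dict, pod_labels: dict) -> list:
    """Report why an equality-based selector does not match a pod's labels.
    Returns an empty list when the selector matches."""
    problems = []
    for key, want in selector.items():
        got = pod_labels.get(key)
        if got is None:
            problems.append(f"pod is missing label key {key!r}")
        elif got != want:
            problems.append(f"label {key!r}: selector wants {want!r}, pod has {got!r}")
    return problems

# The mismatch from this incident:
print(diff_selector({"app": "user-svc"}, {"app": "user-service"}))
# → ["label 'app': selector wants 'user-svc', pod has 'user-service'"]
```

Feeding in the selector from `kubectl describe svc` and the labels from `--show-labels` makes the mismatch explicit instead of relying on eyeballing.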

Root Cause

The Service selector uses app: user-svc but the pods have the label app: user-service. This mismatch occurred when the junior engineer recreated the Service manifest from memory and used a shortened label value. Since no pods match the selector, the Endpoints object has empty subsets, and the Service has nothing to route traffic to.

The Service itself is valid (it has a ClusterIP and a configured port), but with no backends there is nothing to route to: kube-proxy in iptables mode installs a REJECT rule for services with no endpoints, so connections to the ClusterIP are refused immediately.
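The failure mode can be sketched as the endpoints controller's selection step. Pod names below are hypothetical; the matching rule is Kubernetes equality-based selection.

```python
def select_endpoints(selector, pods):
    """Simplified endpoints controller: keep pods whose labels satisfy
    every key/value pair in the selector."""
    return [name for name, labels in pods
            if all(labels.get(k) == v for k, v in selector.items())]

pods = [("user-service-abc", {"app": "user-service"}),
        ("user-service-def", {"app": "user-service"})]

print(select_endpoints({"app": "user-svc"}, pods))      # → [] (empty subsets)
print(select_endpoints({"app": "user-service"}, pods))  # both pods selected
```

With the shortened selector value, the selected set is empty, which is exactly the empty `subsets` seen on the Endpoints object.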

Fix

  1. Update the Service selector to match the pod labels:

    kubectl patch svc user-service -n prod -p '{"spec":{"selector":{"app":"user-service"}}}'
    
    Or edit the manifest and apply:
    spec:
      selector:
        app: user-service  # was: user-svc
    

  2. Verify endpoints are now populated:

    kubectl get endpoints user-service -n prod
    
    The output should now list pod IP:port pairs in the ENDPOINTS column.

  3. Test connectivity:

    kubectl run test --rm -it --restart=Never --image=busybox -- wget -qO- http://user-service.prod.svc.cluster.local:8080/health
    

Rollback / Safety

  • Changing a Service selector is non-disruptive; it takes effect immediately.
  • If the wrong pods are selected, traffic could be routed to the wrong backend. Verify pod labels are unique per service.
  • Update the source manifest (Helm chart, Kustomize, etc.) to prevent the fix from being overwritten on next deploy.
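The "wrong pods selected" risk above can be checked mechanically. This is an illustrative sketch with hypothetical service and pod names: given each Service's selector, flag any pod matched by more than one Service.

```python
def overlapping_selections(services, pods):
    """Return pods that are selected by more than one Service.
    services: {service_name: selector dict}; pods: {pod_name: labels dict}."""
    owners = {}
    for svc, selector in services.items():
        for name, labels in pods.items():
            if all(labels.get(k) == v for k, v in selector.items()):
                owners.setdefault(name, []).append(svc)
    return {pod: svcs for pod, svcs in owners.items() if len(svcs) > 1}

services = {"user-service": {"app": "user-service"},
            "user-admin":  {"app": "user-service"}}   # overly broad selector
pods = {"user-service-abc": {"app": "user-service"}}

print(overlapping_selections(services, pods))
# → {'user-service-abc': ['user-service', 'user-admin']}
```

An empty result means each pod is owned by at most one Service, which is what you want before and after the selector change.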

Common Traps

  • Checking only the Service, not the Endpoints. Always check kubectl get endpoints (or kubectl get endpointslices on clusters where EndpointSlices have replaced Endpoints) first when debugging service connectivity.
  • Assuming pods are the problem when they show Running/Ready. If pods are healthy, the issue is almost always in the Service selector or readiness probes.
  • Label key typos vs. value typos. Both cause mismatches, and labels are case-sensitive: a selector of App: user-service (capital A) does not match a pod label of app: user-service.
  • Multiple labels in the selector. Selector matching is AND logic: every label key-value pair must match, so a single mismatch on any one of them means no pods are selected.
  • Not checking targetPort. Even with correct selectors, if targetPort does not match the container's listening port, connections will be refused despite endpoints being populated.
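The AND-logic and case-sensitivity traps above follow directly from the equality-based matching rule; a minimal sketch:

```python
def matches(selector, labels):
    # Equality-based selector: every key/value pair must match exactly (AND logic)
    return all(labels.get(k) == v for k, v in selector.items())

pod = {"app": "user-service", "tier": "backend"}

print(matches({"app": "user-service", "tier": "backend"}, pod))   # → True
print(matches({"app": "user-service", "tier": "frontend"}, pod))  # → False (one mismatch fails all)
print(matches({"App": "user-service"}, pod))                      # → False (keys are case-sensitive)
```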