
Diagnostic Questions

Before revealing the investigation path:

  1. The Envoy sidecar shows "no healthy upstream" for inventory-service, but direct pod-to-pod calls (bypassing the mesh) work fine. What does this tell you about where the problem lies: in the network, the service, or the mesh control plane?

  2. istioctl proxy-status shows EDS as STALE for the order-service proxy. What does stale EDS mean, and what could prevent Istiod from pushing endpoint updates?

  3. Istiod logs show "forbidden: cannot list endpoints in namespace prod." What Kubernetes resource controls this permission, and how would you check if it exists and is correctly configured?

  4. The ClusterRole was deleted by an automated RBAC cleanup controller. Why is the real fix a security-domain change (excluding system roles from cleanup) rather than just recreating the role (a Kubernetes fix) or repairing the mesh (a networking fix)?

  5. How would you design a safe RBAC cleanup process that hardens security without breaking system components like Istio? What safeguards and exclusion mechanisms would you use?
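
As a starting point for question 5, here is a minimal sketch of one possible exclusion mechanism: a dry-run planner that only reports deletion candidates and never deletes anything. It does not talk to a cluster. The `ClusterRoleInfo` type, the `istiod` and `istio-` name prefixes, and the `rbac-cleanup.example.com/exclude` annotation are hypothetical conventions chosen for illustration. The `system:` prefix is the one Kubernetes uses for its built-in roles.

```python
from dataclasses import dataclass, field

# Assumptions for illustration only: apart from "system:", the name prefixes
# and the opt-out annotation key are hypothetical conventions, not
# Istio or Kubernetes APIs.
PROTECTED_PREFIXES = ("system:", "istiod", "istio-")
EXCLUDE_ANNOTATION = "rbac-cleanup.example.com/exclude"


@dataclass
class ClusterRoleInfo:
    name: str
    annotations: dict = field(default_factory=dict)
    binding_count: int = 0  # bindings that still reference this role


def skip_reason(role):
    """Return why a role must be kept, or None if it is a deletion candidate."""
    if role.name.startswith(PROTECTED_PREFIXES):
        return "protected name prefix"
    if role.annotations.get(EXCLUDE_ANNOTATION) == "true":
        return "explicit opt-out annotation"
    if role.binding_count > 0:
        return "still referenced by a binding"
    return None


def plan_cleanup(roles):
    """Dry-run planner: returns (candidates, kept) without deleting anything."""
    candidates, kept = [], {}
    for role in roles:
        reason = skip_reason(role)
        if reason is None:
            candidates.append(role.name)
        else:
            kept[role.name] = reason
    return candidates, kept


roles = [
    ClusterRoleInfo("istiod-clusterrole-istio-system", binding_count=1),
    ClusterRoleInfo("system:node"),
    ClusterRoleInfo("legacy-batch-reader"),
    ClusterRoleInfo("team-a-debug", annotations={EXCLUDE_ANNOTATION: "true"}),
]
candidates, kept = plan_cleanup(roles)
print(candidates)  # ['legacy-batch-reader']
print(kept)
```

The planner layers three independent safeguards (name prefix, explicit opt-out, and "still in use") so that a gap in any one of them does not by itself make a control-plane role a deletion candidate. Because the output is only a plan, it can be reviewed by a human before anything is removed.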