Level: L3: Advanced | Topics: Service Mesh | Domain: Kubernetes
Service Mesh Drills¶
Remember: A service mesh adds three capabilities to your cluster: mTLS (mutual authentication and encryption between services), Observability (automatic metrics, traces, and access logs for every request), and Traffic management (retries, timeouts, circuit breaking, canary deploys). Mnemonic: "MOT" (Mutual TLS, Observability, Traffic). The sidecar proxy (Envoy) intercepts all traffic transparently, so no application code changes are needed for these features, although distributed tracing still requires the app to propagate trace headers.
Gotcha: The Istio sidecar needs to be running before your application starts making requests. If your app starts faster than the sidecar, outbound requests fail with "connection refused." Fix: set `holdApplicationUntilProxyStarts: true` in the Istio mesh config (under `defaultConfig`), or have the application container wait for the sidecar's readiness endpoint before it sends traffic. In the classic sidecar model an init container cannot do this waiting, because init containers run to completion before the sidecar container starts.
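As a sketch of where that setting can live, assuming an install managed through an `IstioOperator` resource for the mesh-wide case and a pod-template annotation for the per-workload case. The Deployment name, labels, and image are placeholders, not taken from a real cluster:

```yaml
# Mesh-wide: every injected sidecar holds the app until the proxy is ready
# (assumes an IstioOperator-based install)
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      holdApplicationUntilProxyStarts: true
---
# Per-workload: override through the pod template annotation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app            # placeholder workload name
  namespace: production
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:latest   # placeholder image
```

The annotation form is handy when only a few fast-starting workloads hit the race and a mesh-wide change is not wanted.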
Drill 1: Enable Sidecar Injection¶
Difficulty: Easy
Q: How do you enable automatic Istio sidecar injection for the production namespace?
Answer
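A minimal sketch, assuming the standard `istio-injection=enabled` namespace label (revision-based installs use an `istio.io/rev=<revision>` label instead). The imperative form is `kubectl label namespace production istio-injection=enabled`; the declarative equivalent is:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    istio-injection: enabled   # tells the injection webhook to add the sidecar
```

For workloads that already exist, `kubectl rollout restart deployment -n production` recreates the pods so the webhook can inject them.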
Pods created after labeling get automatic injection. Existing pods need a restart.
Drill 2: Diagnose 503 After Mesh Enable¶
Difficulty: Medium
Q: After enabling Istio, all requests return 503. Pods show 2/2 Ready. App logs show no incoming traffic. What do you check?
Answer
# 1. Check istio-proxy logs
kubectl logs deploy/my-app -n production -c istio-proxy --tail=50
# Look for: "upstream connect error or disconnect/reset before headers"
# 2. Run Istio analysis
istioctl analyze -n production
# 3. Check Service port naming (MOST COMMON CAUSE)
kubectl get svc my-app -n production -o yaml | grep -A5 ports:
# Port name should carry a protocol prefix: http-web, grpc-api, tcp-db
# (or set appProtocol on the port), NOT just "web" or "api"
# 4. Fix
kubectl patch svc my-app -n production --type=json \
-p='[{"op":"replace","path":"/spec/ports/0/name","value":"http-web"}]'
# 5. Check mTLS mode
kubectl get peerauthentication -A
# STRICT mode blocks non-mesh clients
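If step 5 turns up a STRICT policy while some callers still run without a sidecar, one option during migration is to relax the namespace to PERMISSIVE, which accepts both mTLS and plaintext. A sketch, assuming a namespace-wide policy named `default` in `production`:

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default            # assumed name for the namespace-wide policy
  namespace: production
spec:
  mtls:
    mode: PERMISSIVE       # accept both mTLS and plaintext while clients migrate
```

Switch back to STRICT once every client of the namespace has a sidecar.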
Drill 3: Canary Deployment with Traffic Splitting¶
Difficulty: Medium
Q: Route 90% of traffic to v1 and 10% to v2 of my-app using Istio.
Answer
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-app
  namespace: production
spec:
  host: my-app
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
  namespace: production
spec:
  hosts:
    - my-app
  http:
    - route:
        - destination:
            host: my-app
            subset: v1
          weight: 90
        - destination:
            host: my-app
            subset: v2
          weight: 10
Drill 4: Header-Based Routing¶
Difficulty: Medium
Q: Route requests with header x-env: canary to the v2 subset, all other traffic to v1.
Answer
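A minimal sketch, assuming the `v1` and `v2` subsets are already defined in a DestinationRule for `my-app` (as in Drill 3) and that the service lives in `production`:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
  namespace: production
spec:
  hosts:
    - my-app
  http:
    # Specific match: requests carrying x-env: canary go to v2
    - match:
        - headers:
            x-env:
              exact: canary
      route:
        - destination:
            host: my-app
            subset: v2
    # Default route: everything else goes to v1
    - route:
        - destination:
            host: my-app
            subset: v1
```

Applying this under the same name as an existing VirtualService replaces that resource, so merge it with any weighted routes you want to keep.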
Order matters: specific matches first, default route last.
Drill 5: Circuit Breaker¶
Difficulty: Medium
Q: Configure outlier detection to eject endpoints that return 3+ consecutive 5xx errors, checked every 30 seconds, ejected for 1 minute.
Answer
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-app
spec:
  host: my-app
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 60s
      maxEjectionPercent: 50
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
Drill 6: Fault Injection for Testing¶
Difficulty: Easy
Q: Inject a 5-second delay into 10% of requests and return 503 for 5% of requests to test resilience.
Answer
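A minimal sketch, assuming the faults target `my-app` in `production`. The two percentages are evaluated independently, so a request can be delayed, aborted, both, or neither:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
  namespace: production
spec:
  hosts:
    - my-app
  http:
    - fault:
        delay:
          percentage:
            value: 10.0      # 10% of requests
          fixedDelay: 5s     # are held for 5 seconds
        abort:
          percentage:
            value: 5.0       # 5% of requests
          httpStatus: 503    # are answered with 503 by the proxy
      route:
        - destination:
            host: my-app
```

Faults are injected by the client-side sidecar, so the calling pods must be in the mesh for them to take effect.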
Use this to test:
- Timeout handling in calling services
- Retry logic
- Circuit breaker behavior
- User-facing error handling
Drill 7: Debug Proxy Configuration¶
Difficulty: Hard
Q: Traffic to backend-svc returns 404 even though the Service exists. How do you debug the Envoy proxy config?
Answer
# 1. Check proxy sync status
istioctl proxy-status
# Look for SYNCED status. If not synced, config hasn't propagated.
# 2. Check routes in the sidecar
istioctl proxy-config routes deploy/my-app -n production
# Look for backend-svc in the route table
# 3. Check clusters (upstream endpoints)
istioctl proxy-config clusters deploy/my-app -n production | grep backend-svc
# 4. Check endpoints
istioctl proxy-config endpoints deploy/my-app -n production | grep backend-svc
# Are there any endpoints? Are they HEALTHY?
# 5. Check listeners
istioctl proxy-config listeners deploy/my-app -n production
# 6. Full config dump
istioctl proxy-config all deploy/my-app -n production -o json
Drill 8: mTLS Verification¶
Difficulty: Medium
Q: How do you verify that mTLS is actually enabled between two services?
Answer
# 1. Check PeerAuthentication policy
kubectl get peerauthentication -A
# 2. Check what mode is effective for a workload
istioctl x describe pod <pod-name> -n production
# (older releases used "istioctl authn tls-check", which has since been removed)
# 3. Check the proxy config for mTLS settings
istioctl proxy-config clusters deploy/my-app -n production -o json | \
jq '.[] | select(.name | contains("backend-svc")) | .transportSocket'
# 4. Verify with Kiali dashboard (if installed)
# Shows lock icon on edges between services
# 5. Check istio-proxy logs for TLS handshake
kubectl logs deploy/my-app -n production -c istio-proxy | grep -i tls
Drill 9: Retry Configuration¶
Difficulty: Easy
Q: Configure Istio to retry failed requests to backend-svc up to 3 times with a 2-second timeout per attempt.
Answer
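A minimal sketch, assuming `backend-svc` lives in `production` and that retrying on 5xx responses, connection failures, and resets is the desired policy (the question does not specify `retryOn`):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: backend-svc
  namespace: production      # assumed namespace
spec:
  hosts:
    - backend-svc
  http:
    - route:
        - destination:
            host: backend-svc
      retries:
        attempts: 3          # up to 3 retries after the first try
        perTryTimeout: 2s    # each attempt gets 2 seconds
        retryOn: 5xx,connect-failure,reset   # assumed conditions
```

With three retries at 2 seconds each, a caller can wait roughly 8 seconds in the worst case unless an overall `timeout` is also set on the route.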
`retryOn` options:
- `5xx`: retry on 5xx responses
- `gateway-error`: 502, 503, 504
- `connect-failure`: connection failed
- `refused-stream`: REFUSED_STREAM error
- `retriable-4xx`: retry on 409
- `reset`: connection reset
Drill 10: Sidecar Resource for Namespace Isolation¶
Difficulty: Hard
Q: Limit the payment namespace to only communicate with payment and database namespaces. Block all other egress.
Answer
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default
  namespace: payment
spec:
  egress:
    - hosts:
        - "./*"              # Same namespace
        - "database/*"       # Database namespace
        - "istio-system/*"   # Required for mesh function
  outboundTrafficPolicy:
    mode: REGISTRY_ONLY      # Block unknown destinations