gRPC - Street-Level Ops¶
Quick Diagnosis Commands¶
# List all services on a gRPC server (server reflection must be enabled)
grpcurl -plaintext localhost:50051 list
# List methods in a service
grpcurl -plaintext localhost:50051 list helloworld.Greeter
# Describe a service (show proto schema)
grpcurl -plaintext localhost:50051 describe helloworld.Greeter
# Describe a message type
grpcurl -plaintext localhost:50051 describe helloworld.HelloRequest
# Call a unary RPC
grpcurl -plaintext -d '{"name": "world"}' localhost:50051 helloworld.Greeter/SayHello
# Call with TLS
grpcurl -d '{"name": "world"}' myserver.example.com:443 helloworld.Greeter/SayHello
# Call with TLS but skip cert verification (dev only)
grpcurl -insecure -d '{"name": "world"}' myserver:443 helloworld.Greeter/SayHello
# Call with metadata (headers)
grpcurl -plaintext \
-H 'authorization: Bearer my-token' \
-H 'x-trace-id: abc123' \
-d '{"name": "world"}' \
localhost:50051 helloworld.Greeter/SayHello
# Call with proto file (if server reflection not enabled)
grpcurl -plaintext -proto ./api.proto \
-d '{"name": "world"}' localhost:50051 helloworld.Greeter/SayHello
# Call with proto import path
grpcurl -plaintext \
-import-path ./proto \
-proto helloworld/helloworld.proto \
-d '{"name": "world"}' localhost:50051 helloworld.Greeter/SayHello
# Stream an RPC (server streaming)
grpcurl -plaintext -d '{"query": "foo"}' localhost:50051 myservice.Search/StreamResults
# Verbose output (see request/response metadata and timing)
grpcurl -plaintext -v -d '{"name": "world"}' localhost:50051 helloworld.Greeter/SayHello
# Health check using grpc_health_probe
grpc_health_probe -addr=localhost:50051
# Health check with TLS
grpc_health_probe -addr=myserver:443 -tls
# Health check for a specific service
grpc_health_probe -addr=localhost:50051 -service=helloworld.Greeter
# Health check with timeout
grpc_health_probe -addr=localhost:50051 -connect-timeout 5s -rpc-timeout 5s
# Kubernetes liveness probe using grpc_health_probe
# In pod spec:
#   livenessProbe:
#     exec:
#       command: ["/bin/grpc_health_probe", "-addr=:50051"]
#     initialDelaySeconds: 5
# Use kubectl port-forward to reach a pod's gRPC port
kubectl port-forward pod/my-grpc-pod 50051:50051 &
grpcurl -plaintext localhost:50051 list
# Check gRPC error code from a failed call
grpcurl -plaintext -d '{}' localhost:50051 myservice.MyService/GetThing 2>&1
# Output will include: "Code: <code_name>" and the error message
# Common gRPC status codes:
# 0 = OK
# 1 = CANCELLED
# 2 = UNKNOWN (server error without specific code)
# 3 = INVALID_ARGUMENT (bad request)
# 4 = DEADLINE_EXCEEDED (timeout)
# 5 = NOT_FOUND
# 7 = PERMISSION_DENIED
# 8 = RESOURCE_EXHAUSTED (rate limited, quota exceeded)
# 12 = UNIMPLEMENTED (method doesn't exist)
# 13 = INTERNAL (server bug)
# 14 = UNAVAILABLE (server down or overloaded)
# 16 = UNAUTHENTICATED (missing or invalid credentials)
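The code list above can double as a triage table: when grepping numeric codes out of logs or metrics, a tiny lookup saves a trip to the spec. A stdlib-only sketch (grpc-go's `codes` package provides the canonical names; the retryable flags here reflect the rules of thumb in this doc, not an official policy):

```go
package main

import "fmt"

// statusName maps numeric gRPC status codes to their canonical names,
// plus whether a blind retry is generally safe per the guidance above.
var statusName = map[int]struct {
	name      string
	retryable bool
}{
	0: {"OK", false}, 1: {"CANCELLED", false}, 2: {"UNKNOWN", false},
	3: {"INVALID_ARGUMENT", false}, 4: {"DEADLINE_EXCEEDED", false},
	5: {"NOT_FOUND", false}, 7: {"PERMISSION_DENIED", false},
	8: {"RESOURCE_EXHAUSTED", true}, 12: {"UNIMPLEMENTED", false},
	13: {"INTERNAL", false}, 14: {"UNAVAILABLE", true},
	16: {"UNAUTHENTICATED", false},
}

func main() {
	for _, c := range []int{14, 13, 4} {
		s := statusName[c]
		fmt.Printf("code=%d name=%s retryable=%v\n", c, s.name, s.retryable)
	}
}
```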
# Capture gRPC traffic for inspection (plaintext only)
tcpdump -i eth0 -w /tmp/grpc.pcap 'tcp port 50051'
tshark -r /tmp/grpc.pcap -Y 'grpc' -T fields -e grpc.message_length -e grpc.status_code
Common Scenarios¶
Scenario 1: Client gets DEADLINE_EXCEEDED — tracing the deadline chain¶
# DEADLINE_EXCEEDED means the client's deadline expired before the server responded.
# The deadline is set by the CALLING code, not the server.
# Check what deadline the client is sending (look for grpc-timeout header)
grpcurl -plaintext -v -d '{}' localhost:50051 myservice.MyService/SlowCall 2>&1 | grep -i timeout
# Or capture it:
tshark -r /tmp/grpc.pcap -Y 'http2.header.name == "grpc-timeout"' \
-T fields -e http2.header.value
# Common pattern: deadline not propagated through service chains
# Service A calls Service B, but doesn't pass the context deadline.
# Service A times out waiting for B, but B keeps running (wasted work).
# In the server logs, look for: context deadline exceeded or context canceled
# If you see both in a trace, the parent canceled and propagated through.
# Verify the deadline is reasonable for the operation:
# grpc-timeout header format: 100m = 100 milliseconds, 5S = 5 seconds, 1M = 1 minute
Scenario 2: Server reflection not enabled — grpcurl can't list services¶
# Error: "Failed to list services: server does not support the reflection API"
# The server needs to register the reflection service.
# In Go: add this to your server setup:
# import "google.golang.org/grpc/reflection"
# reflection.Register(s)
# In Python:
# from grpc_reflection.v1alpha import reflection
# reflection.enable_server_reflection(SERVICE_NAMES, server)
# Workaround without reflection — use proto files:
grpcurl -plaintext \
-import-path /path/to/proto \
-proto myservice.proto \
localhost:50051 list
# Or use a protoset (compiled proto descriptor):
# Generate protoset:
protoc --descriptor_set_out=myservice.protoset \
--include_imports --proto_path=. myservice.proto
# Use with grpcurl:
grpcurl -plaintext -protoset myservice.protoset localhost:50051 list
Scenario 3: Load balancer dropping gRPC connections — HTTP/2 multiplexing issue¶
Under the hood: gRPC uses HTTP/2, which multiplexes many RPCs over a single TCP connection. A Layer 4 load balancer sees one TCP connection and sends all traffic to one backend. This is why gRPC requires Layer 7 (application-aware) load balancing -- Envoy, ALB with gRPC target group, or client-side balancing.
# Symptom: gRPC calls work fine at low load, fail or get stuck under load.
# Root cause: Layer 4 load balancers (TCP/NLB) see one long-lived TCP connection
# per client and can't distribute individual RPC calls across backends.
# All RPCs from one client go to one backend.
# Verify using grpcurl: check which backend you're hitting
grpcurl -plaintext -d '{}' load-balancer:50051 myservice.MyService/GetServerId
# If you always get the same server ID, you're stuck on one backend.
# Solutions:
# 1. Use a Layer 7 gRPC-aware load balancer (Envoy, nginx with grpc_pass,
# GCP/AWS gRPC load balancing).
# 2. Use client-side load balancing (gRPC built-in, requires service discovery).
# 3. Add a headless Kubernetes Service + gRPC client load balancing via DNS.
# Check if your Kubernetes Service is headless (for client-side LB):
kubectl get svc my-grpc-service -o jsonpath='{.spec.clusterIP}'
# "None" = headless (DNS returns all pod IPs, client picks one)
# An IP = regular service (one stable IP, L4 load balancing)
# For Envoy: verify the cluster is configured as http2 not http/1.1
# curl http://envoy-admin:9901/clusters | grep -A5 my-grpc-cluster
Scenario 4: Debugging connection pool exhaustion and UNAVAILABLE errors¶
# UNAVAILABLE is the gRPC status when the server can't accept the connection
# or the connection is dropped mid-stream.
# Check server's connection count (skip the ss header line before counting)
ss -tn 'sport = :50051' | tail -n +2 | wc -l
# Check if server's port is accepting connections
curl -v telnet://localhost:50051 # should connect immediately
# Check for max concurrent streams limit being hit (HTTP/2 setting)
# Default is usually 100 per connection
tshark -r /tmp/grpc.pcap -Y 'http2.type == 4' -V | grep -i 'max_concurrent'
# In Go server, check the MaxConcurrentStreams setting:
# grpc.MaxConcurrentStreams(200)
# Check client connection pool — are clients creating too many connections?
# (run on the server; column 5 is the peer address)
ss -tn 'sport = :50051' | tail -n +2 | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -rn
# If one client IP has 50+ connections, the client is not reusing connections
# Verify keepalive settings (gRPC uses HTTP/2 PING frames to keep connections alive)
# Without gRPC keepalives, idle connections are silently dropped by NATs,
# firewalls, and L4 load balancers (idle timeouts are often 5-60 minutes);
# Linux TCP keepalive (default 2 hours) is off unless the socket opts in
# On the client, look for: keepalive.ClientParameters
# On the server, look for: keepalive.ServerParameters and keepalive.EnforcementPolicy
Key Patterns¶
grpcurl exploration workflow¶
1. Check if server has reflection:
grpcurl -plaintext localhost:50051 list
2. If no reflection, find the proto files or protoset.
3. Explore service structure:
grpcurl -plaintext localhost:50051 list mypackage.MyService
grpcurl -plaintext localhost:50051 describe mypackage.MyService
4. Understand the request message:
grpcurl -plaintext localhost:50051 describe mypackage.MyRequest
5. Make a call with proper JSON:
grpcurl -plaintext -d '{"field": "value"}' localhost:50051 mypackage.MyService/Method
6. Debug with verbose output:
grpcurl -plaintext -v -d '{"field": "value"}' localhost:50051 mypackage.MyService/Method
# Shows: request metadata, response metadata, timing, status code
Reading gRPC error codes in practice¶
gRPC Status Codes — what they mean operationally:
OK (0) — Success. No issue.
CANCELLED (1) — Client canceled. Check client-side timeout or user action.
DEADLINE_EXCEEDED (4) — Client's deadline expired. Check: is the deadline too short?
Is the server too slow? Is there a missing deadline propagation?
NOT_FOUND (5) — Resource doesn't exist. Treat like HTTP 404.
PERMISSION_DENIED (7) — AuthZ failure. Wrong role/policy, not missing auth.
UNAUTHENTICATED (16) — Missing or invalid credentials. Token expired? Wrong audience?
RESOURCE_EXHAUSTED (8) — Rate limit, quota, or server at capacity.
Check: server-side rate limiting, queue depth.
UNAVAILABLE (14) — Server is down or overloaded. Retry with backoff.
Also happens during rolling deployments — build in retry logic.
INTERNAL (13) — Server bug. Check server logs for stack trace.
UNIMPLEMENTED (12) — Method not found. Wrong service, wrong proto version.
For DEADLINE_EXCEEDED: determine who set the deadline (client or upstream).
For UNAVAILABLE: always retry with exponential backoff — it's designed to be retried.
For INTERNAL: never retry automatically — it may not be idempotent.
> **Remember:** PERMISSION_DENIED (7) vs UNAUTHENTICATED (16) -- UNAUTHENTICATED means "who are you?" (missing/invalid token). PERMISSION_DENIED means "I know who you are, but you can't do this" (wrong role). Confusing them leads to debugging auth when the issue is authz, or vice versa.
Interceptors for logging and tracing¶
// Server-side logging interceptor pattern (Go)
import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/status"
)

// loggingInterceptor wraps every unary RPC, recording the method,
// latency, and the gRPC status code of the result.
func loggingInterceptor(ctx context.Context, req interface{},
	info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
	start := time.Now()
	resp, err := handler(ctx, req)
	code := status.Code(err) // codes.OK when err is nil
	log.Printf("method=%s duration=%s code=%s", info.FullMethod, time.Since(start), code)
	return resp, err
}

// Register it:
// grpc.NewServer(grpc.UnaryInterceptor(loggingInterceptor))
// For distributed tracing, use OpenTelemetry gRPC instrumentation:
// go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
// This automatically propagates trace context via gRPC metadata.
Deadline propagation pattern¶
Gotcha: Deadline propagation is the #1 source of resource waste in gRPC service meshes. If Service A sets a 500ms deadline and Service B creates a new
context.Background(), Service C (called by B) runs to completion even after A gave up. Multiply this across thousands of requests per second and you are burning CPU on work nobody is waiting for.
The correct pattern: always pass ctx through, never create a new background context.
BAD:
func (s *Server) GetUser(ctx context.Context, req *pb.GetUserRequest) (*pb.User, error) {
// Creates new context — loses the caller's deadline!
dbCtx := context.Background()
user, err := s.db.GetUser(dbCtx, req.Id)
...
}
GOOD:
func (s *Server) GetUser(ctx context.Context, req *pb.GetUserRequest) (*pb.User, error) {
// Passes caller's context — deadline propagates to the DB call
user, err := s.db.GetUser(ctx, req.Id)
...
}
This is critical in service meshes: if Service A calls B calls C, and A's deadline
is 500ms, that deadline should propagate. If B creates a new context, C gets
an infinite deadline, keeps running after A gives up, and wastes resources.