
Load Balancing


20 cards — 🟢 5 easy | 🟡 9 medium | 🔴 6 hard

🟢 Easy (5)

1. What are the two main configuration sections in HAProxy that define how traffic flows?

Answer: Frontend (binds to addresses, applies ACLs and routing rules) and Backend (defines server pools, health checks, and balancing algorithms).

Remember: LB distributes for availability+performance. Key: algorithm + L4 vs L7.

Example: HAProxy, nginx (open-source). Cloud: AWS ALB/NLB, GCP LB.
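
A minimal HAProxy sketch showing both sections (section and server names, and addresses, are illustrative):

```haproxy
# frontend: where traffic enters and how it is routed
frontend web_front
    bind *:80
    default_backend app_servers

# backend: the server pool, balancing algorithm, and health checks
backend app_servers
    balance roundrobin
    server app1 10.0.1.10:8080 check
    server app2 10.0.1.11:8080 check
```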

2. In Nginx, what directive is used to define a pool of backend servers for load balancing?

Answer: The upstream block (e.g., upstream app_backend { server 10.0.1.10:8080; server 10.0.1.11:8080; }).
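
The upstream block from the answer expanded into a fuller sketch, wired to a server block via proxy_pass:

```nginx
upstream app_backend {
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backend;   # requests are balanced across the pool
    }
}
```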

3. What is the difference between roundrobin and leastconn load-balancing algorithms?

Answer: Roundrobin distributes requests evenly across servers in rotation. Leastconn sends each new request to the server with the fewest active connections, which is better for long-running requests.

Remember: Algorithms: Round Robin, Least Connections, IP Hash, Weighted. "RLIW."
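
In HAProxy the choice is a single balance directive in the backend (a sketch; names and addresses are illustrative):

```haproxy
backend api_servers
    # balance roundrobin          # even rotation across servers
    balance leastconn             # fewest active connections wins; good for long requests
    server api1 10.0.1.10:8080 check
    server api2 10.0.1.11:8080 check
```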

4. What is the difference between L4 and L7 load balancing?

Answer: L4 (transport layer) routes based on IP and port — fast, no payload inspection, supports any TCP/UDP protocol. L7 (application layer) routes based on HTTP headers, URLs, cookies — enables content-based routing, header manipulation, and caching but adds latency. Use L4 for raw throughput and non-HTTP protocols; L7 for HTTP-aware routing and observability.

Remember: L4=TCP/fast/dumb. L7=HTTP/smart/slower. L4 for speed, L7 for routing.

5. Why terminate TLS at the load balancer rather than at each backend?

Answer: Centralized certificate management (one place to update certs), reduced CPU load on backends (TLS handshakes are expensive), ability to inspect and route based on HTTP content (impossible with TLS passthrough), and simplified monitoring. The trade-off: traffic between the LB and backends is unencrypted unless you use re-encryption.
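
A minimal HAProxy termination sketch (names and the cert path are illustrative; the .pem bundles certificate and key):

```haproxy
frontend https_front
    bind *:443 ssl crt /etc/haproxy/certs/site.pem   # TLS ends here
    default_backend app_servers                      # plain HTTP to the backends
```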

🟡 Medium (9)

1. In HAProxy, what do the parameters "inter 5s fall 3 rise 2" mean on a server health check line?

Answer: inter 5s means check every 5 seconds. fall 3 means mark the server unhealthy after 3 consecutive failures. rise 2 means mark it healthy again after 2 consecutive successes.

Remember: Active(LB polls) vs passive(monitor responses). Failed=removed from pool.
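
The parameters from the answer in context (backend and server names, and addresses, are illustrative):

```haproxy
backend app_servers
    option httpchk GET /health
    server app1 10.0.1.10:8080 check inter 5s fall 3 rise 2
    server app2 10.0.1.11:8080 check inter 5s fall 3 rise 2
```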

2. How does HAProxy implement session persistence (sticky sessions) using cookies?

Answer: By adding a cookie directive in the backend (cookie SERVERID insert indirect nocache) and assigning each server a cookie value (server app1 ... cookie s1). HAProxy inserts a cookie that ties the client to a specific backend server.

Remember: Sticky sessions pin client→server. Breaks LB but needed for stateful apps.
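
Both directives from the answer assembled (names and addresses illustrative):

```haproxy
backend app_servers
    cookie SERVERID insert indirect nocache
    server app1 10.0.1.10:8080 check cookie s1
    server app2 10.0.1.11:8080 check cookie s2
```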

3. What are the three TLS termination patterns for load balancers, and what is the trade-off of each?

Answer: 1) TLS at LB: simple cert management but backend traffic is unencrypted. 2) TLS passthrough (mode tcp): end-to-end encryption but no L7 routing. 3) TLS re-encryption: end-to-end encryption with L7 routing but double TLS overhead and two cert sets.
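
Pattern 2 as an HAProxy sketch — mode tcp means the proxy never sees HTTP, so no L7 routing (names and addresses illustrative):

```haproxy
frontend tls_in
    mode tcp
    bind *:443
    default_backend tls_servers

backend tls_servers
    mode tcp
    server app1 10.0.1.10:443 check   # TLS terminates on the backend itself
```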

4. How do you configure rate limiting in Nginx using limit_req_zone?

Answer: Define a zone in the http block (limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s) then apply it in a location block (limit_req zone=general burst=20 nodelay). The burst parameter allows temporary spikes; nodelay processes burst requests immediately.
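
The two directives from the answer in their proper contexts (the /api/ location and app_backend upstream are illustrative):

```nginx
http {
    # 10 MB shared zone keyed by client IP, sustained rate 10 req/s
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;

    server {
        location /api/ {
            # allow bursts of 20 above the rate; nodelay serves them immediately
            limit_req zone=general burst=20 nodelay;
            proxy_pass http://app_backend;   # assumed defined elsewhere
        }
    }
}
```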

5. What is the difference between active and passive health checks in load balancers?

Answer: Active health checks probe backends at regular intervals (e.g., HTTP GET /health every 5s). Passive health checks monitor real traffic for failures (e.g., 3 consecutive 502s marks a server down). Active catches issues before users do but adds probe traffic. Passive has zero overhead but only detects failures after users are affected. Best practice: use both together.

Remember: Active(LB polls) vs passive(monitor responses). Failed=removed from pool.
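
In open-source nginx, passive checking is configured per server (active checks need NGINX Plus or another proxy such as HAProxy); a sketch with illustrative values:

```nginx
upstream app_backend {
    # passive: after 3 failed attempts within 30s, skip this server for 30s
    server 10.0.1.10:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
}
```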

6. When should you use ip_hash vs least_conn vs round_robin?

Answer: round_robin: default, works for stateless services with similar backend capacity.
least_conn: best for long-lived connections (WebSockets, gRPC) or uneven request durations.
ip_hash: provides session persistence without cookies but can cause imbalance when clients share NAT. Choose based on whether your app is stateless, has varying request times, or requires session affinity.
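
Each choice is one directive at the top of an upstream block (pool names and addresses illustrative):

```nginx
upstream ws_backend {
    least_conn;              # long-lived WebSocket/gRPC connections
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
}

upstream sticky_backend {
    ip_hash;                 # affinity without cookies; beware shared NAT
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
}
```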

7. What is the difference between active-passive and active-active load balancer setups?

Answer: Active-passive: one LB handles traffic, the standby takes over on failure via keepalived/VRRP. Simple but wastes standby resources.
Active-active: multiple LBs share traffic (via DNS round-robin or anycast). Better resource utilization and throughput but requires shared state for sticky sessions and more complex health checking. Active-active is preferred for high-traffic production systems.
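
An active-passive sketch with keepalived — a virtual IP floats to the standby on failure (VIP, interface, router ID, and priorities are illustrative):

```conf
vrrp_instance VI_1 {
    state MASTER              # BACKUP on the standby node
    interface eth0
    virtual_router_id 51
    priority 100              # standby uses a lower value, e.g. 90
    virtual_ipaddress {
        192.0.2.10            # the VIP that clients connect to
    }
}
```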

8. Why is consistent hashing preferred over round-robin for cache-backed services?

Answer: Consistent hashing routes the same key to the same backend, preserving cache locality. Round-robin spreads requests evenly but destroys cache hit rates. When a node is added or removed, consistent hashing only remaps ~1/N of keys instead of all of them.
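
In nginx this is a single directive (pool name and addresses illustrative); HAProxy gets the same effect with balance uri plus hash-type consistent:

```nginx
upstream cache_backend {
    hash $request_uri consistent;   # same URI -> same backend; ~1/N keys remap on pool changes
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
}
```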

9. How does HAProxy's maxconn protect backends from overload?

Answer: maxconn on a server line caps concurrent connections to that backend. Excess connections queue in HAProxy until a slot opens, with a configurable timeout (timeout queue). This prevents slow backends from being overwhelmed while the proxy absorbs bursts.
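
Both knobs from the answer together (names, addresses, and limits are illustrative):

```haproxy
backend app_servers
    timeout queue 30s                             # reject requests queued longer than this
    server app1 10.0.1.10:8080 check maxconn 200
    server app2 10.0.1.11:8080 check maxconn 200
```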

🔴 Hard (6)

1. How do you gracefully drain a backend server in HAProxy without dropping active connections?

Answer: Use the runtime API via the stats socket: echo "set server app_servers/app1 state drain" | socat stdio /var/run/haproxy.sock. This stops new connections while allowing existing ones to complete. Use "state maint" for full disable and "state ready" to re-enable.

2. How can HAProxy stick tables be used for rate limiting, and what happens when the limit is exceeded?

Answer: Define a stick-table on the frontend (stick-table type ip size 100k expire 30s store http_req_rate(10s)), track source IPs (http-request track-sc0 src), and deny requests exceeding the threshold (http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }). Clients exceeding the rate get HTTP 429 responses.
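
The three directives from the answer assembled in a frontend (frontend/backend names illustrative):

```haproxy
frontend web_front
    bind *:80
    # per-IP request-rate counters over a 10s window, entries expire after 30s idle
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
    default_backend app_servers
```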

3. How do you implement a canary deployment using HAProxy weight shifting via the runtime API?

Answer: Start with blue at weight 100 and green at weight 0. Gradually shift traffic using the stats socket: "set weight app_servers/green 10", then 50, then 100, and finally "set weight app_servers/blue 0". This progressively routes traffic to the new version without config reload.
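
The full sequence as stats-socket commands (backend/server names and the socket path are illustrative and require a running HAProxy):

```shell
echo "set weight app_servers/green 10"  | socat stdio /var/run/haproxy.sock
# ...watch error rates and latency before each step, then continue:
echo "set weight app_servers/green 50"  | socat stdio /var/run/haproxy.sock
echo "set weight app_servers/green 100" | socat stdio /var/run/haproxy.sock
echo "set weight app_servers/blue 0"    | socat stdio /var/run/haproxy.sock
```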

4. What is connection draining and why is it critical during deployments?

Answer: Connection draining allows in-flight requests to complete before removing a backend from the pool. Without it, active connections are terminated mid-request, causing errors. In HAProxy, set server state to drain. In Kubernetes, the terminationGracePeriodSeconds and preStop hooks serve this purpose. Typical drain timeout: 30-60 seconds.
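
A Kubernetes sketch of the hooks the answer mentions (values illustrative; the sleep keeps the pod serving while endpoints update and the LB stops routing to it):

```yaml
spec:
  terminationGracePeriodSeconds: 60
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "15"]   # delay SIGTERM so in-flight requests drain
```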

5. What are the trade-offs of different session persistence methods?

Answer: Cookie-based: most reliable, works through NAT, but requires L7 and adds cookie overhead.
IP-hash: no cookie needed, but clients behind NAT share the same backend, causing imbalance.
URL parameter: application-specific, adds complexity.
No persistence: best for stateless services — simplest, most even distribution. Always prefer stateless architectures when possible to avoid persistence complexity entirely.

Remember: Sticky sessions pin client→server. Breaks LB but needed for stateful apps.

6. What problem does the PROXY protocol solve and how does it work?

Answer: When a Layer 4 proxy terminates TCP, the backend loses the real client IP. PROXY protocol prepends a one-line header (v1 text or v2 binary) with client IP/port before the first data byte. Both proxy and backend must enable it — a mismatch breaks the connection.
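
Both sides must agree; a sketch with HAProxy sending and an nginx backend receiving (ports and the trusted range are illustrative):

```nginx
# HAProxy side: add send-proxy to the server line, e.g.
#   server app1 10.0.1.10:8080 check send-proxy

# nginx side: accept the header and restore the real client IP
server {
    listen 8080 proxy_protocol;
    set_real_ip_from 10.0.0.0/8;     # only trust the LB's address range
    real_ip_header proxy_protocol;
}
```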
