Portal | Level: L1: Foundations | Topics: TCP/IP, DNS, Linux Networking Tools | Domain: Networking
Networking Drills¶
Remember: The network debugging order: DNS -> Routing -> Firewall -> Application. Most connectivity issues are DNS (wrong name, stale cache, CoreDNS down) or firewall (security group, NetworkPolicy, iptables). Mnemonic: "DRFA": rule out name resolution first, then the path, then packet filtering, and only then the application itself.
Gotcha: In Kubernetes, `curl: connection refused` and `curl: connection timed out` mean very different things. Refused = the target host is reachable but nothing listens on that port (check whether the process is running). Timed out = packets are being dropped (check NetworkPolicy, security groups, or routing). Never debug the application when the problem is the network.
Under the hood: Kubernetes Services are implemented by kube-proxy writing iptables/IPVS rules on every node. When you `curl svc-name:port`, the kernel intercepts the packet at the ClusterIP and DNATs it to a randomly chosen backend pod IP. If the endpoints are empty (label mismatch), the connection is refused, not timed out, because kube-proxy installs a REJECT rule that actively answers the connection attempt instead of silently dropping it.
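The two failure modes are easy to reproduce outside a cluster (a sketch; port 9 on localhost and the blackholed TEST-NET address 192.0.2.1 are assumptions, not cluster addresses):

```shell
# Refused: the host answers, but nothing listens -> immediate error (curl exit 7)
curl -sS --max-time 3 http://127.0.0.1:9/ ; echo "exit=$?"

# Timed out: packets go unanswered -> curl waits until --max-time fires (exit 28)
curl -sS --max-time 3 http://192.0.2.1:9/ ; echo "exit=$?"
```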
Drill 1: DNS Resolution¶
Difficulty: Easy
Q: A pod can't resolve backend-svc. Walk through the DNS resolution chain in Kubernetes.
Answer
Pod → /etc/resolv.conf → CoreDNS (kube-dns service at 10.96.0.10)
1. Try: backend-svc.same-namespace.svc.cluster.local
2. Try: backend-svc.svc.cluster.local
3. Try: backend-svc.cluster.local
4. Try: backend-svc (upstream DNS)
# Debug DNS
kubectl exec -it test-pod -- nslookup backend-svc
kubectl exec -it test-pod -- cat /etc/resolv.conf
# Check CoreDNS is running
kubectl get pods -n kube-system -l k8s-app=kube-dns
# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=20
# A trailing dot makes the name fully qualified and skips the search list
# (without it, ndots:5 means even this long name walks the search list first)
backend-svc.production.svc.cluster.local.
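The search-list walk above can be sketched in shell (the `production` namespace and the kubelet's ndots:5 default are assumptions):

```shell
# Emulate the resolver's search-list expansion for a short name.
name="backend-svc"
search="production.svc.cluster.local svc.cluster.local cluster.local"
ndots=5  # kubelet's default in /etc/resolv.conf

dots=$(printf '%s' "$name" | tr -cd '.' | wc -c)
if [ "$dots" -lt "$ndots" ]; then
  for suffix in $search; do
    echo "query: $name.$suffix"   # tried in order until one resolves
  done
fi
echo "query: $name"               # finally, the name as written (upstream DNS)
```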
Drill 2: Service Types¶
Difficulty: Easy
Q: Explain ClusterIP, NodePort, LoadBalancer, and ExternalName. When do you use each?
Answer
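For instance, an ExternalName Service is nothing but a DNS alias (a sketch; the `db.example.com` target is an assumption):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  type: ExternalName
  externalName: db.example.com   # in-cluster lookups of external-db get a CNAME
```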
| Type | Access | Port | Use Case |
|------|--------|------|----------|
| **ClusterIP** | Internal only | ClusterIP:port | Default. Service-to-service. |
| **NodePort** | External via node IP | NodeIP:30000-32767 | Dev, bare-metal without LB |
| **LoadBalancer** | External via cloud LB | LB IP:port | Production external access |
| **ExternalName** | DNS CNAME | N/A | Alias to an external service |

In production: use an Ingress (L7 routing) in front of ClusterIP services, not one LoadBalancer per service.
Drill 3: NetworkPolicy¶
Difficulty: Medium
Q: Write a NetworkPolicy that allows the api pods to receive traffic only from frontend pods on port 8080, and blocks everything else.
Answer
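One way to write it (a sketch; the `app: api` and `app: frontend` labels are assumptions about how the pods are labeled):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend
spec:
  podSelector:
    matchLabels:
      app: api          # the pods being protected
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend  # only these pods may connect
    ports:
    - protocol: TCP
      port: 8080
```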
Key rules:
- If no NetworkPolicy selects a pod, all traffic is allowed (default allow)
- Once any policy selects a pod, everything not explicitly allowed is denied
- `policyTypes: [Ingress]` means only ingress is restricted; egress is still open
- Add `policyTypes: [Ingress, Egress]` with egress rules for full lockdown
Drill 4: Debug Connectivity¶
Difficulty: Medium
Q: Pod A can't reach Pod B on port 8080. Walk through the debugging steps.
Answer
# 1. Verify Pod B is running and has an IP
kubectl get pod pod-b -o wide
# 2. Check the Service endpoints
kubectl get endpoints svc-b
# Empty endpoints = label selector doesn't match pods
# 3. Test from Pod A
kubectl exec pod-a -- curl -v pod-b-svc:8080
kubectl exec pod-a -- nslookup pod-b-svc
# 4. Test direct pod IP (bypass Service)
kubectl exec pod-a -- curl -v <pod-b-ip>:8080
# 5. Check NetworkPolicy
kubectl get networkpolicy -n <ns>
kubectl describe networkpolicy -n <ns>
# 6. Check if the port is actually listening in Pod B
kubectl exec pod-b -- ss -tlnp | grep 8080
# 7. Check container logs
kubectl logs pod-b --tail=20
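Steps 3 and 6 can be rehearsed locally without a cluster (a sketch; port 18080 and python3's built-in HTTP server are stand-ins for Pod B's process):

```shell
# Stand up a throwaway listener, then check it the way you'd check Pod B.
python3 -m http.server 18080 >/dev/null 2>&1 &
srv=$!
sleep 1

ss -tln | grep 18080          # step 6 equivalent: is anything listening?
code=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:18080/)
echo "HTTP $code"             # step 3/4 equivalent: a direct connect works

kill "$srv"
```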
Drill 5: Ingress¶
Difficulty: Medium
Q: Write an Ingress that routes /api to api-svc:8080 and / to frontend-svc:80 with TLS.
Answer
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  # no rewrite annotation here: nginx.ingress.kubernetes.io/rewrite-target: /
  # would rewrite every matched request (including /api/*) to /, breaking the API routes
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-svc
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-svc
            port:
              number: 80
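If the api backend expects paths without the `/api` prefix, the community ingress-nginx controller can strip it with a capture-group rewrite (a sketch of the relevant fragment):

```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2   # $2 = whatever follows /api
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api(/|$)(.*)
        pathType: ImplementationSpecific   # regex paths require this type
        backend:
          service:
            name: api-svc
            port:
              number: 8080
```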
Drill 6: TCP/IP Fundamentals¶
Difficulty: Easy
Q: A service is unreachable. You run curl -v and see "Connection refused" vs "Connection timed out." What's the difference?
Answer
**Connection refused** (RST):
- The host is reachable but nothing is listening on that port
- Got a TCP RST packet back
- Debug: check if the process is running, check the port

**Connection timed out**:
- Packets are being dropped (no response at all)
- Firewall, security group, NetworkPolicy, or bad routing
- Debug: check firewall rules, security groups, NetworkPolicy, routing tables

**Connection reset by peer**:
- Connection was established, then forcibly closed
- Backend crashed, overloaded, or TLS mismatch
- Debug: check backend logs, connection limits
Drill 7: CIDR and Subnetting¶
Difficulty: Medium
Q: How many usable IPs are in a /24? A /16? What CIDR would you use for a subnet with 500 hosts?
Answer
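The host-count arithmetic can be checked in the shell (subtracting 2 for the network and broadcast addresses):

```shell
# Usable hosts for a given prefix length: 2^(32 - prefix) - 2
hosts() { echo $(( (1 << (32 - $1)) - 2 )); }

hosts 24   # 254
hosts 16   # 65534
hosts 23   # 510 -> the smallest prefix that fits 500 hosts
```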
Direct answers:
- `/24` → 256 total, **254 usable** (network and broadcast addresses are reserved)
- `/16` → 65,536 total, **65,534 usable**
- 500 hosts → `/23` (512 total, 510 usable; a `/24` only fits 254)

Common Kubernetes CIDR ranges:
- Pod network: `10.244.0.0/16` (~64K pod IPs)
- Service network: `10.96.0.0/12` (~1M service IPs)
- Node network: `10.0.0.0/16` (VPC)

Quick math: `2^(32-prefix) = total IPs`
- `/24` → `2^8 = 256`
- `/20` → `2^12 = 4096`
Drill 8: kube-proxy Modes¶
Difficulty: Hard
Q: What are the kube-proxy modes and how do they affect Service routing?
Answer
| Mode | How | Performance | Features |
|------|-----|-------------|----------|
| **iptables** (default) | Writes iptables rules per Service/endpoint | Good for <1000 services | Random load balancing |
| **IPVS** | Linux Virtual Server in kernel | Better for >1000 services | Round-robin, least-conn, etc. |
| **nftables** | Modern iptables replacement | Similar to iptables | K8s 1.29+ |

iptables mode with 10K+ services causes slow rule updates and high CPU. Switch to IPVS for large clusters.
Drill 9: Default Deny NetworkPolicy¶
Difficulty: Easy
Q: Write a NetworkPolicy that blocks all ingress and egress for a namespace, then allow DNS egress.
Answer
# Block everything
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}  # matches all pods in the namespace
  policyTypes:
  - Ingress
  - Egress
---
# Allow DNS (required for service discovery)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:  # no "to" clause: port 53 to any destination
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
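The allow-dns policy above permits port 53 to any destination. A tighter variant (a sketch, assuming CoreDNS runs in kube-system with the usual `k8s-app: kube-dns` label) pins it to the cluster DNS pods:

```yaml
egress:
- to:
  - namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: kube-system
    podSelector:
      matchLabels:
        k8s-app: kube-dns   # only the CoreDNS pods
  ports:
  - protocol: UDP
    port: 53
  - protocol: TCP
    port: 53
```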
Drill 10: Headless Service¶
Difficulty: Medium
Q: What is a headless Service? When and why would you use one?
Answer
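A minimal headless Service looks like this (a sketch; the `postgres` names assume a matching StatefulSet):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  clusterIP: None   # headless: DNS returns the pod IPs directly
  selector:
    app: postgres
  ports:
  - port: 5432
```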
A headless Service has `clusterIP: None`. Instead of load-balancing to one pod, DNS returns **all pod IPs**.

Use cases:
- **StatefulSets**: each pod needs a stable DNS name (`postgres-0.postgres.ns.svc`)
- **Client-side load balancing**: the app picks which pod to connect to
- **Service discovery**: the client needs to know all endpoints
Wiki Navigation¶
Prerequisites¶
- Networking Deep Dive (Topic Pack, L1)
Related Content¶
- Networking Deep Dive (Topic Pack, L1) — DNS, Linux Networking Tools, TCP/IP
- Case Study: Duplex Mismatch Symptoms (Case Study, L1) — Linux Networking Tools, TCP/IP
- DHCP & IP Address Management (Topic Pack, L1) — DNS, TCP/IP
- Deep Dive: TCP/IP Deep Dive (deep_dive, L2) — Linux Networking Tools, TCP/IP
- Networking Troubleshooting (Topic Pack, L1) — Linux Networking Tools, TCP/IP
- Skillcheck: Networking Fundamentals (Assessment, L1) — DNS, TCP/IP
- AWS Networking (Topic Pack, L1) — TCP/IP
- AWS Route 53 (Topic Pack, L2) — DNS
- Adversarial Interview Gauntlet (30 sequences) (Scenario, L2) — TCP/IP
- Case Study: API Latency Spike — BGP Route Leak, Fix Is Network ACL (Case Study, L2) — Linux Networking Tools
Pages that link here¶
- ARP Flux / Duplicate IP
- DHCP & IP Address Management
- DHCP & IP Address Management - Primer
- DNS Deep Dive - Primer
- DNS Operations - Primer
- DNS Resolution Taking 5+ Seconds Intermittently
- DNS Split-Horizon Confusion
- Drills
- Duplex Mismatch
- NAT Port Exhaustion / Intermittent Failures
- Networking Deep Dive
- Networking Domain
- Networking Fundamentals
- Networking Troubleshooting
- Networking Troubleshooting - Primer