
Kubernetes Networking - Primer

Why This Matters

Networking is where Kubernetes complexity lives. Every production outage investigation eventually hits a networking layer — a pod that cannot reach a service, DNS resolution failing silently, a network policy blocking traffic nobody expected. Understanding the Kubernetes networking model is not optional for anyone operating clusters.

The Pod Networking Model

Kubernetes imposes three fundamental rules:

  1. Every pod gets its own IP address — no NAT between pods
  2. All pods can communicate with all other pods without NAT (unless NetworkPolicies restrict it)
  3. Agents on a node (kubelet, kube-proxy) can communicate with all pods on that node

This flat network model means any pod can reach any other pod by IP, regardless of which node they are on.

Under the hood: the "flat network" requirement (every pod can reach every other pod without NAT) was a deliberate design decision to avoid the port-mapping complexity of Docker's default bridge networking. The Kubernetes network model document (kubernetes.io/docs/concepts/cluster-administration/networking/) states this requirement explicitly. The CNI plugin is responsible for making it real — typically via VXLAN overlay tunnels, BGP route distribution, or eBPF. Containers within the same pod share a network namespace — they communicate over localhost.
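The shared-namespace point can be demonstrated with a two-container pod. A sketch — pod name, container names, and image tags here are illustrative, not from the original text:

```yaml
# Both containers share one pod IP and network namespace, so the sidecar
# can reach nginx on localhost. Names and image tags are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: shared-netns-demo
spec:
  containers:
    - name: web
      image: nginx:1.27
      ports:
        - containerPort: 80
    - name: sidecar
      image: curlimages/curl:8.8.0
      command: ["sh", "-c", "sleep 5 && curl -s http://localhost:80 && sleep 3600"]
```

Checking the sidecar's logs (kubectl logs shared-netns-demo -c sidecar) should show the nginx welcome page fetched over localhost, with no Service or pod IP involved.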

# See all pod IPs across the cluster
kubectl get pods -A -o wide

# Confirm pod-to-pod connectivity
kubectl exec -it debug-pod -- ping -c 3 10.244.1.15   # -c limits the count so exec returns


# Show network interfaces inside a pod
kubectl exec -it nginx-7d8f5 -- ip addr show

CNI Plugins

The Container Network Interface (CNI) standard delegates network setup to plugins. The kubelet calls the CNI plugin when a pod is created or destroyed.
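Concretely, a CNI network config is a small JSON document (typically under /etc/cni/net.d/ on each node) that the runtime hands to the plugin binary on stdin. A minimal sketch for the reference bridge plugin — the network name, bridge name, and subnet are illustrative:

```json
{
  "cniVersion": "1.0.0",
  "name": "podnet",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.244.1.0/24"
  }
}
```

The "type" field names the plugin executable to invoke; "ipam" delegates address assignment to a second plugin (here host-local, which hands out IPs from the node's pod CIDR).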

Calico — BGP-based routing (overlay or native). Provides NetworkPolicy enforcement.

Cilium — eBPF-based networking, load balancing, and security. Can replace kube-proxy.

Flannel — Simplest option, VXLAN overlay only. No NetworkPolicy support — pair with Calico.

# Check CNI plugin pods
kubectl get pods -n kube-system -l k8s-app=calico-node         # Calico
kubectl -n kube-system exec ds/cilium -- cilium status          # Cilium
kubectl get pods -n kube-system -l app=flannel                  # Flannel

Services

Services provide stable endpoints for ephemeral pods. Four types:

ClusterIP (default) — internal only

apiVersion: v1
kind: Service
metadata:
  name: backend-api
spec:
  type: ClusterIP
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 8080
kubectl get endpoints backend-api
kubectl exec -it debug-pod -- curl http://backend-api.production.svc.cluster.local

NodePort — exposes on every node (30000-32767)

apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: NodePort
  selector:
    app: frontend
  ports:
    - port: 80
      targetPort: 3000
      nodePort: 31080

LoadBalancer — provisions external LB via cloud provider

apiVersion: v1
kind: Service
metadata:
  name: public-api
spec:
  type: LoadBalancer
  selector:
    app: api
  ports:
    - port: 443
      targetPort: 8443

ExternalName — DNS CNAME redirect, no proxying

# No proxying — just a DNS alias
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  type: ExternalName
  externalName: mydb.us-east-1.rds.amazonaws.com

kube-proxy: iptables vs IPVS

kube-proxy runs on every node and programs packet forwarding rules for Services.

Name origin: CNI (Container Network Interface) was created by CoreOS in 2015 as a simpler alternative to Docker's libnetwork/CNM (Container Network Model). The CNCF adopted CNI as its standard. A CNI plugin is just an executable that takes a JSON config on stdin and sets up networking for a container — the simplicity of the interface is what enabled the ecosystem of 30+ plugins.

iptables mode (default): writes DNAT rules that forward ClusterIP traffic to backend pods. Rule updates are O(n) — at ~5,000+ Services, sync latency becomes a problem.
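To spread traffic across backends, kube-proxy chains one "statistic mode random" rule per backend with cascading probabilities: 1/n, then 1/(n-1) of the remainder, and so on, so each backend gets an equal share. A sketch of that arithmetic in plain shell (this computes the probabilities; it does not touch iptables):

```shell
# For n=3 backends, the cascading match probabilities kube-proxy programs
# are 1/3, 1/2, and 1 — each backend ends up with a 1/3 overall share.
n=3
for i in $(seq 0 $((n - 1))); do
  awk -v i="$i" -v n="$n" 'BEGIN { printf "rule %d: probability %.5f\n", i, 1/(n-i) }'
done
```

Because every Service contributes its own chain of rules, the total rule count — and the O(n) full-table sync — is what degrades at thousands of Services.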

iptables -t nat -L KUBE-SERVICES -n | grep backend-api
kubectl logs -n kube-system -l k8s-app=kube-proxy --tail=50

IPVS mode: uses kernel hash tables for O(1) lookups. Better scalability and multiple load-balancing algorithms (rr, lc, wrr, sh).
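A toy sketch of the round-robin ("rr") rotation an IPVS virtual server applies when picking a real server per connection — the backend pod IPs are illustrative:

```shell
# Simulate rr scheduling over three backends: requests cycle through the
# real servers in order, wrapping back to the first.
for req in 0 1 2 3; do
  case $((req % 3)) in
    0) ip=10.244.1.10 ;;
    1) ip=10.244.2.11 ;;
    2) ip=10.244.3.12 ;;
  esac
  echo "request $req -> $ip"
done
```

The other schedulers change only the selection step: lc picks the backend with the fewest active connections, wrr weights the rotation, sh hashes the source address for stickiness.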

# Check current mode
kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode

# View IPVS virtual servers
ipvsadm -Ln

Switch to IPVS by editing the kube-proxy ConfigMap (mode: "ipvs") and restarting:

kubectl rollout restart daemonset kube-proxy -n kube-system
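The relevant fragment of that ConfigMap (its config.conf key holds a KubeProxyConfiguration; the scheduler value shown is one choice among the algorithms listed above):

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"    # or lc, wrr, sh
```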

DNS: CoreDNS and Service Discovery

CoreDNS runs as a Deployment in kube-system and serves DNS records for all Services.

Record      Format                                             Example
Service A   <svc>.<ns>.svc.cluster.local                       backend-api.production.svc.cluster.local
Headless A  <svc>.<ns>.svc.cluster.local (returns pod IPs)     cassandra.default.svc.cluster.local
SRV         _<port>._<proto>.<svc>.<ns>.svc.cluster.local      _http._tcp.backend-api.production.svc.cluster.local
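The names above are assembled mechanically from the Service name and namespace. A sketch, assuming the default cluster domain cluster.local and a named port "http" over TCP (the service and namespace values are illustrative):

```shell
# Build the A-record FQDN for a Service, then the SRV name for its
# named port "http" over TCP.
svc=backend-api
ns=production
fqdn="${svc}.${ns}.svc.cluster.local"
echo "$fqdn"
echo "_http._tcp.${fqdn}"
```

From inside the same namespace, the short name (backend-api) or the partial name (backend-api.production) also resolve, thanks to the search domains in the pod's resolv.conf.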

Headless Services

A headless service (clusterIP: None) returns individual pod IPs via DNS. Essential for StatefulSets.

apiVersion: v1
kind: Service
metadata:
  name: cassandra
spec:
  clusterIP: None
  selector:
    app: cassandra
  ports:
    - port: 9042
# Returns all pod IPs, not a virtual IP
kubectl exec -it debug-pod -- nslookup cassandra.default.svc.cluster.local
# StatefulSet pods are individually addressable: cassandra-0.cassandra.default.svc.cluster.local

DNS troubleshooting

kubectl get pods -n kube-system -l k8s-app=kube-dns          # CoreDNS running?
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50    # Resolution errors?
kubectl exec -it debug-pod -- cat /etc/resolv.conf            # Correct nameserver?
kubectl exec -it debug-pod -- nslookup kubernetes.default     # Basic resolution works?
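For reference, a healthy pod's resolv.conf under the default ClusterFirst DNS policy typically looks like this — the nameserver is the kube-dns Service ClusterIP, which varies by cluster, and the first search domain matches the pod's namespace (values below are illustrative):

```
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```

ndots:5 means any name with fewer than five dots is tried against the search domains first, which is why short names like backend-api resolve within the pod's own namespace.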

NetworkPolicies

By default, all pod-to-pod traffic is allowed. NetworkPolicies define firewall rules using label selectors.

Default deny all ingress

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress

Allow frontend to backend on port 8080

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080

Egress with DNS allowed

Default trap: NetworkPolicies are additive — there is no "deny" rule. A policy with an empty podSelector: {} and policyTypes: [Ingress] selects every pod in its namespace and blocks all ingress to them, except traffic another policy explicitly allows. But if no NetworkPolicy selects a pod at all, all traffic to it is allowed. The mental model: no policy = allow all; any policy selecting a pod = deny all to that pod except what is explicitly permitted. This catches people who expect a default-deny baseline.

Gotcha: NetworkPolicies only work if your CNI plugin supports them. Flannel does not implement NetworkPolicy. If you apply a NetworkPolicy on a Flannel cluster, it is silently ignored — no error, no warning, no enforcement. Calico, Cilium, and Weave all support NetworkPolicies.

When restricting egress, you must explicitly allow DNS (UDP/TCP 53) or service discovery breaks silently:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-egress
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except: [10.0.0.0/8]
      ports:
        - protocol: TCP
          port: 443

Ingress Controllers

Ingress resources define HTTP/HTTPS routing rules. A controller (NGINX, Traefik, etc.) implements them.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: backend-api
                port:
                  number: 80
kubectl get ingress -A
kubectl logs -n ingress-nginx deploy/ingress-nginx-controller --tail=50

Debugging Networking

# 1. Pod basics
kubectl exec -it problem-pod -- ip addr show
kubectl exec -it problem-pod -- ip route

# 2. DNS resolution
kubectl exec -it problem-pod -- nslookup kubernetes.default
kubectl exec -it problem-pod -- cat /etc/resolv.conf

# 3. Service connectivity
kubectl get endpoints target-service
kubectl exec -it problem-pod -- curl -v --connect-timeout 5 http://target-service:80

# 4. Packet capture — ephemeral debug container (K8s 1.23+)
kubectl debug -it problem-pod --image=nicolaka/netshoot --target=app-container -- tcpdump -i eth0 -nn port 8080

Quick reference

Symptom                       Likely cause            Check
Pod cannot reach other pods   CNI plugin issue        CNI pods in kube-system
Service DNS not resolving     CoreDNS down            kubectl get pods -l k8s-app=kube-dns -n kube-system
Reachable by IP, not DNS      Search domain issue     /etc/resolv.conf in the pod
Blocked after policy apply    Policy too restrictive  Check whether DNS egress is allowed
