Comparison: CNI Plugins
Category: Networking
Last reviewed: 2026-03
Verdict (opinionated): Cilium for new clusters — eBPF-based networking is the future, and it comes with observability (Hubble) and policy enforcement built in. Calico for proven stability and the broadest deployment base. Flannel only for simple, small clusters.
Quick Decision Matrix
| Factor | Calico | Cilium | Flannel | AWS VPC CNI |
|---|---|---|---|---|
| Learning curve | Medium | Medium-High | Low | Low (AWS-specific) |
| Operational overhead | Low-Medium | Medium | Low | Low (AWS-managed) |
| Cost at small scale | Free | Free | Free | Free (AWS node cost) |
| Cost at large scale | Free + ops | Free + ops | Free + ops | Free (but IP exhaustion risk) |
| Community/ecosystem | Large (Tigera) | Large (Isovalent/Cisco) | Moderate | AWS-only |
| Hiring | Easy | Growing rapidly | Easy | AWS engineers |
| Network policy | Full (Calico + K8s) | Full (Cilium + K8s) + L7 | None | Security groups for pods |
| Dataplane | iptables / eBPF / Bird | eBPF | VXLAN overlay | AWS ENI |
| Encryption | WireGuard | WireGuard / IPsec | None | VPC encryption |
| Observability | Basic (Flow logs via Enterprise) | Hubble (excellent) | None | VPC Flow Logs |
| Performance | Excellent (eBPF mode) | Excellent (eBPF native) | Good (VXLAN overhead) | Excellent (native VPC) |
| Service mesh | None (separate) | Built-in option | None | None |
| BGP support | Yes (native) | Yes | No | No |
| Multi-cluster | Calico Federation | ClusterMesh | No | VPC peering |
| IP management | IPAM (Calico IPAM or host-local) | IPAM (Cilium IPAM or host-local) | host-local | ENI-based (VPC IPs) |
When to Pick Each
Pick Calico when:
- You want the most battle-tested CNI with the longest production track record
- BGP peering with physical network infrastructure is required (data center, bare metal)
- You need network policy enforcement but do not need L7 (HTTP-aware) policies
- Your nodes run older kernels where eBPF is not fully supported
- You want multiple dataplane options: iptables (proven), eBPF (newer), or VPP
- Enterprise support from Tigera is available if needed
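Calico enforces the standard Kubernetes NetworkPolicy API without extra CRDs for the basic case. A minimal sketch of the kind of policy it enforces (namespace, labels, and port are illustrative placeholders):

```yaml
# Standard Kubernetes NetworkPolicy, enforced by Calico out of the box.
# All names (namespace, labels, port) are placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Note this is L3/L4 only: it controls which pods may reach port 8080, not which HTTP methods or paths they may use.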
Pick Cilium when:
- You are building a new cluster and want the most modern networking stack
- eBPF-based networking and security are priorities — kernel-level processing without iptables overhead
- Hubble observability (network flow visualization, DNS monitoring, HTTP metrics) is valuable
- You want optional service mesh capabilities without adding another component
- L7-aware network policies (e.g., allow GET /api/v1/* but deny POST) are needed
- You plan to use ClusterMesh for multi-cluster networking
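The L7 case above maps to a CiliumNetworkPolicy along these lines (labels, port, and path regex are illustrative placeholders; once an HTTP rule is attached to a port, requests matching no rule, such as POST, are denied):

```yaml
# CiliumNetworkPolicy sketch: L7-aware ingress rule.
# Labels, port, and path are placeholders.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-only
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/v1/.*"
```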
Pick Flannel when:
- You need the simplest possible CNI with minimal configuration
- Your cluster is small (< 50 nodes) and network policy is not needed
- You are running K3s, Kind, or a local development cluster
- You want a VXLAN overlay that "just works" without tuning
- You do NOT need network policy enforcement (Flannel does not support it)
Pick AWS VPC CNI when:
- You are on EKS and want pods with native VPC IP addresses
- Security groups for pods (per-pod network policy via AWS SGs) is your security model
- Integration with AWS networking (ALB target groups in IP mode, PrivateLink) is important
- You want the lowest latency — no overlay, no encapsulation, native VPC routing
- You accept the IP address management constraints (limited IPs per ENI per instance type)
Nobody Tells You
Calico
- Calico has three dataplanes: iptables (default, proven), eBPF (newer, requires kernel 5.3+), and VPP (niche, high performance). Most production deployments use iptables, not eBPF. If you want eBPF, Cilium is the more mature choice.
- At scale (1000+ nodes), iptables rules become a performance bottleneck. Each NetworkPolicy adds iptables rules, and rule evaluation is linear. This is why Calico added eBPF mode.
- Calico's BIRD BGP agent is powerful for peering with physical routers but is another component to debug. BGP sessions flapping cause route convergence delays that affect pod connectivity.
- Calico's default IPAM can lead to IP address fragmentation across nodes. Nodes get IP blocks allocated and do not return unused IPs efficiently.
- The Calico NetworkPolicy spec extends K8s NetworkPolicy with additional features (global policies, DNS-based rules, application-layer policies). This is convenient but creates Calico-specific manifests.
- Calico Enterprise (Tigera) adds compliance reporting, threat detection, and a management UI. The OSS version is solid but lacks visibility tooling.
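As an example of the Calico-specific manifests mentioned above, a GlobalNetworkPolicy applies cluster-wide, which plain Kubernetes NetworkPolicy cannot express. The sketch below is hypothetical (the metadata-endpoint example is illustrative, not a recommended baseline):

```yaml
# Calico GlobalNetworkPolicy sketch: a cluster-wide egress rule.
# This is Calico's v3 API, not portable to other CNIs.
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-egress-to-metadata
spec:
  selector: all()
  types:
    - Egress
  egress:
    # Block pod egress to the cloud instance metadata endpoint...
    - action: Deny
      destination:
        nets:
          - 169.254.169.254/32
    # ...and allow everything else.
    - action: Allow
```

Convenient, but a cluster relying on such policies cannot swap CNIs without rewriting them.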
Cilium
- Cilium requires Linux kernel 5.10+ for full eBPF functionality. On older kernels, features degrade or fail entirely. Verify your node kernel version before deployment.
- Cilium's eBPF programs are loaded into the kernel. A bug in a Cilium eBPF program can affect kernel-level networking for all pods on a node. This is rare but the blast radius is larger than userspace proxies.
- Hubble is Cilium's observability layer and it is genuinely excellent — real-time network flow visualization, DNS query monitoring, HTTP request/response metrics. But Hubble adds resource consumption and storage requirements.
- Cilium upgrades require restarting the Cilium agent on each node, which briefly disrupts eBPF programs. Pods stay running but new connections may be delayed. Use a rolling upgrade strategy.
- Cilium's identity-based security model (labels → identity → policy) is more scalable than IP-based policies but requires understanding a new abstraction. "Why was this connection denied?" requires understanding identity resolution.
- The Isovalent acquisition by Cisco introduced enterprise features behind a license. Core Cilium remains open source, but advanced features (like Tetragon security observability, enterprise Hubble) are commercial.
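Given the kernel requirement above, it is worth gating a Cilium rollout on node kernel versions. A minimal sketch of the version check (the 5.10 threshold comes from the text above; Cilium's exact per-feature kernel matrix varies):

```python
def kernel_supports_full_ebpf(release: str, minimum=(5, 10)) -> bool:
    """Decide whether a `uname -r`-style string (e.g. '5.15.0-91-generic')
    meets the minimum kernel version Cilium needs for full eBPF support."""
    parts = release.split(".")
    major = int(parts[0])
    # Keep only the leading digits of the minor component ('10-arch1' -> '10')
    minor_digits = ""
    for ch in parts[1]:
        if not ch.isdigit():
            break
        minor_digits += ch
    return (major, int(minor_digits)) >= minimum

print(kernel_supports_full_ebpf("5.15.0-91-generic"))      # True
print(kernel_supports_full_ebpf("4.18.0-477.el8.x86_64"))  # False
```

In practice you would feed this the kernel versions reported under each node's `.status.nodeInfo.kernelVersion`.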
Flannel
- Flannel has no network policy support. Zero. If you need any pod-to-pod access control, you need a different CNI or a policy-only add-on like Calico (but that combination adds complexity).
- VXLAN overlay adds encapsulation overhead (~50 bytes per packet). For latency-sensitive workloads, this matters.
- Flannel is a mature, stable project that does one thing (overlay networking) well. It is not declining — it is just simple. For environments where that simplicity is valued, it is the right choice.
- Flannel's lack of features is actually a feature for development and CI clusters. There is nothing to misconfigure.
- Flannel does not provide any observability into network flows. You are blind to pod-to-pod traffic patterns.
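The ~50 bytes of VXLAN overhead mentioned above breaks down as inner Ethernet + VXLAN + UDP + outer IPv4 headers, which is why Flannel's pod interfaces default to MTU 1450 on a standard 1500-byte network:

```python
# Byte accounting for VXLAN encapsulation on an IPv4 underlay.
# The inner Ethernet frame is wrapped in VXLAN + UDP + outer IP headers;
# the outer Ethernet header does not count against the underlay MTU.
INNER_ETH = 14   # encapsulated inner Ethernet header
VXLAN_HDR = 8    # VXLAN header (flags + VNI)
OUTER_UDP = 8    # outer UDP header
OUTER_IPV4 = 20  # outer IPv4 header

overhead = INNER_ETH + VXLAN_HDR + OUTER_UDP + OUTER_IPV4
pod_mtu = 1500 - overhead  # effective MTU inside the overlay

print(overhead, pod_mtu)  # prints: 50 1450
```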
AWS VPC CNI
- The VPC CNI allocates ENI (Elastic Network Interface) secondary IPs to pods. Each EC2 instance type has a maximum ENI count and a maximum number of IPs per ENI, and each ENI's primary IP is reserved for the node itself. On small instances (t3.small: 3 ENIs with 4 IPv4 addresses each, leaving 9 usable pod IPs), you run out fast.
- The `WARM_IP_TARGET` and `MINIMUM_IP_TARGET` settings control how many IPs are pre-allocated. Default behavior aggressively pre-allocates IPs, which can exhaust your subnet CIDR faster than expected.
- VPC CNI's prefix delegation mode (assigning /28 prefixes instead of individual IPs) dramatically increases pod density per node but requires careful subnet planning.
- Pods with VPC IPs are directly addressable from outside the cluster (within the VPC). This is convenient for ALB IP-mode targeting, but it means your pods sit on the VPC network with no overlay isolation.
- Custom networking (specifying different subnets for pods vs. nodes) works but the setup is complex and easy to misconfigure.
- If your EKS nodes span multiple availability zones and subnets, pods in different AZs can communicate, but that traffic crosses AZ boundaries and incurs cross-AZ data transfer charges.
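The per-instance pod limits above follow from the max-pods formula AWS publishes for the default VPC CNI mode: each ENI's primary IP is reserved, plus 2 for host-network pods. A quick sketch (the m5.large figures are the commonly cited ENI limits for that type):

```python
def eks_max_pods(enis: int, ipv4_per_eni: int) -> int:
    """EKS max-pods formula for the default (non-prefix-delegation) VPC CNI:
    each ENI's primary IP is reserved for the node, and 2 is added for
    host-network pods such as kube-proxy and aws-node."""
    return enis * (ipv4_per_eni - 1) + 2

print(eks_max_pods(3, 4))   # t3.small: 3 ENIs x 4 IPs -> 11 pods
print(eks_max_pods(3, 10))  # m5.large: 3 ENIs x 10 IPs -> 29 pods
```

Prefix delegation changes this math dramatically by assigning /28 prefixes (16 addresses) per slot, at which point the node limit is usually capped by kubelet settings rather than by IPs.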
Migration Pain Assessment
| From → To | Effort | Risk | Timeline |
|---|---|---|---|
| Flannel → Calico | Medium | Medium | 1-2 days (cluster recreation) |
| Flannel → Cilium | Medium | Medium | 1-2 days (cluster recreation) |
| Calico → Cilium | High | High | 1-3 days (rolling or recreation) |
| Cilium → Calico | High | High | 1-3 days (rolling or recreation) |
| VPC CNI → Cilium/Calico | Very High | High | Full cluster rebuild |
| Any → VPC CNI | Very High | High | Full cluster rebuild |
CNI migration usually means cluster recreation. In-place CNI swap is technically possible but extremely risky — if networking breaks mid-migration, every pod loses connectivity. The safest path is standing up a new cluster with the target CNI and migrating workloads.
The Interview Answer
"For new clusters, I lean toward Cilium because eBPF-based networking is more efficient than iptables chains, and Hubble gives you network observability that no other CNI includes. Calico is the safe, proven choice with the longest track record and the broadest deployment base. The CNI decision is one of the most permanent choices you make — changing it later usually means rebuilding the cluster. So choose based on your long-term needs: if you want network policy, observability, and potentially service mesh from the CNI layer, Cilium. If you want proven stability and BGP integration, Calico. If you want simplicity for a dev cluster, Flannel."
Cross-References
- Topic Packs: Cilium, K8s Networking, eBPF Observability
- Related Comparisons: Service Meshes, Ingress Controllers