
Comparison: CNI Plugins

Category: Networking
Last reviewed: 2026-03

Verdict (opinionated): Cilium for new clusters — eBPF-based networking is the future, and it comes with observability (Hubble) and policy enforcement built in. Calico for proven stability and the broadest deployment base. Flannel only for simple, small clusters.

Quick Decision Matrix

| Factor | Calico | Cilium | Flannel | AWS VPC CNI |
|---|---|---|---|---|
| Learning curve | Medium | Medium-High | Low | Low (AWS-specific) |
| Operational overhead | Low-Medium | Medium | Low | Low (AWS-managed) |
| Cost at small scale | Free | Free | Free | Free (AWS node cost) |
| Cost at large scale | Free + ops | Free + ops | Free + ops | Free (but IP exhaustion risk) |
| Community/ecosystem | Large (Tigera) | Large (Isovalent/Cisco) | Moderate | AWS-only |
| Hiring | Easy | Growing rapidly | Easy | AWS engineers |
| Network policy | Full (Calico + K8s) | Full (Cilium + K8s) + L7 | None | Security groups for pods |
| Dataplane | iptables / eBPF / VPP | eBPF | VXLAN overlay | AWS ENI |
| Encryption | WireGuard | WireGuard / IPsec | None | VPC encryption |
| Observability | Basic (flow logs via Enterprise) | Hubble (excellent) | None | VPC Flow Logs |
| Performance | Excellent (eBPF mode) | Excellent (eBPF native) | Good (VXLAN overhead) | Excellent (native VPC) |
| Service mesh | None (separate) | Built-in option | None | None |
| BGP support | Yes (native) | Yes | No | No |
| Multi-cluster | Calico Federation | ClusterMesh | No | VPC peering |
| IP management | Calico IPAM or host-local | Cilium IPAM or host-local | host-local | ENI-based (VPC IPs) |

When to Pick Each

Pick Calico when:

  • You want the most battle-tested CNI with the longest production track record
  • BGP peering with physical network infrastructure is required (data center, bare metal)
  • You need network policy enforcement but do not need L7 (HTTP-aware) policies
  • Your nodes run older kernels where eBPF is not fully supported
  • You want multiple dataplane options: iptables (proven), eBPF (newer), or VPP
  • Enterprise support from Tigera is available if needed
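The BGP peering case above is configured declaratively with a Calico `BGPPeer` resource. A minimal, hedged sketch — the peer IP, AS number, and name are illustrative placeholders, not values from this document:

```yaml
# Hypothetical example: peer every node with a top-of-rack router so pod
# routes are advertised into the physical network. Addresses and ASNs
# are made up for illustration.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: rack1-tor
spec:
  peerIP: 192.168.1.1
  asNumber: 64512
```

Applied with `calicoctl apply -f`, this tells Calico's BIRD agent to open a BGP session to that router; per-node or per-rack selectors can scope the peering further.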

Pick Cilium when:

  • You are building a new cluster and want the most modern networking stack
  • eBPF-based networking and security are priorities — kernel-level processing without iptables overhead
  • Hubble observability (network flow visualization, DNS monitoring, HTTP metrics) is valuable
  • You want optional service mesh capabilities without adding another component
  • L7-aware network policies (e.g., allow GET /api/v1/* but deny POST) are needed
  • You plan to use ClusterMesh for multi-cluster networking
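The L7 policy bullet above maps onto a `CiliumNetworkPolicy` with HTTP rules. A hedged sketch — the labels, namespace, port, and path are hypothetical, not from this document:

```yaml
# Hypothetical example: pods labeled app=api accept only GET requests
# matching /api/v1/* from app=frontend; any other method or path on
# that port is denied at L7. All selectors and values are illustrative.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-readonly
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: api
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/v1/.*"
```

Plain Kubernetes NetworkPolicy stops at L3/L4 (IPs and ports); the `rules.http` section is what makes this method- and path-aware.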

Pick Flannel when:

  • You need the simplest possible CNI with minimal configuration
  • Your cluster is small (< 50 nodes) and network policy is not needed
  • You are running K3s, Kind, or a local development cluster
  • You want a VXLAN overlay that "just works" without tuning
  • You do NOT need network policy enforcement (Flannel does not support it)

Pick AWS VPC CNI when:

  • You are on EKS and want pods with native VPC IP addresses
  • Security groups for pods (per-pod network policy via AWS SGs) is your security model
  • Integration with AWS networking (ALB target groups in IP mode, PrivateLink) is important
  • You want the lowest latency — no overlay, no encapsulation, native VPC routing
  • You accept the IP address management constraints (limited IPs per ENI per instance type)

What Nobody Tells You

Calico

  • Calico has three dataplanes: iptables (default, proven), eBPF (newer, requires kernel 5.3+), and VPP (niche, high performance). Most production deployments use iptables, not eBPF. If you want eBPF, Cilium is the more mature choice.
  • At scale (1000+ nodes), iptables rules become a performance bottleneck. Each NetworkPolicy adds iptables rules, and rule evaluation is linear. This is why Calico added eBPF mode.
  • Calico's BIRD BGP agent is powerful for peering with physical routers but is another component to debug. BGP sessions flapping cause route convergence delays that affect pod connectivity.
  • Calico's default IPAM can lead to IP address fragmentation across nodes. Nodes get IP blocks allocated and do not return unused IPs efficiently.
  • The Calico NetworkPolicy spec extends K8s NetworkPolicy with additional features (global policies, DNS-based rules, application-layer policies). This is convenient but creates Calico-specific manifests.
  • Calico Enterprise (Tigera) adds compliance reporting, threat detection, and a management UI. The OSS version is solid but lacks visibility tooling.
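The "Calico-specific manifests" point above is easiest to see with a `GlobalNetworkPolicy` — a cluster-wide, non-namespaced policy that plain Kubernetes NetworkPolicy cannot express. A hedged sketch; the policy name, order, and target CIDR are illustrative:

```yaml
# Hypothetical example: deny egress from every pod to the cloud metadata
# endpoint, cluster-wide. Uses the projectcalico.org/v3 API, so this
# manifest only works on Calico. Values are illustrative.
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: deny-egress-to-metadata
spec:
  order: 10
  selector: all()
  types:
    - Egress
  egress:
    - action: Deny
      destination:
        nets:
          - 169.254.169.254/32
    - action: Allow
```

Convenient, but a cluster full of these cannot be ported to another CNI without rewriting them.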

Cilium

  • Cilium requires Linux kernel 5.10+ for full eBPF functionality. On older kernels, features degrade or fail entirely. Verify your node kernel version before deployment.
  • Cilium's eBPF programs are loaded into the kernel. A bug in a Cilium eBPF program can affect kernel-level networking for all pods on a node. This is rare but the blast radius is larger than userspace proxies.
  • Hubble is Cilium's observability layer and it is genuinely excellent — real-time network flow visualization, DNS query monitoring, HTTP request/response metrics. But Hubble adds resource consumption and storage requirements.
  • Cilium upgrades require restarting the Cilium agent on each node, which briefly disrupts eBPF programs. Pods stay running but new connections may be delayed. Use a rolling upgrade strategy.
  • Cilium's identity-based security model (labels → identity → policy) is more scalable than IP-based policies but requires understanding a new abstraction. "Why was this connection denied?" requires understanding identity resolution.
  • The Isovalent acquisition by Cisco introduced enterprise features behind a license. Core Cilium remains open source, but advanced features (like Tetragon security observability, enterprise Hubble) are commercial.
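The kernel-version check from the first bullet above is worth automating in node validation. A minimal sketch (the 5.10 threshold comes from the text above; the helper names are ours, and real deployments should also check for required kernel config flags, which this ignores):

```python
def parse_kernel(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a `uname -r`-style string, e.g. '5.10.0-21-amd64'."""
    major, minor = release.split(".")[:2]
    # Strip any non-digit suffix a distro may append to the minor component.
    minor = "".join(ch for ch in minor if ch.isdigit()) or "0"
    return int(major), int(minor)

def supports_full_ebpf(release: str, minimum: tuple[int, int] = (5, 10)) -> bool:
    """True when the node kernel meets the minimum version cited above for Cilium."""
    return parse_kernel(release) >= minimum

# Feed this the node's `uname -r` / platform.release() output:
print(supports_full_ebpf("5.10.0-21-amd64"))  # True
print(supports_full_ebpf("4.18.0-477.el8"))   # False — features degrade or fail
```

Tuple comparison handles the version ordering correctly (e.g. 5.4 < 5.10), which naive string comparison would get wrong.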

Flannel

  • Flannel has no network policy support. Zero. If you need any pod-to-pod access control, you need a different CNI or a combined deployment such as Canal (Flannel networking with Calico policy enforcement), which adds its own complexity.
  • VXLAN overlay adds encapsulation overhead (~50 bytes per packet). For latency-sensitive workloads, this matters.
  • Flannel is a mature, stable project that does one thing (overlay networking) well. It is not declining — it is just simple. For environments where that simplicity is valued, it is the right choice.
  • Flannel's lack of features is actually a feature for development and CI clusters. There is nothing to misconfigure.
  • Flannel does not provide any observability into network flows. You are blind to pod-to-pod traffic patterns.
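The first bullet above has a sharp edge: a standard NetworkPolicy is accepted by the API server on a Flannel cluster but silently unenforced, because nothing implements it. A hedged illustration — the name and namespace are placeholders:

```yaml
# A default-deny ingress policy. On Calico or Cilium this isolates the
# namespace; on Flannel it is stored in etcd and has no effect on traffic.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: example
spec:
  podSelector: {}
  policyTypes:
    - Ingress
```

There is no error and no event — which makes a Flannel cluster an easy place to believe you have network segmentation you do not have.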

AWS VPC CNI

  • The VPC CNI allocates ENI (Elastic Network Interface) secondary IPs to pods. Each EC2 instance type has a maximum ENI count and IPs per ENI. On small instances you run out fast: a t3.small allows 3 ENIs × 4 IPs = 12 addresses, but each ENI's primary IP is reserved for the node, leaving an EKS max-pods limit of 11.
  • The WARM_IP_TARGET and MINIMUM_IP_TARGET settings control how many IPs are pre-allocated. Default behavior aggressively pre-allocates IPs, which can exhaust your subnet CIDR faster than expected.
  • VPC CNI's prefix delegation mode (assigning /28 prefixes instead of individual IPs) dramatically increases pod density per node but requires careful subnet planning.
  • Pods with VPC IPs are directly addressable from outside the cluster (within the VPC). This is convenient for ALB IP-mode targeting but means your pods are on the VPC network — no overlay isolation.
  • Custom networking (specifying different subnets for pods vs. nodes) works but the setup is complex and easy to misconfigure.
  • If your EKS nodes span multiple availability zones and subnets, pods in different AZs can communicate but traffic crosses AZ boundaries with associated costs.
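The ENI arithmetic above can be made concrete. EKS derives a node's pod capacity as ENIs × (IPs per ENI − 1) + 2: each ENI's primary IP is reserved for the node, and the +2 covers host-network pods. A sketch (the t3.small figures come from the text above; the m5.large figures are AWS instance limits we are assuming for contrast):

```python
def max_pods(enis: int, ips_per_eni: int) -> int:
    """EKS max-pods formula: each ENI's primary IP is not assignable to pods,
    and +2 accounts for host-network pods (e.g. kube-proxy, aws-node)."""
    return enis * (ips_per_eni - 1) + 2

# t3.small from the text above: 3 ENIs, 4 IPs each
print(max_pods(3, 4))   # 11 — the small-instance squeeze described above

# m5.large for contrast (assumed: 3 ENIs, 10 IPs each)
print(max_pods(3, 10))  # 29
```

Prefix delegation changes this math entirely (each slot carries a /28, i.e. 16 IPs), which is why it dramatically raises pod density — at the cost of the subnet planning the bullet above warns about.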

Migration Pain Assessment

| From → To | Effort | Risk | Timeline |
|---|---|---|---|
| Flannel → Calico | Medium | Medium | 1-2 days (cluster recreation) |
| Flannel → Cilium | Medium | Medium | 1-2 days (cluster recreation) |
| Calico → Cilium | High | High | 1-3 days (rolling or recreation) |
| Cilium → Calico | High | High | 1-3 days (rolling or recreation) |
| VPC CNI → Cilium/Calico | Very High | High | Full cluster rebuild |
| Any → VPC CNI | Very High | High | Full cluster rebuild |

CNI migration usually means cluster recreation. In-place CNI swap is technically possible but extremely risky — if networking breaks mid-migration, every pod loses connectivity. The safest path is standing up a new cluster with the target CNI and migrating workloads.

The Interview Answer

"For new clusters, I lean toward Cilium because eBPF-based networking is more efficient than iptables chains, and Hubble gives you network observability that no other CNI includes. Calico is the safe, proven choice with the longest track record and the broadest deployment base. The CNI decision is one of the most permanent choices you make — changing it later usually means rebuilding the cluster. So choose based on your long-term needs: if you want network policy, observability, and potentially service mesh from the CNI layer, Cilium. If you want proven stability and BGP integration, Calico. If you want simplicity for a dev cluster, Flannel."

Cross-References