etcd — Trivia & Interesting Facts
Surprising, historical, and little-known facts about etcd.
etcd was created for CoreOS, not Kubernetes
Brandon Philips and the CoreOS team created etcd in 2013 as the configuration store for CoreOS, a lightweight Linux distribution for containers. When Kubernetes needed a reliable distributed key-value store, its developers chose etcd. That decision made etcd one of the most critical components in the cloud-native ecosystem, far beyond its original purpose.
The name "etcd" comes from the Unix /etc directory plus "distributed"
The name is a combination of the Unix /etc directory (which traditionally stores system configuration) and "d" for distributed. Pronounced "et-see-dee," it describes exactly what it does: distributed system configuration storage.
etcd uses the Raft consensus algorithm — chosen for understandability
etcd uses the Raft consensus protocol, created by Diego Ongaro and John Ousterhout at Stanford in 2013. Raft was explicitly designed to be more understandable than Paxos (the previous standard). The original paper's title was "In Search of an Understandable Consensus Algorithm," and the CoreOS team chose it partly because they could actually understand and debug it.
Every Kubernetes cluster depends on etcd, and most operators don't realize it
etcd stores all Kubernetes cluster state — every pod, service, configmap, and secret. If etcd is lost and there's no backup, the entire cluster state is gone. Despite this critical dependency, many Kubernetes operators have never directly interacted with etcd and don't have backup procedures for it.
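Taking a backup is a single etcdctl command. A minimal sketch — the endpoint, certificate paths, and backup location below are placeholders for a typical kubeadm-style cluster, not universal defaults:

```shell
# Take a point-in-time snapshot of the entire etcd keyspace.
# Endpoint and certificate paths are illustrative; adjust for your cluster.
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot's integrity and size (etcdutl in etcd 3.5+).
etcdutl snapshot status /var/backups/etcd-snapshot.db
```

Restoring from such a snapshot (with `etcdutl snapshot restore`) recreates the data directory; without one, a lost etcd means rebuilding the cluster from scratch.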
etcd has a default storage quota of 2GB — and hitting it freezes the cluster
etcd's default storage quota is 2GB, with a suggested maximum of 8GB. When the quota is exceeded, etcd raises a NOSPACE alarm and rejects further writes, which effectively makes the Kubernetes API server read-only. This has bitten many teams that didn't monitor etcd's database size, especially in clusters with many Events or large ConfigMaps.
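A hedged sketch of raising the quota and recovering from the alarm — the 8 GiB value is the documented suggested maximum, and recovery assumes you have already compacted and defragmented to bring usage back under the quota:

```shell
# Raise the backend quota at startup (8 GiB here; 2 GiB is the default).
etcd --quota-backend-bytes=8589934592

# If the quota is exceeded, etcd raises a NOSPACE alarm and rejects writes.
# After freeing space (compaction + defragmentation), clear the alarm:
etcdctl alarm list
etcdctl alarm disarm
```

Until `alarm disarm` is run, etcd keeps refusing writes even if the database has shrunk back under the quota.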
The etcd leader handles all writes — there is no write distribution
In an etcd cluster, every write operation is forwarded to the leader node. Followers can serve serializable reads (requested with etcdctl's --consistency=s flag or the Go client's WithSerializable option; they may be slightly stale), but linearizable reads and all writes go through the leader. This means scaling an etcd cluster from 3 to 5 nodes doesn't improve write performance; it only improves fault tolerance.
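The two read modes are visible directly from etcdctl; the key name below is illustrative:

```shell
# Linearizable read (the default): confirmed through the Raft leader, never stale.
etcdctl get /registry/pods/default/mypod

# Serializable read: answered locally by whichever member you contact.
# Faster and leader-independent, but it may lag behind committed writes.
etcdctl get --consistency=s /registry/pods/default/mypod
```

Serializable reads are what let followers take load off the leader, at the cost of potentially reading slightly stale data.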
Defragmentation is required or etcd's disk usage grows forever
etcd's storage engine, bbolt, is a copy-on-write B+ tree whose database file never shrinks on its own: deleting keys and compacting old revisions free pages for internal reuse but don't return them to the filesystem. Over time the on-disk database grows even if the logical data size stays constant. Regular defragmentation (etcdctl defrag) is required to reclaim the space, and forgetting it is one of the most common etcd operational issues.
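A typical maintenance pass, sketched with illustrative endpoints (and assuming jq is available for the JSON parsing): compaction first, which discards old MVCC revisions, then defragmentation, which returns the freed pages to the filesystem:

```shell
# 1. Compact history up to the current revision.
rev=$(etcdctl endpoint status --write-out=json | jq -r '.[0].Status.header.revision')
etcdctl compact "$rev"

# 2. Defragment each member in turn. Defrag blocks the member while it runs,
#    so never hit the whole cluster at once.
etcdctl defrag --endpoints=https://10.0.0.1:2379
etcdctl defrag --endpoints=https://10.0.0.2:2379
etcdctl defrag --endpoints=https://10.0.0.3:2379
```

Kubernetes API servers also trigger etcd compaction periodically, but defragmentation is never automatic; it must be scheduled by the operator.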
CoreOS (etcd's creator) was acquired by Red Hat for $250 million
Red Hat acquired CoreOS in January 2018 for approximately $250 million. The acquisition brought etcd, Container Linux, and the Operator Framework under Red Hat's umbrella. IBM subsequently acquired Red Hat for $34 billion in 2019, making etcd (indirectly) an IBM-stewarded project.
Running more than 7 etcd nodes actually hurts performance
The Raft consensus algorithm requires a majority of nodes to agree on every write. With 3 nodes, 2 must agree. With 7 nodes, 4 must agree — increasing latency. The recommended etcd cluster size is 3 or 5 nodes. Going beyond 5 is generally counterproductive unless read-heavy workloads justify it.
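The arithmetic behind the recommendation, as a quick shell sketch:

```shell
# Raft quorum size is floor(n/2) + 1; fault tolerance is n - quorum.
for n in 1 3 5 7 9; do
  quorum=$(( n / 2 + 1 ))
  echo "nodes=$n quorum=$quorum tolerates=$(( n - quorum )) failure(s)"
done
# nodes=3 -> quorum=2, tolerates 1; nodes=5 -> quorum=3, tolerates 2;
# nodes=7 -> quorum=4, tolerates 3 — every write now waits on four acknowledgements.
```

Note that even cluster sizes are avoided entirely: going from 3 to 4 nodes raises the quorum from 2 to 3 without tolerating any additional failures.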
etcd v3 was a complete API rewrite, not a version bump
etcd v3, released in 2016, was a ground-up redesign of the client API and storage engine, not an incremental update. It replaced the v2 REST API with gRPC, moved from a hierarchical keyspace to a flat binary one backed by multi-version concurrency control (MVCC), and replaced v2's TTLs with leases and its polling-based watches with streaming watches. The v2 API was eventually deprecated, and modern Kubernetes supports only v3.
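A taste of the v3 surface through etcdctl; the key names, the 60-second TTL, and the lease ID are all illustrative (the real ID is whatever `lease grant` prints):

```shell
# Flat keyspace: plain put/get instead of v2's directory-style REST API.
etcdctl put /demo/config '{"replicas": 3}'
etcdctl get /demo/config

# Leases replace v2 TTLs: keys attached to a lease vanish when it expires.
etcdctl lease grant 60                              # prints a lease ID, e.g. 694d77aa9e38b42a
etcdctl put --lease=694d77aa9e38b42a /demo/ephemeral "gone in 60s"

# Streaming watches over gRPC: follow every change under a prefix.
etcdctl watch /demo/ --prefix
```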