Message Queues — Trivia & Interesting Facts
1. AMQP was born from banking politics. The Advanced Message Queuing Protocol (AMQP) was created in 2003 by JPMorgan Chase's John O'Hara, motivated by frustration with proprietary, expensive messaging middleware. Financial institutions needed a standard open wire protocol so they could interoperate without vendor lock-in. The 0-9-1 version of AMQP became the protocol RabbitMQ implements. AMQP 1.0 (a complete redesign) was standardized by OASIS in 2012 and is used by Azure Service Bus and Apache Qpid — but is deliberately incompatible with 0-9-1.
2. Kafka is named after Franz Kafka — intentionally, and somewhat grimly. Jay Kreps, one of Kafka's co-creators at LinkedIn, named the system after the author because "it is a system optimized for writing," so a writer's name seemed fitting. And the literary namesake's work (bureaucratic labyrinths, inexplicable processes, overwhelming systems) is arguably an apt metaphor for enterprise messaging infrastructure.
3. Kafka was built to solve LinkedIn's own data pipeline collapse. By 2010, LinkedIn's activity data pipeline was a patchwork of point-to-point integrations between dozens of services, creating a maintenance nightmare. Jay Kreps, Neha Narkhede, and Jun Rao designed Kafka internally as a unified log that all services could write to and read from. LinkedIn open-sourced it in 2011 and donated it to the Apache Software Foundation. The original goal was not "message queue" but "activity stream" infrastructure — the realization that it could replace queues came later.
4. RabbitMQ chose Erlang for the same reason telecom companies did. Ericsson designed Erlang in the late 1980s specifically for building fault-tolerant, concurrent telecommunications switching systems. Erlang processes are extremely lightweight (a single node can run millions of them), isolated (no shared memory), and supervised (a crashed process is automatically restarted by its supervisor). When Alexis Richardson and the team at Rabbit Technologies built RabbitMQ in 2007, they chose Erlang because those properties — lightweight concurrency, isolation, supervision trees — are exactly what a high-availability message broker needs. The bet paid off: RabbitMQ nodes routinely handle hundreds of thousands of concurrent connections on commodity hardware.
5. ZeroMQ is not a message broker — it has no broker at all. Despite the name, ZeroMQ (ØMQ) deliberately has no message broker. It is a socket library that builds messaging patterns (pub-sub, push-pull, req-rep) directly into the transport. The "zero" in the name stands for zero broker, zero latency overhead, and zero cost. Messages go peer-to-peer. The philosophy, articulated by creator Pieter Hintjens, was that brokers are a single point of failure and a bottleneck — the network itself should be the broker. ZeroMQ is used heavily in scientific computing (Jupyter uses it for kernel communication), trading systems, and anywhere microsecond-level latency matters more than durability.
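The brokerless req-rep pattern can be sketched with pyzmq (assuming it is installed); using the in-process `inproc` transport keeps both peers inside one script, which makes it visible that no broker process sits in between. The endpoint name `inproc://demo` is arbitrary.

```python
# A minimal brokerless request/reply pair with pyzmq: the REQ and REP
# sockets talk directly to each other over an in-process transport.
import threading
import zmq

ctx = zmq.Context()
rep = ctx.socket(zmq.REP)
rep.bind("inproc://demo")          # inproc endpoints must bind before peers connect

results = []

def client():
    req = ctx.socket(zmq.REQ)
    req.connect("inproc://demo")   # connects straight to the peer -- no broker
    req.send_string("ping")
    results.append(req.recv_string())
    req.close()

t = threading.Thread(target=client)
t.start()
request = rep.recv_string()        # the request arrives peer-to-peer
rep.send_string(request.upper())   # the reply travels back the same direct path
t.join()
rep.close()
ctx.term()
```

The same socket objects would work unchanged over `tcp://` endpoints between separate processes; only the endpoint string changes.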
6. NATS was written over a weekend. The original version of NATS (Neural Autonomic Transport System), created by Derek Collison at Apcera in 2011, was written in Ruby over a weekend. The Go rewrite (gnatsd) is famously lean: the core server is a few thousand lines of Go with minimal external dependencies. NATS's design philosophy is "do less, do it fast" — it does not guarantee delivery by default (fire-and-forget), core NATS has no persistence, and a single NATS server can handle millions of messages per second. NATS JetStream (added 2021) adds persistence and at-least-once delivery for those who need it.
7. Kafka log compaction was directly inspired by database write-ahead logs. The WAL (write-ahead log) in PostgreSQL, MySQL, and other databases is an append-only log of every change, used for crash recovery and replication. Jay Kreps realized that a database's WAL is essentially a log-compacted event stream: it records every mutation, and a snapshot at any point can be reconstructed by replaying from a checkpoint. Kafka's log compaction feature (introduced in 0.8.1) formalized this insight: retain only the latest value per key, just as a database's state is the result of applying all WAL entries. This makes Kafka suitable for state bootstrapping — consumers can replay a compacted topic to reconstruct current state without storing a full event history.
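The "latest value per key" semantics can be shown in a few lines of Python. This is a toy sketch of the idea, not Kafka's implementation: given an append-only log of (key, value) records, compaction drops every record except the last one for each key, and replaying the compacted log yields the same final state as replaying the full log.

```python
# Toy log-compaction sketch: keep only the most recent record per key,
# preserving the relative order of the surviving records.

def compact(log):
    """Return the compacted log: the latest record per key, in log order."""
    latest = {}                                   # key -> index of its last occurrence
    for i, (key, _) in enumerate(log):
        latest[key] = i
    return [rec for i, rec in enumerate(log) if latest[rec[0]] == i]

# An uncompacted change log for two keys; None models a delete (tombstone).
log = [("user:1", "alice"), ("user:2", "bob"),
       ("user:1", "alice-v2"), ("user:2", None)]

compacted = compact(log)
# Replaying the compacted log reconstructs the same final state as the full log.
state = {k: v for k, v in compacted if v is not None}
```

This is why a compacted topic works for state bootstrapping: a new consumer replays `compacted` instead of `log` and arrives at the identical `state`.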
8. Amazon SQS was one of the first AWS services — predating EC2. Amazon Simple Queue Service launched in beta in November 2004, making it one of the earliest AWS services; its public beta predated the launches of S3 (March 2006) and EC2 (August 2006). SQS was born from Amazon's own internal need to decouple the many services in their monolith-to-microservices transition. The distributed queue design — no single broker, messages stored redundantly across multiple servers — was a direct response to the reliability problems Amazon experienced with their internal messaging infrastructure in the early 2000s. SQS remains one of the most operationally simple message queues available: no servers to provision, automatic scaling, and at-least-once delivery with a visibility timeout model.
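The visibility timeout model can be captured in a toy in-memory queue (a sketch of the semantics, not SQS or boto3): receiving a message hides it for a window instead of removing it, an explicit delete is the ack, and an unacked message reappears after the window expires — which is exactly where at-least-once delivery comes from.

```python
# Toy model of SQS-style visibility-timeout semantics.

class ToyQueue:
    def __init__(self, visibility=30.0):
        self.visibility = visibility
        self.messages = {}                 # id -> (body, invisible_until)
        self.next_id = 0

    def send(self, body):
        self.messages[self.next_id] = (body, 0.0)
        self.next_id += 1

    def receive(self, now):
        for mid, (body, invisible_until) in self.messages.items():
            if invisible_until <= now:
                # Hide rather than remove: if the consumer crashes before
                # delete(), the message becomes receivable again later.
                self.messages[mid] = (body, now + self.visibility)
                return mid, body
        return None

    def delete(self, mid):
        self.messages.pop(mid, None)       # the explicit ack

q = ToyQueue(visibility=30.0)
q.send("job-1")
mid, body = q.receive(now=0.0)             # invisible until t=30
assert q.receive(now=10.0) is None         # a second consumer sees nothing yet
redelivered = q.receive(now=31.0)          # timeout expired -> redelivered
q.delete(mid)                              # ack removes it for good
```

The redelivery at `now=31.0` is the at-least-once guarantee in miniature: consumers must either delete within the visibility window or be prepared to process duplicates.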
9. Redis Streams were added specifically because LPUSH/BRPOP wasn't good enough. Redis had been used as a makeshift message queue via LPUSH/BRPOP (push to list, block-pop from list) for years. It worked but had a fatal flaw: once a message was popped, it was gone. There was no consumer group concept, no replay, no acknowledgment. Salvatore Sanfilippo (antirez), Redis's creator, added Redis Streams in Redis 5.0 (2018) as a proper first-class streaming data structure modeled on Kafka's log. Streams support consumer groups, pending message tracking, acknowledgment, and XREADGROUP for cooperative consumption. They remain stored in memory (with optional AOF/RDB persistence), making them suitable for high-throughput, low-latency workloads where brief message loss on hard crash is acceptable.
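The difference from LPUSH/BRPOP is the pending-entries bookkeeping, which can be modeled in a few lines of plain Python. This is a toy in-memory sketch, not Redis or redis-py — the method names merely echo the real XADD/XREADGROUP/XACK commands: a delivered entry stays tracked as pending until it is acknowledged, so a crashed consumer's work is recoverable instead of simply gone.

```python
# Toy model of Redis Stream consumer-group semantics: delivery marks an
# entry as pending for a named consumer; XACK-style acknowledgment clears it.

class ToyStream:
    def __init__(self):
        self.entries = []                  # (entry_id, payload) in arrival order
        self.groups = {}                   # group -> {"cursor": int, "pending": {}}

    def xadd(self, payload):
        entry_id = len(self.entries)
        self.entries.append((entry_id, payload))
        return entry_id

    def xgroup_create(self, group):
        self.groups[group] = {"cursor": 0, "pending": {}}

    def xreadgroup(self, group, consumer):
        g = self.groups[group]
        if g["cursor"] >= len(self.entries):
            return None
        entry = self.entries[g["cursor"]]
        g["cursor"] += 1
        g["pending"][entry[0]] = consumer  # tracked until acknowledged
        return entry

    def xack(self, group, entry_id):
        self.groups[group]["pending"].pop(entry_id, None)

s = ToyStream()
s.xgroup_create("workers")
s.xadd({"task": "resize"})
entry = s.xreadgroup("workers", "worker-1")
pending_before = dict(s.groups["workers"]["pending"])   # unacked, so recoverable
s.xack("workers", entry[0])
pending_after = dict(s.groups["workers"]["pending"])
```

With BRPOP, the equivalent of `pending_before` simply does not exist — the pop removes the message, and a crash between pop and processing loses it.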
10. Apache Pulsar was built at Yahoo to handle multi-tenancy across geographies. Yahoo built Pulsar internally starting around 2012 to replace their existing messaging infrastructure, which could not serve multiple teams (tenants) with strong isolation and could not span geographic regions without complex custom sharding. Pulsar's distinguishing architectural decision is the separation of serving (brokers) from storage (Apache BookKeeper). Brokers are stateless — they can be swapped out without data migration. BookKeeper handles durable storage in a distributed ledger. This architecture enables true multi-tenancy with namespace-level isolation, geographic replication built in from the start, and independent scaling of compute and storage. Yahoo open-sourced Pulsar in 2016; it became an Apache top-level project in 2018.
Related Resources
- Primer: Message Queues Primer
- Footguns: Message Queues Footguns
- Street Ops: Message Queues Street Ops