Mental Model: Idempotency
Category: Architecture & Design
Origin: Mathematics (Peirce, 1867); applied to distributed systems design throughout the 1970s–2000s; formalized in REST API design by Roy Fielding (2000)
One-liner: An operation is idempotent when applying it multiple times produces the same result as applying it once, which makes it safe to retry without additional side effects.
The Model
Idempotency is a mathematical property: f(f(x)) = f(x). In software, it means that an operation can be executed repeatedly and, after the first execution, subsequent executions produce no additional effect. Setting a light switch to "on" is idempotent: whether you set it once or a hundred times, the light ends up on (toggling the switch, by contrast, is not idempotent). Incrementing a counter is not idempotent: each increment adds one. Deleting a record by ID is idempotent: after the first deletion, subsequent deletions of the same ID have no effect (the record is already gone).
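The three examples can be checked directly against the definition. This is a minimal sketch; the function names (`switch_on`, `increment`, `delete_record`, `is_idempotent_on`) are illustrative and not part of any library, and the check tests one sample input rather than proving the property.

```python
def switch_on(state: dict) -> dict:
    """Idempotent: sets the light to "on" regardless of its prior value."""
    return {**state, "light": "on"}


def increment(state: dict) -> dict:
    """Not idempotent: every call adds one more."""
    return {**state, "counter": state["counter"] + 1}


def delete_record(state: dict, record_id: int) -> dict:
    """Idempotent: deleting an absent ID leaves the state unchanged."""
    records = {k: v for k, v in state["records"].items() if k != record_id}
    return {**state, "records": records}


def is_idempotent_on(f, x) -> bool:
    """Check f(f(x)) == f(x) for a single sample input (not a proof)."""
    return f(f(x)) == f(x)


state = {"light": "off", "counter": 0, "records": {123: "alice", 456: "bob"}}

print(is_idempotent_on(switch_on, state))                        # True
print(is_idempotent_on(increment, state))                        # False
print(is_idempotent_on(lambda s: delete_record(s, 123), state))  # True
```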
The reason idempotency matters in distributed systems is deceptively simple: networks fail partway through. A client sends a request to create a payment. The server processes the payment and charges the card. The server then tries to send the response, but the network drops the packet. From the client's perspective, the request timed out — it doesn't know whether the payment was processed. Should it retry? If the create-payment operation is not idempotent, retrying creates a duplicate charge. If it is idempotent, retrying is safe — the charge happens at most once.
The mechanism for making operations idempotent is the idempotency key: a client-generated identifier (typically a UUID) that the server uses to deduplicate requests. The server stores a mapping of idempotency_key → result. When a request arrives with a key the server has already processed, the server returns the stored result without re-executing the operation. The key must be stable across retries — the client generates it once and reuses it. Stripe's API is the canonical example: every write operation accepts an Idempotency-Key header, making the entire payment API safely retryable.
HTTP methods have defined idempotency semantics: GET, HEAD, PUT, and DELETE (along with OPTIONS and TRACE) are idempotent by specification. POST is not guaranteed to be idempotent, which is why REST conventions use PUT for create-or-update operations where idempotency is required. This distinction matters in practice: a load balancer can safely retry a GET or PUT that times out; retrying a POST requires idempotency keys at the application level.
In infrastructure automation, idempotency is the defining property of good configuration management. Ansible playbooks, Terraform plans, and Kubernetes manifests are designed to be idempotent: running terraform apply ten times against unchanged configuration and unchanged infrastructure performs read-only API calls on each run to confirm that the desired state is already met, and makes no API calls that change anything. This property is what makes GitOps workflows safe: the reconciliation loop runs continuously, applying desired state, and idempotency ensures it doesn't accumulate side effects.
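The reconciliation idea can be modeled in a few lines. This is a toy sketch over plain dictionaries, not a description of how Terraform, Ansible, or Kubernetes are implemented; `plan` and `apply` are hypothetical names, and the resource names are made up.

```python
def plan(desired: dict[str, dict], actual: dict[str, dict]) -> dict[str, list[str]]:
    """Compute the delta between desired and actual state."""
    shared = desired.keys() & actual.keys()
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "update": sorted(name for name in shared if desired[name] != actual[name]),
        "delete": sorted(actual.keys() - desired.keys()),
    }


def apply(desired: dict[str, dict], actual: dict[str, dict]) -> int:
    """Apply only the delta to `actual` in place; return the number of changes."""
    delta = plan(desired, actual)
    for name in delta["create"] + delta["update"]:
        actual[name] = dict(desired[name])  # copy so later edits to desired don't leak
    for name in delta["delete"]:
        del actual[name]
    return sum(len(names) for names in delta.values())


desired = {"bucket": {"versioning": True}, "queue": {"fifo": False}, "role": {"admin": False}}
actual: dict[str, dict] = {}

print(apply(desired, actual))  # 3  (first run creates everything)
print(apply(desired, actual))  # 0  (second run is a no-op)
desired["queue"]["fifo"] = True
print(apply(desired, actual))  # 1  (only the changed resource is updated)
```

Because `apply` acts only on the computed difference, running it repeatedly against an unchanged `desired` converges to zero changes.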
The failure mode to avoid is false idempotency: an operation that looks idempotent but isn't. DELETE /users/123 appears idempotent — but if user 123 is re-created with the same ID between two delete calls, the second call deletes a different user. True idempotency requires careful consideration of what "the same result" means in context, including the conditions under which the operation was originally intended to execute.
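One way to guard against this is to make the delete conditional on the version the caller originally saw, similar in spirit to an HTTP `If-Match` precondition. The sketch below uses an in-memory store; `UserStore`, `delete_if_version`, and `VersionConflict` are hypothetical names chosen for illustration.

```python
class VersionConflict(Exception):
    """Raised when the stored record is not the one the caller intended to delete."""


class UserStore:
    def __init__(self) -> None:
        self._users: dict[int, dict] = {}
        self._next_version = 1

    def create(self, user_id: int, name: str) -> int:
        """Create (or re-create) a user and return its unique version number."""
        version = self._next_version
        self._next_version += 1
        self._users[user_id] = {"name": name, "version": version}
        return version

    def delete_if_version(self, user_id: int, expected_version: int) -> bool:
        """Delete only the incarnation the caller saw; True if a record was removed."""
        user = self._users.get(user_id)
        if user is None:
            return False  # already gone: a retried delete is a no-op
        if user["version"] != expected_version:
            raise VersionConflict(f"user {user_id} was re-created since version {expected_version}")
        del self._users[user_id]
        return True


store = UserStore()
v1 = store.create(123, "alice")
print(store.delete_if_version(123, v1))  # True  (first delete)
print(store.delete_if_version(123, v1))  # False (retry, nothing to do)

store.create(123, "bob")                 # same ID, different user
try:
    store.delete_if_version(123, v1)     # stale retry
except VersionConflict:
    print("refused to delete the re-created user")
```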
Visual

WITHOUT IDEMPOTENCY (POST /payments):

Client              Network                   Payment Service
  │                                         ┌──────────────────────┐
  ├─── POST /payments ─────────────────────►│ Process payment $50  │
  │    ...timeout...                        │ Charge card ✓        │
  │◄── [timeout, no response] ──────────────│ (response lost)      │
  │                                         └──────────────────────┘
  │
  ├─── POST /payments (retry) ─────────────► Charge card AGAIN $50
  │                                          💸 Duplicate charge

WITH IDEMPOTENCY KEY:

Client              Network                   Payment Service
  │                                         ┌──────────────────────────────┐
  │  Generate key: "key-uuid-abc"           │ DB: idempotency_keys         │
  │                                         │ ┌────────────────────────┐   │
  ├─ POST /payments ───────────────────────►│ │ key-uuid-abc → [empty] │   │
  │  Idempotency-Key: key-uuid-abc          │ └────────────────────────┘   │
  │  ...timeout...                          │ Process payment $50          │
  │◄── [timeout] ───────────────────────────│ Charge card ✓                │
  │                                         │ Store result for key-uuid-abc│
  │                                         └──────────────────────────────┘
  │
  ├─ POST /payments (retry) ───────────────► Look up key-uuid-abc → found!
  │  Idempotency-Key: key-uuid-abc           Return stored result
  │◄── 200 OK {charge_id: "ch_xyz"} ──────── No duplicate charge ✓

HTTP METHOD IDEMPOTENCY:

| Method | Idempotent | Notes |
|---|---|---|
| GET | Yes | Read-only, safe to retry |
| HEAD | Yes | Same as GET, no body |
| PUT | Yes | Set resource to exact state |
| DELETE | Yes | Resource absent after operation |
| POST | No (by def) | Requires app-level idempotency key |
| PATCH | Sometimes | Depends on patch semantics |

TERRAFORM / ANSIBLE IDEMPOTENCY:

desired state ──► reconcile ──► actual state
                      │
                      ├─ if actual == desired: no-op (0 changes)
                      └─ if actual != desired: apply delta (targeted change)

Run 1: 3 resources created
Run 2: 0 changes (idempotent)
Run 3: 0 changes (idempotent)
Run 4: config changed → 1 resource updated, 0 created/destroyed

When to Reach for This
- Any write operation that crosses a network boundary and may time out — payment processing, order creation, user registration, notification sending
- Building retry logic: retries are only safe on idempotent operations; always pair retry logic with idempotency guarantees
- Infrastructure automation with Ansible, Terraform, or Kubernetes — all three assume idempotent operations as a foundation for their reconciliation models
- Message queue consumers in at-least-once delivery systems (Kafka, SQS, RabbitMQ) — messages may be delivered more than once; consumers must process them idempotently
- Database migrations that may be run multiple times in CI pipelines: `CREATE TABLE IF NOT EXISTS` is idempotent; `CREATE TABLE` is not
- Any system where you cannot guarantee exactly-once delivery (which is almost all distributed systems)
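For the at-least-once consumer case in the list above, a common approach is to record which message IDs have already been applied. The sketch below keeps that record in memory; `Message` and `IdempotentConsumer` are hypothetical names, and it assumes each message carries a stable unique ID. A real consumer would persist the processed IDs, ideally in the same transaction as the side effect.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    message_id: str   # assumed stable across redeliveries
    account: str
    amount_cents: int


@dataclass
class IdempotentConsumer:
    balances: dict[str, int] = field(default_factory=dict)
    processed_ids: set[str] = field(default_factory=set)

    def handle(self, msg: Message) -> bool:
        """Apply the message once; return False if it was a duplicate delivery."""
        if msg.message_id in self.processed_ids:
            return False  # already applied: skip the side effect
        self.balances[msg.account] = self.balances.get(msg.account, 0) + msg.amount_cents
        self.processed_ids.add(msg.message_id)
        return True


consumer = IdempotentConsumer()
msg = Message(message_id="m-1", account="acct-1", amount_cents=500)

print(consumer.handle(msg))         # True  (first delivery is applied)
print(consumer.handle(msg))         # False (redelivery is skipped)
print(consumer.balances["acct-1"])  # 500
```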
When NOT to Use This
- Operations that are inherently non-idempotent by domain semantics — "charge the customer $10 each time they click" is intentionally non-idempotent; forcing idempotency here would be incorrect behavior
- Using idempotency keys with extremely short TTLs — if the idempotency key expires before the client's retry window, the deduplication guarantee disappears; key storage must outlast the maximum expected retry interval
- Treating `PUT` as automatically idempotent without thinking about concurrent writes: two clients PUTting different values simultaneously may produce different results depending on ordering; idempotency prevents duplicate execution, not concurrent conflict
- Implementing idempotency at the application level when the infrastructure already provides it: Kafka's exactly-once semantics, database transactions, or conditional writes may make application-level idempotency keys redundant
Applied Examples
Example 1: Stripe-style idempotency keys for a payment API
Implementing idempotency key handling in a FastAPI payment service:
from datetime import datetime, timedelta, timezone

from fastapi import Depends, FastAPI, Header, HTTPException
from sqlalchemy.orm import Session

# Assumed to be defined elsewhere in the service (not shown here):
#   IdempotencyKey  - SQLAlchemy model with key, response_body, created_at, expires_at
#   PaymentRequest  - request model with amount_cents, currency, card_token
#   get_db          - FastAPI dependency that yields a Session
#   payment_gateway - client for the card processor

app = FastAPI()


class IdempotencyStore:
    """Stores idempotency keys and their results for deduplication."""

    def get(self, key: str, db: Session) -> dict | None:
        record = db.query(IdempotencyKey).filter_by(key=key).first()
        # expires_at is assumed to be stored as a timezone-aware UTC timestamp
        if record and record.expires_at > datetime.now(timezone.utc):
            return record.response_body
        return None

    def store(self, key: str, response: dict, db: Session, ttl_hours: int = 24) -> None:
        now = datetime.now(timezone.utc)
        record = IdempotencyKey(
            key=key,
            response_body=response,
            created_at=now,
            expires_at=now + timedelta(hours=ttl_hours),
        )
        db.add(record)
        db.commit()


idempotency_store = IdempotencyStore()


@app.post("/payments")
async def create_payment(
    payment: PaymentRequest,
    idempotency_key: str | None = Header(None, alias="Idempotency-Key"),
    db: Session = Depends(get_db),
):
    if not idempotency_key:
        raise HTTPException(400, "Idempotency-Key header required for payment operations")

    # Check for existing result
    existing = idempotency_store.get(idempotency_key, db)
    if existing is not None:
        return existing  # Return stored result, no charge

    # Process payment (this is the only time the card is charged).
    # Note: check-then-charge is not atomic. Two concurrent requests with the
    # same key can both reach this point unless the key is first reserved
    # under a unique constraint.
    charge = payment_gateway.charge(
        amount=payment.amount_cents,
        currency=payment.currency,
        card_token=payment.card_token,
    )
    response = {
        "charge_id": charge.id,
        "status": "succeeded",
        "amount": payment.amount_cents,
    }

    # Store result before returning (so retries get the same response)
    idempotency_store.store(idempotency_key, response, db)
    return response
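The handler above looks up the key and charges the card in two separate steps, so two simultaneous requests with the same key could both pass the lookup. One way to close that gap, matching the "key-uuid-abc → [empty]" step in the Visual section, is to reserve the key with a unique-constraint insert before charging. The sketch below uses `sqlite3` from the standard library rather than the SQLAlchemy model; `reserve_key`, `handle_payment`, and `fake_charge` are hypothetical names, and a failed charge is not handled (a real service would need to release the key or record the failure).

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS idempotency_keys ("
    " key TEXT PRIMARY KEY,"
    " response_body TEXT)"  # NULL while the first request is still in flight
)


def reserve_key(key: str) -> bool:
    """Atomically claim the key; returns True only for the first caller."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO idempotency_keys (key) VALUES (?)", (key,))
        return True
    except sqlite3.IntegrityError:
        return False  # primary-key violation: someone already claimed it


def handle_payment(key: str, charge) -> dict:
    if reserve_key(key):
        response = charge()  # the only place the card is charged
        with conn:
            conn.execute(
                "UPDATE idempotency_keys SET response_body = ? WHERE key = ?",
                (json.dumps(response), key),
            )
        return response
    row = conn.execute(
        "SELECT response_body FROM idempotency_keys WHERE key = ?", (key,)
    ).fetchone()
    if row[0] is None:
        return {"status": "in_progress"}  # first attempt has not finished yet
    return json.loads(row[0])


calls = []

def fake_charge() -> dict:
    calls.append(1)
    return {"charge_id": "ch_demo", "status": "succeeded"}


first = handle_payment("key-uuid-abc", fake_charge)
retry = handle_payment("key-uuid-abc", fake_charge)
print(first == retry, len(calls))  # True 1
```

The database's uniqueness check decides which request wins, so the application does not rely on the timing of its own read.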
The client generates a UUID before the first attempt and reuses it on every retry:
import time
import uuid

import requests

idempotency_key = str(uuid.uuid4())  # Generated once, never regenerated
MAX_ATTEMPTS = 5

for attempt in range(MAX_ATTEMPTS):
    try:
        response = requests.post(
            "https://api.example.com/payments",
            json={"amount_cents": 5000, "currency": "USD", "card_token": "tok_xyz"},
            headers={"Idempotency-Key": idempotency_key},  # same key on every attempt
            timeout=10,
        )
        response.raise_for_status()
        print(f"Payment succeeded: {response.json()['charge_id']}")
        break
    except (requests.Timeout, requests.ConnectionError):
        if attempt == MAX_ATTEMPTS - 1:
            raise  # out of retries: surface the last error
        time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, 8s
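The stored key has to outlive the longest period over which this client might still retry. A rough upper bound can be computed from the retry parameters. `worst_case_retry_window` is a hypothetical helper, and the estimate assumes each attempt takes at most its timeout, which is an approximation because `requests` applies the timeout to connecting and to each read rather than to the whole exchange.

```python
def worst_case_retry_window(attempts: int, timeout_s: float, backoff_base: float = 2.0) -> float:
    """Approximate seconds between the first send and the final timeout."""
    total_timeouts = attempts * timeout_s
    # The client sleeps after every failed attempt except the last one.
    total_backoff = sum(backoff_base ** attempt for attempt in range(attempts - 1))
    return total_timeouts + total_backoff


window_s = worst_case_retry_window(attempts=5, timeout_s=10)
print(window_s)  # 65.0  (5 x 10s timeouts + 1 + 2 + 4 + 8 seconds of backoff)

ttl_s = 24 * 60 * 60  # the server-side default of 24 hours
print(ttl_s > window_s)  # True
```

With these numbers the 24-hour TTL leaves a wide margin. A 5-minute TTL would still cover this client, but not one that retries over a longer period, for example from a persistent job queue.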
Example 2: Idempotent Ansible tasks and Terraform resources
Ansible's design philosophy is that every task must be idempotent — running the playbook twice should be equivalent to running it once:
# IDEMPOTENT: creates the user if absent, no-op if already exists
- name: Create deploy user
  ansible.builtin.user:
    name: deploy
    shell: /bin/bash
    state: present
    create_home: yes

# IDEMPOTENT: sets authorized key; using exclusive=no means it won't
# remove other keys; the task detects no change on second run
- name: Add SSH key for deploy user
  ansible.posix.authorized_key:
    user: deploy
    key: "{{ lookup('file', 'files/deploy_rsa.pub') }}"
    state: present

# NOT IDEMPOTENT (common mistake): shell command runs every time
- name: Initialize database (BAD, runs on every playbook execution)
  ansible.builtin.shell: psql -c "INSERT INTO config VALUES ('initialized', 'true')"

# IDEMPOTENT: uses creates= to skip if marker file exists
- name: Initialize database (GOOD)
  ansible.builtin.shell: >
    psql -c "INSERT INTO config VALUES ('initialized', 'true')"
    && touch /var/lib/app/.db-initialized
  args:
    creates: /var/lib/app/.db-initialized
Terraform's aws_s3_bucket resource behaves idempotently under apply: Terraform records the bucket in its state file, compares that state with the configuration, and plans no changes when the two already match:
resource "aws_s3_bucket" "artifacts" {
  bucket = "mycompany-build-artifacts"
  # Run terraform apply 100 times: creates on first run, no-op on all others
}
Terraform's state file tracks what exists; the plan computes the diff; apply executes only the delta. Idempotency is the property that makes terraform apply safe to run in a CI pipeline on every commit.
The Junior vs Senior Gap
| Junior | Senior |
|---|---|
| Adds retry logic without idempotency guarantees; creates duplicate records or double-charges | Designs idempotency into the API contract before writing retry logic |
| Uses `POST` for all write operations; retries create duplicates | Uses `PUT` for create-or-replace operations; reserves `POST` for operations that generate server-side IDs with application-level idempotency keys |
| Treats Ansible `shell` tasks as idempotent because "it's just a script" | Uses `creates:`, `when:`, or native idempotent modules; avoids `shell` and `command` for operations with side effects |
| Implements exactly-once processing in a message consumer without understanding the infrastructure's delivery guarantees | Designs message consumers to be idempotent first; treats exactly-once as a nice-to-have optimization, not a correctness dependency |
| Sets idempotency key TTL to 5 minutes; a retry that arrives after 6 minutes is silently processed as a new request | Sets TTL based on maximum retry window plus safety margin; monitors key expiry and retry rates |
| Considers idempotency an optimization | Considers idempotency a correctness requirement for any system with retries or at-least-once delivery |
Connections
- Complements: Circuit Breaker (use together: the circuit breaker decides whether a call or retry should be attempted at all, and idempotency makes the attempts that do go through safe to repeat; without a circuit breaker, retries keep hitting a dependency that is already failing, and without idempotency they cause duplicate side effects)
- Complements: Event Sourcing (use together: event consumers in at-least-once delivery systems must apply events idempotently; the event store's offset tracking enables consumers to detect and skip duplicate deliveries; idempotency is not optional in event-sourced architectures)
- Tensions: Bulkhead (contradicts when: bulkheads that drop requests rather than queuing them may cause clients to retry dropped requests; if the bulkhead sheds load by rejecting requests with HTTP 429 and the client retries immediately, the bulkhead's load-shedding is negated; idempotent retries must be paired with client-side backoff to respect bulkhead pressure)
- Topic Packs: distributed-systems, ansible, terraform
- Case Studies: firmware-update-boot-loop (non-idempotent firmware update scripts that re-apply updates on each boot, triggering a loop; an idempotent check — "is version X already installed?" — would have prevented the loop)