Mental Model: Idempotency

Category: Architecture & Design
Origin: Mathematics (Peirce, 1867); applied to distributed systems design throughout the 1970s–2000s; formalized in REST API design by Roy Fielding (2000)
One-liner: An operation is idempotent when applying it multiple times produces the same result as applying it once, which makes it safe to retry without additional side effects.

The Model

Idempotency is a mathematical property: f(f(x)) = f(x). In software, it means that an operation can be executed repeatedly, and after the first execution, subsequent executions produce no additional effect. Setting a light switch to "on" is idempotent: whether you set it once or a hundred times, the result is that the light is on. (Toggling the switch is not: each toggle reverses the previous one.) Incrementing a counter is not idempotent: each increment adds one. Deleting a record by ID is idempotent: after the first deletion, subsequent deletions of the same ID have no effect (the record is already gone).
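
The three examples above can be spot-checked directly against the definition f(f(x)) = f(x). This is a minimal sketch; the function names and the shape of the state dictionary are illustrative assumptions, and a check on one sample input is not a proof.

```python
def switch_on(state: dict) -> dict:
    """Idempotent: the light ends up on no matter how many times this runs."""
    return {**state, "light": "on"}


def increment(state: dict) -> dict:
    """Not idempotent: every call adds one more."""
    return {**state, "counter": state["counter"] + 1}


def delete_record(state: dict, record_id: int) -> dict:
    """Idempotent: deleting an absent record leaves the state unchanged."""
    records = {k: v for k, v in state["records"].items() if k != record_id}
    return {**state, "records": records}


def is_idempotent_on(f, x) -> bool:
    """Check f(f(x)) == f(x) for a single sample input."""
    return f(f(x)) == f(x)


start = {"light": "off", "counter": 0, "records": {123: "alice"}}
print(is_idempotent_on(switch_on, start))                         # True
print(is_idempotent_on(increment, start))                         # False
print(is_idempotent_on(lambda s: delete_record(s, 123), start))   # True
```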

The reason idempotency matters in distributed systems is deceptively simple: networks fail partway through. A client sends a request to create a payment. The server processes the payment and charges the card. The server then tries to send the response, but the network drops the packet. From the client's perspective, the request timed out — it doesn't know whether the payment was processed. Should it retry? If the create-payment operation is not idempotent, retrying creates a duplicate charge. If it is idempotent, retrying is safe — the charge happens at most once.

A common mechanism for making operations idempotent is the idempotency key: a client-generated identifier (typically a UUID) that the server uses to deduplicate requests. The server stores a mapping of idempotency_key → result. When a request arrives with a key the server has already processed, the server returns the stored result without re-executing the operation. The key must be stable across retries: the client generates it once and reuses it. Stripe's API is the canonical example: its POST requests accept an Idempotency-Key header, which makes payment creation safely retryable.
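
The key-to-result mapping can be sketched in a few lines. This is an in-memory illustration only; PaymentService, its fields, and the charge ID format are hypothetical names, and a real service would persist the mapping durably.

```python
import uuid


class PaymentService:
    """Toy server that deduplicates requests by idempotency key."""

    def __init__(self):
        self.results = {}   # idempotency_key -> stored result
        self.charges = []   # every charge that was actually executed

    def create_payment(self, idempotency_key: str, amount_cents: int) -> dict:
        # A key we have seen before: replay the stored result, do not charge.
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        charge_id = f"ch_{len(self.charges) + 1}"
        self.charges.append((charge_id, amount_cents))
        result = {"charge_id": charge_id, "amount_cents": amount_cents}
        self.results[idempotency_key] = result
        return result


service = PaymentService()
key = str(uuid.uuid4())                     # generated once by the client
first = service.create_payment(key, 5000)
retry = service.create_payment(key, 5000)   # e.g. after a lost response
print(first == retry, len(service.charges))  # True 1
```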

HTTP methods have defined idempotency semantics: GET, HEAD, PUT, and DELETE are idempotent by specification (as are OPTIONS and TRACE). POST carries no idempotency guarantee, which is why REST conventions use PUT for create-or-update operations where idempotency is required. This distinction matters in practice: a load balancer can safely retry a GET or PUT that times out; retrying a POST requires idempotency keys at the application level.
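
The difference between PUT and POST shows up as soon as a request is repeated. The sketch below models a resource collection in memory; NotesApi and its methods are hypothetical stand-ins for HTTP handlers, not a real framework API.

```python
import itertools


class NotesApi:
    """Toy resource collection contrasting PUT-style and POST-style writes."""

    def __init__(self):
        self.notes = {}
        self._ids = itertools.count(1)

    def put(self, note_id: str, body: str) -> str:
        """PUT /notes/{id}: set the resource to exactly this state."""
        self.notes[note_id] = body
        return note_id

    def post(self, body: str) -> str:
        """POST /notes: the server assigns a fresh ID on every call."""
        note_id = f"note-{next(self._ids)}"
        self.notes[note_id] = body
        return note_id


api = NotesApi()
api.put("welcome", "hello")
api.put("welcome", "hello")     # repeated PUT: still one resource
print(len(api.notes))           # 1

api.post("hello")
api.post("hello")               # repeated POST: a second resource appears
print(len(api.notes))           # 3
```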

In infrastructure automation, idempotency is the defining property of good configuration management. Ansible playbooks, Terraform configurations, and Kubernetes manifests are designed to be idempotent: running terraform apply ten times against unchanged configuration and unchanged infrastructure refreshes state with read-only API calls on each run and makes zero changes. This property is what makes GitOps workflows safe: the reconciliation loop runs continuously, applying desired state, and idempotency ensures it doesn't accumulate side effects.
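
The reconciliation idea can be sketched as a plan step that diffs desired against actual state and an apply step that executes only the delta. This is a simplified model under stated assumptions: the plan and apply functions and the resource dictionaries are hypothetical and do not reflect how any of these tools is implemented internally.

```python
def plan(desired: dict, actual: dict) -> list:
    """Compute the changes needed to move actual state to desired state."""
    changes = []
    for name, spec in desired.items():
        if name not in actual:
            changes.append(("create", name, spec))
        elif actual[name] != spec:
            changes.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            changes.append(("delete", name, None))
    return changes


def apply(desired: dict, actual: dict) -> int:
    """Execute only the delta and return the number of changes made."""
    changes = plan(desired, actual)
    for action, name, spec in changes:
        if action == "delete":
            del actual[name]
        else:
            actual[name] = dict(spec)
    return len(changes)


desired = {
    "bucket": {"versioning": True},
    "queue": {"fifo": False},
    "role": {"admin": False},
}
actual = {}
print(apply(desired, actual))   # 3 (everything created)
print(apply(desired, actual))   # 0 (already converged)
desired["queue"] = {"fifo": True}
print(apply(desired, actual))   # 1 (only the changed resource)
```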

The failure mode to avoid is false idempotency: an operation that looks idempotent but isn't. DELETE /users/123 appears idempotent — but if user 123 is re-created with the same ID between two delete calls, the second call deletes a different user. True idempotency requires careful consideration of what "the same result" means in context, including the conditions under which the operation was originally intended to execute.
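
One way to guard against the re-created-ID case is to make the delete conditional on the version the caller originally saw, similar in spirit to an HTTP If-Match precondition. The sketch below is an assumption-laden illustration; UserStore and its version counter are hypothetical.

```python
class UserStore:
    """Toy store where each record carries a version assigned at creation."""

    def __init__(self):
        self.users = {}          # user_id -> (version, name)
        self._next_version = 1

    def create(self, user_id: int, name: str) -> int:
        version = self._next_version
        self._next_version += 1
        self.users[user_id] = (version, name)
        return version

    def delete(self, user_id: int) -> None:
        """Unconditional delete: removes whoever currently holds the ID."""
        self.users.pop(user_id, None)

    def delete_if_version(self, user_id: int, expected_version: int) -> bool:
        """Conditional delete: removes only the record the caller saw."""
        current = self.users.get(user_id)
        if current is None or current[0] != expected_version:
            return False
        del self.users[user_id]
        return True


store = UserStore()
v1 = store.create(123, "alice")
store.delete_if_version(123, v1)          # first delete succeeds
store.create(123, "bob")                  # ID 123 is reused
print(store.delete_if_version(123, v1))   # False: the retry does not touch bob
print(store.users[123][1])                # bob
```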

Visual

WITHOUT IDEMPOTENCY (POST /payments):

Client          Network                  Payment Service
  │                                      ┌──────────────────────┐
  ├─── POST /payments ──────────────────►│ Process payment $50  │
  │         ...timeout...                │ Charge card          │
  │◄── [timeout, no response] ───────────│ (response lost)      │
  │                                      └──────────────────────┘
  ├─── POST /payments (retry) ──────────► Charge card AGAIN $50
  │                                       💸 Duplicate charge

WITH IDEMPOTENCY KEY:

Client          Network                  Payment Service
  │                                      ┌──────────────────────────────┐
  │ Generate key: "key-uuid-abc"         │ DB: idempotency_keys         │
  │                                      │ ┌────────────────────────┐   │
  ├─ POST /payments ────────────────────►│ │ key-uuid-abc  [empty]  │   │
  │  Idempotency-Key: key-uuid-abc       │ └────────────────────────┘   │
  │         ...timeout...                │ Process payment $50          │
  │◄── [timeout] ────────────────────────│ Charge card                  │
  │                                      │ Store result for key-uuid-abc│
  │                                      └──────────────────────────────┘
  ├─ POST /payments (retry) ────────────► Look up key-uuid-abc → found!
  │  Idempotency-Key: key-uuid-abc        Return stored result
  │◄── 200 OK {charge_id: "ch_xyz"} ───── No duplicate charge

HTTP METHOD IDEMPOTENCY:
┌───────────────┬────────────┬────────────────────────────────────┐
│ Method        │ Idempotent │ Notes                              │
├───────────────┼────────────┼────────────────────────────────────┤
│ GET           │ Yes        │ Read-only, safe to retry           │
│ HEAD          │ Yes        │ Same as GET, no body               │
│ PUT           │ Yes        │ Set resource to exact state        │
│ DELETE        │ Yes        │ Resource absent after operation    │
│ POST          │ No (by def)│ Requires app-level idempotency key │
│ PATCH         │ Sometimes  │ Depends on patch semantics         │
└───────────────┴────────────┴────────────────────────────────────┘

TERRAFORM / ANSIBLE IDEMPOTENCY:
  desired state ──► reconcile ──► actual state
      if actual == desired: no-op (0 changes)
      if actual != desired: apply delta (targeted change)

  Run 1: 3 resources created
  Run 2: 0 changes (idempotent)
  Run 3: 0 changes (idempotent)
  Run 4: config changed → 1 resource updated, 0 created/destroyed

When to Reach for This

  • Any write operation that crosses a network boundary and may time out — payment processing, order creation, user registration, notification sending
  • Building retry logic: retries are only safe on idempotent operations; always pair retry logic with idempotency guarantees
  • Infrastructure automation with Ansible, Terraform, or Kubernetes — all three assume idempotent operations as a foundation for their reconciliation models
  • Message queue consumers in at-least-once delivery systems (Kafka, SQS, RabbitMQ) — messages may be delivered more than once; consumers must process them idempotently
  • Database migrations that may be run multiple times in CI pipelines — CREATE TABLE IF NOT EXISTS is idempotent; CREATE TABLE is not
  • Any system where you cannot guarantee exactly-once delivery (which is almost all distributed systems)
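
For the message-consumer case in the list above, a common approach is to record the IDs of processed messages and skip redeliveries. The sketch below is a minimal in-memory version; InventoryConsumer and the message fields are hypothetical, and a production consumer would record the ID and apply the state change in the same transaction so a crash cannot separate them.

```python
class InventoryConsumer:
    """Toy at-least-once consumer that deduplicates by message ID."""

    def __init__(self):
        self.stock = {"widget": 10}
        self.processed_ids = set()

    def handle(self, message: dict) -> bool:
        """Apply the message once; return False for a duplicate delivery."""
        if message["id"] in self.processed_ids:
            return False
        self.stock[message["sku"]] -= message["quantity"]
        self.processed_ids.add(message["id"])
        return True


consumer = InventoryConsumer()
msg = {"id": "msg-1", "sku": "widget", "quantity": 3}
consumer.handle(msg)
consumer.handle(msg)               # broker redelivers the same message
print(consumer.stock["widget"])    # 7
```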

When NOT to Use This

  • Operations that are inherently non-idempotent by domain semantics — "charge the customer $10 each time they click" is intentionally non-idempotent; forcing idempotency here would be incorrect behavior
  • Using idempotency keys with extremely short TTLs — if the idempotency key expires before the client's retry window, the deduplication guarantee disappears; key storage must outlast the maximum expected retry interval
  • Treating PUT as automatically idempotent without thinking about concurrent writes — two clients PUTting different values simultaneously may produce different results depending on ordering; idempotency prevents duplicate execution, not concurrent conflict
  • Implementing idempotency at the application level when the infrastructure already provides it — Kafka's exactly-once semantics, database transactions, or conditional writes may make application-level idempotency keys redundant
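
The short-TTL pitfall from the list above is easy to reproduce with an explicit clock. In this sketch, ExpiringKeyStore and charge_once are hypothetical helpers and time is passed in as plain seconds so the behaviour is deterministic.

```python
class ExpiringKeyStore:
    """Idempotency-key store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._entries = {}   # key -> (stored_at, result)

    def get(self, key: str, now: float):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if now - stored_at >= self.ttl_seconds:
            del self._entries[key]   # expired: the key is forgotten
            return None
        return result

    def put(self, key: str, result: dict, now: float) -> None:
        self._entries[key] = (now, result)


def charge_once(store: ExpiringKeyStore, key: str, now: float, charges: list) -> dict:
    """Charge unless the store still remembers this key."""
    cached = store.get(key, now)
    if cached is not None:
        return cached
    charges.append(key)
    result = {"charge_number": len(charges)}
    store.put(key, result, now)
    return result


store = ExpiringKeyStore(ttl_seconds=300)    # 5-minute TTL
charges = []
charge_once(store, "key-1", now=0, charges=charges)
charge_once(store, "key-1", now=120, charges=charges)   # within TTL: deduplicated
charge_once(store, "key-1", now=360, charges=charges)   # after TTL: charged again
print(len(charges))   # 2
```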

Applied Examples

Example 1: Stripe-style idempotency keys for a payment API

Implementing idempotency key handling in a FastAPI payment service:

from datetime import datetime, timedelta, timezone
from fastapi import Depends, FastAPI, Header, HTTPException
from sqlalchemy.orm import Session

# Assumed to be defined elsewhere in the service: IdempotencyKey (ORM model),
# PaymentRequest (request schema), get_db (session dependency), payment_gateway.

app = FastAPI()

class IdempotencyStore:
    """Stores idempotency keys and their results for deduplication."""

    def get(self, key: str, db: Session) -> dict | None:
        record = db.query(IdempotencyKey).filter_by(key=key).first()
        # Assumes expires_at is stored as a timezone-aware UTC datetime
        if record and record.expires_at > datetime.now(timezone.utc):
            return record.response_body
        return None

    def store(self, key: str, response: dict, db: Session, ttl_hours: int = 24):
        now = datetime.now(timezone.utc)
        record = IdempotencyKey(
            key=key,
            response_body=response,
            created_at=now,
            expires_at=now + timedelta(hours=ttl_hours),
        )
        db.add(record)
        db.commit()

idempotency_store = IdempotencyStore()

@app.post("/payments")
async def create_payment(
    payment: PaymentRequest,
    idempotency_key: str | None = Header(None, alias="Idempotency-Key"),
    db: Session = Depends(get_db),
):
    if not idempotency_key:
        raise HTTPException(400, "Idempotency-Key header required for payment operations")

    # Check for existing result
    existing = idempotency_store.get(idempotency_key, db)
    if existing is not None:
        return existing  # Return stored result, no charge

    # Process payment (this is the only time the card is charged)
    charge = payment_gateway.charge(
        amount=payment.amount_cents,
        currency=payment.currency,
        card_token=payment.card_token,
    )

    response = {
        "charge_id": charge.id,
        "status": "succeeded",
        "amount": payment.amount_cents,
    }

    # Store result before returning (so retries get the same response)
    idempotency_store.store(idempotency_key, response, db)

    return response
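
The handler above checks for the key and then charges, so two concurrent requests carrying the same key could both pass the check before either stores a result. One common mitigation is to claim the key with a unique-constrained insert before charging. The sketch below illustrates that idea with the standard-library sqlite3 module; the table layout, the function names, and the charges list standing in for the gateway are all assumptions, not part of the service above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, response TEXT)")


def claim_key(key: str) -> bool:
    """Return True if this caller inserted the key, False if it already existed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO idempotency_keys (key, response) VALUES (?, NULL)",
                (key,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def save_response(key: str, response: dict) -> None:
    with conn:
        conn.execute(
            "UPDATE idempotency_keys SET response = ? WHERE key = ?",
            (json.dumps(response), key),
        )


def load_response(key: str):
    row = conn.execute(
        "SELECT response FROM idempotency_keys WHERE key = ?", (key,)
    ).fetchone()
    if row is None or row[0] is None:
        return None
    return json.loads(row[0])


charges = []  # stands in for calls to the payment gateway


def create_payment(key: str, amount_cents: int):
    if claim_key(key):
        charges.append(amount_cents)
        response = {"charge_id": f"ch_{len(charges)}", "amount": amount_cents}
        save_response(key, response)
        return response
    # None here means the first request is still in flight
    return load_response(key)


first = create_payment("k1", 5000)
retry = create_payment("k1", 5000)
print(first == retry, len(charges))  # True 1
```

A caller that receives None would typically respond with a conflict or ask the client to retry later; how to handle a claimed key whose owner crashed before saving a response is a separate design decision this sketch leaves open.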

The client generates a UUID before the first attempt and reuses it on every retry:

import time
import uuid

import requests

idempotency_key = str(uuid.uuid4())  # Generated once, never regenerated

for attempt in range(5):
    try:
        response = requests.post(
            "https://api.example.com/payments",
            json={"amount_cents": 5000, "currency": "USD", "card_token": "tok_xyz"},
            headers={"Idempotency-Key": idempotency_key},
            timeout=10,
        )
        response.raise_for_status()
        print(f"Payment succeeded: {response.json()['charge_id']}")
        break
    except (requests.Timeout, requests.ConnectionError) as e:
        if attempt < 4:
            time.sleep(2 ** attempt)  # exponential backoff
        else:
            raise
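
The same retry pattern can be factored into a helper that is testable without a network. In this sketch, call_with_retries, TransientError, and flaky_send are hypothetical names; the helper adds random jitter to the backoff, which is a common refinement rather than something the loop above does.

```python
import random
import time
import uuid


class TransientError(Exception):
    """Hypothetical stand-in for a timeout or dropped connection."""


def call_with_retries(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(key) until it succeeds, reusing one idempotency key."""
    key = str(uuid.uuid4())  # generated once, reused by every attempt
    for attempt in range(max_attempts):
        try:
            return send(key)
        except TransientError:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter to avoid synchronized retries
            sleep(random.uniform(0, base_delay * 2 ** attempt))


seen_keys = []


def flaky_send(key):
    """Fails twice with a simulated timeout, then succeeds."""
    seen_keys.append(key)
    if len(seen_keys) < 3:
        raise TransientError("simulated timeout")
    return {"charge_id": "ch_xyz"}


result = call_with_retries(flaky_send, sleep=lambda _: None)
print(result, len(set(seen_keys)))  # {'charge_id': 'ch_xyz'} 1
```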

Example 2: Idempotent Ansible tasks and Terraform resources

Ansible's design philosophy is that every task must be idempotent — running the playbook twice should be equivalent to running it once:

# IDEMPOTENT: creates the user if absent, no-op if already exists
- name: Create deploy user
  ansible.builtin.user:
    name: deploy
    shell: /bin/bash
    state: present
    create_home: yes

# IDEMPOTENT: adds the key if missing; exclusive defaults to false, so other
# keys are left in place; the task reports no change on the second run
- name: Add SSH key for deploy user
  ansible.posix.authorized_key:
    user: deploy
    key: "{{ lookup('file', 'files/deploy_rsa.pub') }}"
    state: present

# NOT IDEMPOTENT (common mistake): shell command runs every time
- name: Initialize database (BAD — runs on every playbook execution)
  ansible.builtin.shell: psql -c "INSERT INTO config VALUES ('initialized', 'true')"

# IDEMPOTENT: uses creates= to skip if marker file exists
- name: Initialize database (GOOD)
  ansible.builtin.shell: >
    psql -c "INSERT INTO config VALUES ('initialized', 'true')"
    && touch /var/lib/app/.db-initialized
  args:
    creates: /var/lib/app/.db-initialized
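
The marker-file guard behind creates: is a general pattern and can be sketched outside Ansible. In the Python illustration below, run_once and the list standing in for the database are hypothetical; the marker is written only after the action succeeds, mirroring the && in the task above.

```python
import tempfile
from pathlib import Path


def run_once(marker: Path, action) -> bool:
    """Run action only if the marker file is absent; return True if it ran."""
    if marker.exists():
        return False
    action()
    marker.touch()  # written only after the action completed without raising
    return True


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / ".db-initialized"
    rows = []  # stands in for the config table

    def init():
        rows.append(("initialized", "true"))

    first = run_once(marker, init)
    second = run_once(marker, init)

print(first, second, len(rows))  # True False 1
```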

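Terraform's aws_s3_bucket resource behaves idempotently because of state tracking rather than if-not-exists semantics. Terraform records the bucket in state after creating it, and later runs compare configuration against that state. A bucket that already exists outside of state is not adopted automatically; it has to be imported first. For a bucket managed in state: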

resource "aws_s3_bucket" "artifacts" {
  bucket = "mycompany-build-artifacts"
  # Run terraform apply 100 times — creates on first run, no-op on all others
}

Terraform's state file tracks what exists; the plan computes the diff; apply executes only the delta. Idempotency is the property that makes terraform apply safe to run in a CI pipeline on every commit.

The Junior vs Senior Gap

Junior | Senior
Adds retry logic without idempotency guarantees; creates duplicate records or double-charges | Designs idempotency into the API contract before writing retry logic
Uses POST for all write operations; retries create duplicates | Uses PUT for create-or-replace operations; reserves POST for operations that generate server-side IDs with application-level idempotency keys
Treats Ansible shell tasks as idempotent because "it's just a script" | Uses creates:, when:, or native idempotent modules; avoids shell and command for operations with side effects
Implements exactly-once processing in a message consumer without understanding the infrastructure's delivery guarantees | Designs message consumers to be idempotent first; treats exactly-once as a nice-to-have optimization, not a correctness dependency
Sets idempotency key TTL to 5 minutes; a retry at 6 minutes is silently processed as a new request | Sets TTL based on maximum retry window plus safety margin; monitors key expiry and retry rates
Considers idempotency an optimization | Considers idempotency a correctness requirement for any system with retries or at-least-once delivery

Connections

  • Complements: Circuit Breaker (use together: a circuit breaker decides whether a call or retry should be attempted at all, and idempotency makes the retries that do go through safe; without a circuit breaker, retries keep hitting a failing dependency, and without idempotency they cause duplicate side effects)
  • Complements: Event Sourcing (use together: event consumers in at-least-once delivery systems must apply events idempotently; the event store's offset tracking lets consumers detect and skip duplicate deliveries; idempotency is not optional in event-sourced architectures)
  • Tensions: Bulkhead (in tension when bulkheads drop requests rather than queuing them, which may cause clients to retry the dropped requests; if the bulkhead sheds load by rejecting requests with HTTP 429 and the client retries immediately, the load-shedding is negated; idempotent retries must be paired with client-side backoff to respect bulkhead pressure)
  • Topic Packs: distributed-systems, ansible, terraform
  • Case Studies: firmware-update-boot-loop (non-idempotent firmware update scripts that re-apply updates on each boot, triggering a loop; an idempotent check — "is version X already installed?" — would have prevented the loop)