1. The “Payload Mismatch” Trap
This is the most frequent implementation bug. It occurs when a client reuses an Idempotency Key but changes the request parameters.
- The Scenario:
  - Client sends `POST /transfer { amount: 100 }` with Key: `uuid-1`.
  - Server processes it and saves the result.
  - Client (due to a bug or malice) sends `POST /transfer { amount: 9000 }` with Key: `uuid-1`.
- The Gotcha: A naive implementation just checks “Does `uuid-1` exist?” It sees “Yes, status: COMPLETED” and returns the saved success response from the $100 transfer.
  - Result: The client thinks they successfully transferred $9000, but only $100 moved.
- The Fix: You must store a hash (checksum) of the request body alongside the key. If the key exists but the hash doesn’t match the current request, reject it with a 422 Unprocessable Entity or 409 Conflict.
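The payload check can be sketched as follows. This is a minimal illustration, not a production design: the in-memory dict stands in for the idempotency table, and `fingerprint`, `handle_request`, and the `process` callback are hypothetical names. It also ignores concurrent requests for the same key.

```python
import hashlib
import json

# In-memory stand-in for the idempotency table (assumption):
# key -> (request_hash, saved_response)
_payload_store = {}

def fingerprint(body):
    """SHA-256 of the request body, with sorted keys so field order is irrelevant."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def handle_request(key, body, process):
    """Return (status_code, response) for a request carrying an idempotency key."""
    request_hash = fingerprint(body)
    record = _payload_store.get(key)
    if record is not None:
        saved_hash, saved_response = record
        if saved_hash != request_hash:
            # Same key, different payload: refuse instead of replaying.
            return 422, {"error": "idempotency key reused with a different payload"}
        # Same key, same payload: replay the saved response.
        return 200, saved_response
    response = process(body)
    _payload_store[key] = (request_hash, response)
    return 200, response
```

Hashing a canonical serialization rather than the raw bytes means two requests that differ only in field order count as the same payload. Whether that is desirable depends on the API.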
2. The “Burned Key” on Failure
Deciding what to do when the first attempt fails is tricky.
- The Scenario: Client sends a request. The database is temporarily down. The server catches the exception and records the key status as `FAILED`.
- The Gotcha: The client retries the request (as they should for a 500 error). The server sees the key exists with status `FAILED` and returns the error again, forever. You have effectively “burned” the key on a transient error.
- The Fix:
  - Transient Errors (Network/DB Connection): Do not save the key, or roll back the transaction entirely so the key is never persisted. Allow the retry to proceed as a fresh request.
  - Terminal Errors (Validation, Business Logic): Save the key as `FAILED`. If the client retries, they should get the same validation error.
3. Namespace Collisions (Data Leaks)
This is a critical security vulnerability.
- The Scenario: You rely solely on the `Idempotency-Key` header for uniqueness.
  - User A sends Key: `order-1`.
  - User B (malicious or accidental) sends Key: `order-1`.
- The Gotcha: The server sees `order-1` is completed and returns the cached response. User B just received User A’s order confirmation details, including potential PII.
- The Fix: Never use the Idempotency Key as the global primary key. The composite primary key must be `{user_id, idempotency_key}`. User B cannot access User A’s keys.
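A composite key might look like the sketch below, using an in-memory SQLite database purely for illustration. The table name, column names, and helper functions are assumptions. In a real service, `user_id` would have to come from the authenticated session, never from the request body.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE idempotency_keys (
        user_id         TEXT NOT NULL,
        idempotency_key TEXT NOT NULL,
        response        TEXT NOT NULL,
        PRIMARY KEY (user_id, idempotency_key)
    )
    """
)

def save_response(user_id, key, response):
    """Persist a response under the (user_id, key) pair."""
    conn.execute(
        "INSERT INTO idempotency_keys (user_id, idempotency_key, response) "
        "VALUES (?, ?, ?)",
        (user_id, key, response),
    )

def lookup_response(user_id, key):
    """Return the saved response for this user's key, or None if there is none."""
    row = conn.execute(
        "SELECT response FROM idempotency_keys "
        "WHERE user_id = ? AND idempotency_key = ?",
        (user_id, key),
    ).fetchone()
    return row[0] if row else None
```

Because every lookup filters on both columns, two users can pick the same key string without ever seeing each other's cached responses.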
4. The “Zombie Worker” (TTL Exhaustion)
This applies if you use Redis/Memcached locks instead of database transactions.
- The Scenario:
  - Worker A locks Key `X` with a 30-second TTL (Time-To-Live).
  - Worker A gets stuck in a Garbage Collection pause or slow network call for 35 seconds.
  - Redis expires the lock.
  - Worker B picks up the retry, locks Key `X`, and starts processing.
  - Worker A wakes up and finishes processing.
- The Gotcha: Both workers process the transaction. You have double-charged the customer despite having an “idempotency” lock.
- The Fix: Use a Fencing Token.
  - When acquiring a lock, increment a token (e.g., version 1, version 2).
  - When performing the final write/side-effect, check that the token hasn’t been superseded. If the database sees a write from Token 1 but Token 2 already exists, reject the write.
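The token check can be illustrated with a small in-memory model. `FencedStore` and its methods are hypothetical, and the sketch assumes the token counter and the final write live in the same store. In a real database the comparison and the write would have to be one atomic operation (for example, a conditional update), which this single-threaded model does not demonstrate.

```python
class FencedStore:
    """Toy stand-in for a database that issues and checks fencing tokens."""

    def __init__(self):
        self.current_token = {}  # key -> latest token issued for that key
        self.result = {}         # key -> committed result

    def acquire(self, key):
        """Issue the next token for this key. Each new lock holder gets a higher one."""
        token = self.current_token.get(key, 0) + 1
        self.current_token[key] = token
        return token

    def commit(self, key, token, result):
        """Apply the final write only if the caller's token is still the latest."""
        if self.current_token.get(key) != token:
            return False  # superseded: a newer holder exists, reject the stale write
        if key in self.result:
            return False  # already committed under this key
        self.result[key] = result
        return True
```

Replaying the scenario above: Worker A acquires token 1 and stalls, Worker B acquires token 2 after the lock expires, and Worker A's late `commit` with token 1 is rejected while Worker B's succeeds.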
5. Client-Side Key Rotation
Idempotency relies on the client behaving correctly.
- The Scenario: The client sends a request. The server processes it but the response times out (network cut). The client library sees a timeout.
- The Gotcha: The client code catches the timeout, generates a new UUID, and retries.
- Result: Since the key is new, the server treats it as a new request. Exactly-once processing is broken because the client failed to hold onto the original key.
- The Fix: This is a documentation and client-library issue. You must educate consumers that retries must reuse the same key.
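For client-library authors, the rule comes down to generating the key outside the retry loop. The sketch below assumes a hypothetical `send(payload, key)` callable that raises `TimeoutError` when the response is lost. Backoff and jitter are omitted for brevity.

```python
import uuid

def send_with_retries(send, payload, max_attempts=3):
    """Call send(payload, key) until it succeeds, reusing one idempotency key."""
    # The key is created once, before the first attempt.
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(payload, key)
        except TimeoutError as exc:
            # Retry with the SAME key so the server can deduplicate.
            last_error = exc
    raise last_error
```

The bug described above corresponds to moving the `uuid.uuid4()` call inside the loop.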
6. The “Resource Deleted” Race
Returning a cached response blindly can be confusing if the world changed in the meantime.
- The Scenario:
  - User creates `Order-1` (idempotent). Success.
  - User deletes `Order-1`.
  - User retries the creation of `Order-1` (maybe an old browser tab refreshed).
- The Gotcha: The idempotency system sees the key `Order-1` was “successfully created” in the past and returns `200 OK` with the order details. The user thinks the order is back, but it doesn’t actually exist in the `orders` table anymore.
- The Fix: This is a philosophical design choice.
  - Strict Idempotency: Return the original success (technically correct: “At time T, this succeeded”).
  - State-Aware Idempotency: Check if the resulting resource still exists. If not, return `404` or `410 Gone`. (This is harder to implement as it breaks the separation of concerns).
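The two policies differ by a single existence check at replay time, as the sketch below tries to show. The `replay` function, the shape of `saved`, and the `resource_exists` callback are all assumptions made for this example.

```python
def replay(key, saved, resource_exists, state_aware=False):
    """Return (status_code, body) for a repeated request.

    saved:           dict mapping key -> (resource_id, original_response_body)
    resource_exists: callable taking a resource_id and returning True or False
    state_aware:     False for strict idempotency, True to verify the resource
    """
    resource_id, original_body = saved[key]
    if state_aware and not resource_exists(resource_id):
        # The resource was removed after the original success.
        return 410, {"error": resource_id + " no longer exists"}
    # Strict policy, or the resource is still there: replay the original success.
    return 200, original_body
```

The `resource_exists` callback is where the coupling shows up: the idempotency layer now has to know how to query the `orders` table, which is the separation-of-concerns cost mentioned above.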