Most articles about idempotency stop at "use an idempotency key." That's necessary but not sufficient — the interesting bugs show up in what happens between the client sending a request and getting a response back.
The problem in one sentence
A client sends a payment request, the network times out before the response arrives, the client retries — and now you have two requests for one payment, with no way to tell if the first one actually succeeded.
Idempotency keys are a contract, not a checkbox
The key itself is easy: generate a UUID client-side, attach it as a header, and store the result under that key server-side. The part people skip is defining the key lifecycle:
- How long do you keep a key's result cached? Long enough to cover realistic retry windows, short enough not to bloat storage forever.
- What happens if the same key arrives with a different payload? That's not a retry, that's a bug or an attack — reject it.
- Do you return the cached response, or just an "already processed" signal? For payments, you almost always want the original response back verbatim.
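
To make those three lifecycle decisions concrete, here is a minimal in-memory sketch in Python. Everything named in it (`IdempotencyStore`, `KeyReuseError`, the `process` callback, the 24-hour default TTL) is an assumption for illustration, not a real library or a recommended retention period. It also ignores concurrent in-flight requests for the same key, which a production store would have to handle.

```python
import hashlib
import json
import time
from dataclasses import dataclass


class KeyReuseError(Exception):
    """The same idempotency key arrived with a different payload."""


@dataclass
class StoredResult:
    payload_hash: str
    response: dict
    expires_at: float


class IdempotencyStore:
    """Hypothetical in-memory store; a real one would live in a database."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self._ttl = ttl_seconds  # assumed retention window, tune to your retry window
        self._clock = clock
        self._results = {}

    @staticmethod
    def _fingerprint(payload):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key, payload, process):
        now = self._clock()
        fingerprint = self._fingerprint(payload)
        stored = self._results.get(key)

        # Expired entries are dropped so storage does not grow forever.
        if stored is not None and stored.expires_at <= now:
            del self._results[key]
            stored = None

        if stored is not None:
            # Same key, different payload: not a retry, so reject it.
            if stored.payload_hash != fingerprint:
                raise KeyReuseError(f"key {key!r} reused with a different payload")
            # A genuine retry gets the original response back unchanged.
            return stored.response

        response = process(payload)
        self._results[key] = StoredResult(fingerprint, response, now + self._ttl)
        return response
```

Calling `handle` twice with the same key and payload runs `process` once and returns the stored response the second time. The same key with a different payload raises `KeyReuseError`.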
The client-side state machine that actually matters
The idempotency key solves the server side. The half people miss is the client:
- Generate the key once, before the first attempt — not per retry.
- Track request state locally: pending → success | failed | unknown.
- On unknown (timeout, connection drop), retry with the same key. Never generate a new one for a retry.
- Only give up and surface an error to the user after retries are exhausted, and even then keep the key around so a manual retry doesn't create a duplicate.
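
The client rules above can be sketched as a small Python class. This is an illustration under assumptions: `PaymentAttempt`, `PaymentRejected`, and the `send(key, payload)` callback are hypothetical names, and the sketch treats `TimeoutError` and `ConnectionError` as the "unknown" outcomes. A real client would also back off between attempts, which is omitted here.

```python
import uuid
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILED = "failed"
    UNKNOWN = "unknown"


class PaymentRejected(Exception):
    """Hypothetical: the server gave a definitive 'no'."""


class PaymentAttempt:
    def __init__(self, payload, send, max_retries=3):
        # The key is generated once, before the first attempt, and it
        # lives as long as this object does.
        self.key = str(uuid.uuid4())
        self.payload = payload
        self.state = State.PENDING
        self.response = None
        self._send = send  # assumed transport: send(key, payload) -> response
        self._max_retries = max_retries

    def submit(self):
        if self.state is State.SUCCESS:
            return self.response
        if self.state is State.FAILED:
            raise PaymentRejected("request was already rejected")

        for _ in range(1 + self._max_retries):
            try:
                self.response = self._send(self.key, self.payload)
            except (TimeoutError, ConnectionError):
                # We cannot tell whether the server processed the request,
                # so retry with the same key.
                self.state = State.UNKNOWN
                continue
            except PaymentRejected:
                self.state = State.FAILED
                raise
            self.state = State.SUCCESS
            return self.response

        # Retries exhausted: surface the error but keep self.key, so a
        # manual retry (calling submit() again) reuses it.
        raise TimeoutError("outcome unknown after retries")
```

The design choice that matters is that the key is a field of the attempt object, not a local variable in the send path. Any later call to `submit()`, automatic or user-triggered, sends the same key.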
Where this bit us
The failure mode I actually saw wasn't the server — it was a mobile client regenerating a fresh UUID every time the retry button was tapped, because the "generate key" call sat inside the retry function instead of outside it. Idempotency keys only work if the key survives the retry, not just the request.
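
For illustration, the shape of that bug looks roughly like this in Python. It is a reconstruction of the pattern, not the actual mobile code, and `make_buggy_retry`, `make_fixed_retry`, and `send` are hypothetical names.

```python
import uuid


def make_buggy_retry(send, payload):
    """Returns a retry-button handler with the bug described above."""
    def on_retry_tapped():
        key = str(uuid.uuid4())  # BUG: a fresh key on every tap
        return send(key, payload)
    return on_retry_tapped


def make_fixed_retry(send, payload):
    """Returns a handler that reuses one key across every tap."""
    key = str(uuid.uuid4())  # generated once, outside the retry path
    def on_retry_tapped():
        return send(key, payload)
    return on_retry_tapped
```

With the buggy handler, two taps reach the server as two unrelated requests, so no amount of server-side deduplication can help. With the fixed handler, both taps carry the same key.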
Takeaway
Idempotency isn't a header you add — it's an agreement between client and server about what "the same request" means, and most bugs live in the client forgetting its own definition.