blundergoat
10 hours ago
We treat webhooks as at-least-once delivery over an unreliable transport and design for duplicates and out-of-order events.
A few rules that have saved us:
- Persist before responding. Never process inline. Write payload to DB, return 200 fast.
- Idempotency key required. Use either the provider's event ID or a hash of the payload.
- Async worker processes from queue. Exponential backoff + max attempts.
- Dead letter queue + dashboard. Humans need visibility.
- Alert on backlog growth, not single failures. One failure is noise. A growing retry queue is signal.
- Relying on provider retries alone has bitten us more than once.
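For anyone who wants to see the shape of this, here is a minimal sketch of the rules above, assuming SQLite as both the store and the queue. The table name, status values, `MAX_ATTEMPTS`, `BASE_DELAY_SECONDS`, and every function name are hypothetical choices for illustration, not anything from a particular provider or framework.

```python
import hashlib
import sqlite3

MAX_ATTEMPTS = 5          # assumed cap before an event is dead-lettered
BASE_DELAY_SECONDS = 2.0  # assumed base delay for exponential backoff


def open_store(path=":memory:"):
    """Create the (hypothetical) events table; the primary key enforces dedup."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS webhook_events (
               idempotency_key TEXT PRIMARY KEY,
               payload BLOB NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending',
               attempts INTEGER NOT NULL DEFAULT 0,
               next_attempt_at REAL NOT NULL DEFAULT 0
           )"""
    )
    return db


def idempotency_key(payload, event_id=None):
    """Prefer the provider's event ID; otherwise hash the raw payload bytes."""
    if event_id:
        return f"id:{event_id}"
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def receive(db, payload, event_id=None):
    """Persist, then acknowledge. No business logic runs in the request path."""
    key = idempotency_key(payload, event_id)
    with db:  # commits on success
        db.execute(
            "INSERT OR IGNORE INTO webhook_events (idempotency_key, payload) "
            "VALUES (?, ?)",
            (key, payload),
        )
    return 200  # duplicates are acknowledged too, they just don't insert


def backoff_delay(attempts):
    """Delay after the Nth failed attempt: base, 2x base, 4x base, ..."""
    return BASE_DELAY_SECONDS * 2 ** (attempts - 1)


def process_due(db, handler, now):
    """Worker pass: run the handler on every pending event that is due."""
    rows = db.execute(
        "SELECT idempotency_key, payload, attempts FROM webhook_events "
        "WHERE status = 'pending' AND next_attempt_at <= ?",
        (now,),
    ).fetchall()
    for key, payload, attempts in rows:
        attempts += 1
        try:
            handler(payload)
            update = ("done", attempts, now)
        except Exception:
            if attempts >= MAX_ATTEMPTS:
                update = ("dead", attempts, now)  # dead letter: needs a human
            else:
                update = ("pending", attempts, now + backoff_delay(attempts))
        with db:
            db.execute(
                "UPDATE webhook_events SET status = ?, attempts = ?, "
                "next_attempt_at = ? WHERE idempotency_key = ?",
                (*update, key),
            )


def backlog(db):
    """Pending count. Alert on this trending upward, not on single failures."""
    return db.execute(
        "SELECT COUNT(*) FROM webhook_events WHERE status = 'pending'"
    ).fetchone()[0]
```

In this sketch the dead letter queue is just the rows with status 'dead', which a dashboard could list. One caveat on the hash fallback: two genuinely distinct events with byte-identical payloads would collapse into one, so the provider's event ID is the safer key when it exists.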
GoatPerfect
9 hours ago
Thank you so much for the tips! I was feeling nervous about relying on provider retries as well. I especially like the idea of alerting on backlog growth. There's nothing I hate more than a bunch of emails and notifications!
chickensong
8 hours ago
This was a nice goat exchange