Automated retry of failed webhooks
This issue was created out of the discussion on this Community contribution MR: !72265 (closed).
Proposal
When a webhook fails, it is currently not retried. For some failure reasons (see !72265 (comment 723881483)), a later attempt might succeed. In these situations, an automated retry would give greater assurance of webhook delivery.
Open questions
There were a number of questions raised in the discussions on !72265 (closed).
Should the retries be handled by Sidekiq, by raising an exception? If so, what might the effect be on our Error Budgets, given how frequently webhooks fire? We might want to measure what the effect could be first.
Or should the retry be handled by our own application logic?
How would webhook retries interact with the auto-disabler feature? For example, suppose a webhook is triggered for a single event and, for whatever reason, fails to POST to a remote server for the whole duration of the retry attempts (a few minutes). The repeated failures would cause the webhook to be disabled. Without retries, this wouldn't happen as long as the webhook wasn't triggered again.
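To make the auto-disabler question concrete, here is a minimal sketch of how retries could count toward disabling a hook. It is a toy model, not the actual implementation: the `Hook` class, the `FAILURE_THRESHOLD` value of 3, the `deliver` helper, and the `count_each_attempt` flag are all hypothetical names and numbers chosen for illustration. The block passed to `deliver` stands in for the HTTP POST, and delays between attempts are left out.

```ruby
# Hypothetical model of a webhook with an auto-disabler.
# The threshold is an assumed value, not the real one.
class Hook
  FAILURE_THRESHOLD = 3

  attr_reader :failures

  def initialize
    @failures = 0
  end

  def record_failure
    @failures += 1
  end

  def disabled?
    @failures >= FAILURE_THRESHOLD
  end
end

# Delivers a single event, trying up to max_attempts times.
# The block stands in for the HTTP POST and returns true on success.
# count_each_attempt: true  -> every failed attempt counts as a failure
# count_each_attempt: false -> the event counts once, after all attempts fail
def deliver(hook, max_attempts:, count_each_attempt:)
  max_attempts.times do
    return true if yield

    hook.record_failure if count_each_attempt
  end
  hook.record_failure unless count_each_attempt
  false
end

# One event, remote server down for the whole retry window.
per_attempt = Hook.new
deliver(per_attempt, max_attempts: 4, count_each_attempt: true) { false }

per_event = Hook.new
deliver(per_event, max_attempts: 4, count_each_attempt: false) { false }

puts per_attempt.disabled? # one event was enough to disable the hook
puts per_event.disabled?   # one recorded failure, hook stays enabled
```

In this model, counting every attempt lets a single event disable the hook, while counting once per event keeps the behaviour closer to what happens today without retries. Which of the two (if either) is appropriate is exactly the open question above.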