fix(notifications): Replace deprecated retryingSink with backoffSink and add backward compatibility
What does this MR do?
This MR completes the migration from the deprecated retryingSink to the backoffSink for handling notification retries, while maintaining full backward compatibility for existing configurations using the threshold parameter.
It removes retryingSink, which used fixed backoff periods with unlimited retries. That combination could cause head-of-the-queue blocking and, in turn, either container registry resource exhaustion or dropped events. Instead, the MR standardizes on backoffSink: all notification retries now use exponential backoff with a configurable maximum number of retries, which makes retry behavior more predictable and efficient.
To avoid a breaking change, the translateBackoffParams() function translates existing retryingSink values into equivalent maxretries values.
Translation logic:
- Calculates a time window of 120 * backoff_duration seconds (based on GitLab's production configuration)
- Simulates exponential backoff to determine how many retries fit within this window
- Ensures the translated value is never less than the original threshold to maintain minimum retry guarantees
I am marking this as a fix because unlimited retries in the old retryingSink can lead to stability issues. The configuration translation is a small change in behaviour, but it improves container-registry stability and resilience. Customers who want to regain full control of notification retries should update their configuration to use maxretries at their convenience. Those who do not need it can stay with the default/translated behaviour, which should be fine for most cases.
Related to #1244 (closed)
Related to #1243 (closed)
Author checklist
- Assign one of conventional-commit prefixes to the MR.
  - fix: Indicates a bug fix, triggers a patch release.
  - feat: Signals the introduction of a new feature, triggers a minor release.
  - perf: Focuses on performance improvements that don't introduce new features or fix bugs, triggers a patch release.
  - docs: Updates or changes to documentation. Does not trigger a release.
  - style: Changes that do not affect the code's functionality. Does not trigger a release.
  - refactor: Modifications to the code that do not fix bugs or add features but improve code structure or readability. Does not trigger a release.
  - test: Changes related to adding or modifying tests. Does not trigger a release.
  - chore: Routine tasks that don't affect the application, such as updating build processes, package manager configs, etc. Does not trigger a release.
  - build: Changes that affect the build system or external dependencies. May trigger a release.
  - ci: Modifications to continuous integration configuration files and scripts. Does not trigger a release.
  - revert: Reverts a previous commit. It could result in a patch, minor, or major release.
- MR contains database changes including schema/background migrations:
  - Do not include code that depends on the schema migrations in the same commit. Split the MR into two or more.
  - Do not include code that depends on background migrations in the same release.
  - Manually run up and down migrations in a postgres.ai production database clone and add a link for the query plan(s) to the MR.
  - If adding new schema migrations make sure the REGISTRY_SELF_MANAGED_RELEASE_VERSION CI variable in migrate.yml is pointing to the latest GitLab self-managed released registry version. Find the correct registry version here. Make sure to select the branch of the latest GitLab release.
  - If adding new queries, extract a query plan from postgres.ai and post the link here. If changing existing queries, also extract a query plan for the current version for comparison.
    - I do not have access to postgres.ai and have made a comment on this MR asking for these to be run on my behalf.
  - If adding new background migration, follow the guide for performance testing new background migrations and add a report/summary to the MR with your analysis.
- Change contains a breaking change - apply the breaking change label.
- Change is considered high risk - apply the label high-risk-change
- I created or linked to an existing issue for every added or updated TODO, BUG, FIXME or OPTIMIZE prefixed comment
- Changes cannot be rolled back
  - Apply the label cannot-rollback.
  - Add a section to the MR description that includes the following details:
    - The reasoning behind why a release containing the presented MR can not be rolled back (e.g. schema migrations or changes to the FS structure)
    - Detailed steps to revert/disable a feature introduced by the same change where a migration cannot be rolled back. (note: ideally MRs containing schema migrations should not contain feature changes.)
  - Ensure this MR does not add code that depends on these changes that cannot be rolled back.
Documentation/resources
Reviewer checklist
- Ensure the commit and MR title are still accurate.
- If the change contains a breaking change, verify the breaking change label.
- If the change is considered high risk, verify the label high-risk-change
- Identify if the change can be rolled back safely. (note: all other reasons for not being able to rollback will be sufficiently captured by major version changes).