ScheduleSettingChangedUpdateWorker performance issue

The newly introduced ScheduleSettingChangedUpdateWorker is exhibiting performance issues, as shown in these logs. These issues are suspected to be contributing to the inc-2131-primary-db-saturation-causing-sidekiq-backlogging incident in production.

To mitigate this, we should:

  • Reduce the input size for ScheduleSettingChangedUpdateWorker. Currently, it processes all project IDs passed from SetGroupSecretPushProtectionService, which can number in the thousands. In most cases, a much smaller subset is likely sufficient.
  • Add a scheduling delay to stagger execution. This may help avoid repeated updates to the same namespace-level records, particularly for top-level groups, when counter propagation takes place.
  • Use defer_on_database_health_signal so that execution is deferred when database health signals indicate the DB is under stress. This could help us avoid adding load to an already saturated database.
  • Wrap the worker logic in a dedicated feature flag to allow gradually bringing this worker back while monitoring its performance impact.
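
As a rough illustration of the first, second, and fourth points, here is a minimal sketch in plain Ruby. It is not the actual worker code: `staggered_batches`, `schedule_updates`, the batch size, the delay step, and the `scheduler` and `flag_enabled` parameters are all hypothetical names and values chosen for the example. In the real worker, the scheduler would presumably be a delayed Sidekiq enqueue and the flag check would go through the feature flag framework. The `defer_on_database_health_signal` declaration lives on the worker class and is not sketched here.

```ruby
# Hypothetical values; the right numbers would need to be tuned in production.
BATCH_SIZE = 100         # assumed maximum number of project IDs per job
DELAY_STEP_SECONDS = 30  # assumed gap between consecutive jobs

# Splits the (deduplicated) project IDs into batches and assigns each batch
# an increasing delay, so jobs do not all hit the same records at once.
def staggered_batches(project_ids, batch_size: BATCH_SIZE, delay_step: DELAY_STEP_SECONDS)
  project_ids.uniq.each_slice(batch_size).each_with_index.map do |batch, index|
    { delay: index * delay_step, project_ids: batch }
  end
end

# `flag_enabled` stands in for a feature flag check, and `scheduler` stands in
# for a delayed enqueue. Both are assumptions made for this sketch.
def schedule_updates(project_ids, flag_enabled:, scheduler:)
  return [] unless flag_enabled

  staggered_batches(project_ids).each do |job|
    scheduler.call(job[:delay], job[:project_ids])
  end
end

# Usage: print what would be scheduled for 250 project IDs.
printer = ->(delay, ids) { puts "in #{delay}s: #{ids.size} project IDs" }
schedule_updates((1..250).to_a, flag_enabled: true, scheduler: printer)
```

Keeping the flag check at the scheduling entry point means that turning the flag off stops new jobs from being enqueued at all, which is one possible way to bring the worker back gradually.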
Edited by Gal Katz