Allow Sidekiq workers to be disabled by default by configuration
What does this MR do and why?
Until now, we enabled all workers every minute when a node was connected as a primary or non-geo node; with this MR we now only enable the needed geo workers. This results in the following two effects:
- we can disable the offline cloud license provision worker for SaaS per configuration
- manually disabled workers stay disabled
We also fix issues like Disabling sidekiq scheduled cron job worker get... (#454972). To achieve these goals, we changed the behavior of the cron manager in the following way:
Jobs that get disabled
Mode \ State | Without activated feature flag | With activated feature flag |
---|---|---|
Primary | secondary geo jobs | secondary geo jobs |
Secondary | all jobs except geo jobs and secondary jobs | all jobs except geo jobs and secondary jobs |
Non-geo | all geo jobs | all geo jobs |
Jobs that get enabled
Mode \ State | Without activated feature flag | With activated feature flag |
---|---|---|
Primary | all jobs except secondary jobs | all general geo jobs and primary jobs (this is a change; we don't enable non-geo workers any longer) |
Secondary | all general geo jobs and secondary jobs | all general geo jobs and secondary jobs |
Non-geo | all jobs except geo jobs | all jobs mentioned in the cron manager (this is a change; we don't enable non-geo workers any longer) |
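The Primary and Secondary rows of the table above can be sketched as a small lookup. This is a minimal illustration, not the cron manager's actual code: the group names (`non_geo`, `geo_common`, `geo_primary`, `geo_secondary`), the hash-based job representation, and the helper names are all hypothetical, and the Non-geo row is deliberately left out.

```ruby
# Hypothetical job groups; the real cron manager works on concrete job names.
def job_groups
  %i[non_geo geo_common geo_primary geo_secondary]
end

# Groups that get enabled, following the Primary and Secondary rows above.
def groups_to_enable(mode:, flag_enabled:)
  case mode
  when :primary
    flag_enabled ? %i[geo_common geo_primary] : job_groups - %i[geo_secondary]
  when :secondary
    %i[geo_common geo_secondary]
  else
    raise ArgumentError, "mode not covered by this sketch: #{mode}"
  end
end

# Enables matching jobs and leaves the status of all other jobs untouched.
def enable_jobs!(jobs, mode:, flag_enabled:)
  groups = groups_to_enable(mode: mode, flag_enabled: flag_enabled)
  jobs.each { |job| job[:enabled] = true if groups.include?(job[:group]) }
end

# A manually disabled non-geo worker on a primary node (hypothetical name).
jobs = [{ name: "example_cleanup_worker", group: :non_geo, enabled: false }]
enable_jobs!(jobs, mode: :primary, flag_enabled: true)
puts jobs.first[:enabled]
```

With the flag enabled, the non-geo job is not in the enabled groups, so its manually set status survives. With `flag_enabled: false`, the same call would switch it back on.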
We activate all workers during `rake set_secondary_as_primary` and then reinitialize the cron workers with the default configuration, so that we can use any status attribute defined in the configuration.
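The promotion flow described above can be sketched in two steps: enable everything, then let an explicit `status` from the configuration win. This is a rough sketch under assumptions: the worker names and the configuration hash are hypothetical, and the rule that a job without a `status` entry keeps its current state is an assumption rather than something this MR states.

```ruby
# Hypothetical cron configuration; only the optional "status" key matters here.
def example_cron_config
  {
    "example_cleanup_worker" => { "cron" => "0 * * * *" },
    "example_license_worker" => { "cron" => "0 0 * * *", "status" => "disabled" }
  }
end

# Step 1: activate every known worker.
def activate_all(statuses)
  statuses.transform_values { "enabled" }
end

# Step 2: reinitialize from configuration. An explicit status overrides the
# current one; jobs without a status keep their state (assumption).
def reinitialize_from_config(statuses, config)
  statuses.to_h do |name, current|
    [name, config.dig(name, "status") || current]
  end
end

def promote(statuses, config)
  reinitialize_from_config(activate_all(statuses), config)
end

# On a secondary node, both hypothetical workers start out disabled.
before = {
  "example_cleanup_worker" => "disabled",
  "example_license_worker" => "disabled"
}
puts promote(before, example_cron_config)
```

After promotion, the worker without a configured status ends up enabled, while the one configured with `"status" => "disabled"` stays off. That is the mechanism that allows a worker to be disabled by default through configuration.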
Rollout of the introduced feature flag: https://gitlab.com/gitlab-org/gitlab/-/issues/502865
Cleanup of the introduced feature flag: https://gitlab.com/gitlab-org/gitlab/-/issues/502867
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
How to set up and validate locally
Promotion from secondary to primary geo node
Feature flag: stop_bulk_sidekiq_job_activation
Fast testing
- Disable a Sidekiq worker under http://localhost:3000/admin/background_jobs
- Execute `Gitlab::SidekiqConfig::CronJobInitializer.execute` with the feature flag disabled and check that the disabled Sidekiq worker got enabled
- Execute `Gitlab::SidekiqConfig::CronJobInitializer.execute` with the feature flag enabled and check that the disabled Sidekiq worker stayed disabled
Extended testing
The following validation instructions cover everything needed to test the logic added to the promotion Rake task. You can also follow the process here to promote a secondary node to a primary node.
- Create an offline license at Zuora and download the license file.
- Create an environment variable for the license with `export GITLAB_LICENSE_KEY=$(cat /path/to/your/premium.gitlab-license)`
- Set up two geo nodes with the following command: `curl "https://gitlab.com/gitlab-org/gitlab-development-kit/-/raw/main/support/geo-install" | bash`
- Go to the secondary geo node on http://localhost:3001/admin/background_jobs to check that all the non-geo workers are deactivated.
- Promote the secondary instance
  - Change to the secondary node with `cd gdk2`
  - Promote the database: `pg_ctl promote -D /<path>/gdk2/postgresql/data`
  - Promote the instance: `rake set_secondary_as_primary` executed in the GitLab directory
  - Check out the branch of this MR
- Go to http://localhost:3001/admin/background_jobs to check that all the non-geo workers got activated.
Resolves https://gitlab.com/gitlab-org/gitlab/-/issues/488887