Add the BG migration worker for the ci database
What does this MR do and why?
Related to #343047 (closed)
Adds a new worker to process background migrations that target the ci
database.
Since the worker enqueues jobs based on the database name, in a single-database setup all background migrations will continue to be executed by BackgroundMigrationWorker (the database name will always be main). This simplifies management of background migrations for self-managed customers, who won't operate decomposed databases.
The job won't be used yet on GitLab.com either, at least until we have migration tooling to run DML within the gitlab_schema. At that point, we could enable the new worker by updating https://gitlab.com/gitlab-org/gitlab/-/blob/a4e148198fe318e4caaab61efac727efa42e3d0b/lib/gitlab/database/migrations/background_migration_helpers.rb#L190 to use the current connection name of the migration.
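As a rough illustration of the routing described above, the following plain-Ruby sketch maps a database name to the worker that would receive its jobs. It is not the code in this MR: the lookup table, both helper methods, and the name CiBackgroundMigrationWorker are placeholders invented for the example. Only BackgroundMigrationWorker and the database names main and ci come from the description.

```ruby
# Hypothetical lookup table: database name => worker class name.
# 'CiBackgroundMigrationWorker' is a placeholder, not necessarily the real class name.
WORKER_FOR_DATABASE = {
  'main' => 'BackgroundMigrationWorker',
  'ci'   => 'CiBackgroundMigrationWorker'
}.freeze

# Returns the name of the worker that would receive jobs for the given database.
def worker_name_for(database_name)
  WORKER_FOR_DATABASE.fetch(database_name.to_s) do
    raise ArgumentError, "no background migration worker for '#{database_name}'"
  end
end

# Simplified assumption: a single-database setup only ever reports 'main'.
def configured_database_names(decomposed:)
  decomposed ? %w[main ci] : %w[main]
end

single = configured_database_names(decomposed: false).map { |db| worker_name_for(db) }
multi  = configured_database_names(decomposed: true).map { |db| worker_name_for(db) }

puts single.inspect
puts multi.inspect
```

In the single-database case every job resolves to BackgroundMigrationWorker, which is why nothing changes for installations that don't run a decomposed database.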
How to set up and validate locally
In a multi-database Rails console (for example, GITLAB_USE_MODEL_LOAD_BALANCING=true rails c):
- Create a test migration job:

      module Gitlab
        module BackgroundMigration
          class MyTestMigration < BaseJob
            def perform(start_id, stop_id, table_name)
              num_rows = connection.select_value("select count(*) from #{table_name}")

              puts "#{num_rows} rows in #{table_name} on #{connection.pool.db_config.name} database"

              mark_jobs_as_succeeded(start_id, stop_id, table_name)
            end

            private

            def mark_jobs_as_succeeded(*arguments)
              Gitlab::Database::BackgroundMigrationJob.mark_all_as_succeeded(self.class.name.demodulize, arguments)
            end
          end
        end
      end
- Schedule a job on both the main and ci databases:

      main_coordinator = Gitlab::BackgroundMigration.coordinator_for_database('main')
      ci_coordinator = Gitlab::BackgroundMigration.coordinator_for_database('ci')

      main_coordinator.perform_in(1.hour, 'MyTestMigration', [1, 100, 'projects'])
      Gitlab::Database::SharedModel.using_connection(Project.connection) do
        Gitlab::Database::BackgroundMigrationJob.create!(class_name: 'MyTestMigration', arguments: [1, 100, 'projects'])
      end

      ci_coordinator.perform_in(1.hour, 'MyTestMigration', [1, 100, 'ci_builds'])
      Gitlab::Database::SharedModel.using_connection(Ci::Build.connection) do
        Gitlab::Database::BackgroundMigrationJob.create!(class_name: 'MyTestMigration', arguments: [1, 100, 'ci_builds'])
      end
- Verify the scheduled set and tracking records:

      Sidekiq::ScheduledSet.new.select { |j| j.args.first == 'MyTestMigration' }

      select status, arguments from background_migration_jobs where class_name = 'MyTestMigration';

      -- on main
       status |       arguments
      --------+-----------------------
            0 | [1, 100, "projects"]

      -- on ci
       status |       arguments
      --------+-----------------------
            0 | [1, 100, "ci_builds"]
- Run the jobs:

      main_coordinator.steal('MyTestMigration') # should output something like "8 rows in projects on main database"
      ci_coordinator.steal('MyTestMigration') # should output something like "4 rows in ci_builds on ci database"
- Verify the status is 1 on both main and ci tracking records:

      select status from background_migration_jobs where class_name = 'MyTestMigration';
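To make the 0 to 1 status transition in the steps above concrete, here is a small in-memory model of the tracking records. It is a sketch under stated assumptions, not GitLab's implementation: TrackingTable and TrackingRecord are hypothetical stand-ins for each database's background_migration_jobs table, and the status names pending and succeeded are assumed labels for the values 0 and 1 seen in the queries.

```ruby
# Status values as they appear in the queries above: 0 before the job runs, 1 after.
# The names :pending and :succeeded are assumptions made for this illustration.
STATUSES = { pending: 0, succeeded: 1 }.freeze

TrackingRecord = Struct.new(:class_name, :arguments, :status)

# Hypothetical in-memory stand-in for one database's background_migration_jobs table.
class TrackingTable
  attr_reader :records

  def initialize
    @records = []
  end

  # New tracking records start out pending (status 0).
  def create!(class_name:, arguments:)
    record = TrackingRecord.new(class_name, arguments, STATUSES[:pending])
    @records << record
    record
  end

  # Flips matching pending records to succeeded and returns how many changed.
  def mark_all_as_succeeded(class_name, arguments)
    matching = @records.select do |r|
      r.class_name == class_name && r.arguments == arguments && r.status == STATUSES[:pending]
    end
    matching.each { |r| r.status = STATUSES[:succeeded] }
    matching.size
  end
end

# One table per database, mirroring the main and ci tracking records.
main_table = TrackingTable.new
ci_table = TrackingTable.new
main_table.create!(class_name: 'MyTestMigration', arguments: [1, 100, 'projects'])
ci_table.create!(class_name: 'MyTestMigration', arguments: [1, 100, 'ci_builds'])

# Simulate only the main job finishing.
updated = main_table.mark_all_as_succeeded('MyTestMigration', [1, 100, 'projects'])

puts main_table.records.map(&:status).inspect
puts ci_table.records.map(&:status).inspect
```

Because each database keeps its own tracking records, completing the job on main leaves the ci record untouched until the ci job has also run, which is why the final verification step checks both databases.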
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.