Parallelize metadata background migration on partition 100
What does this MR do and why?
This MR splits the background migration running on gitlab_partitions_dynamic.ci_builds into four separate background migrations that the workers can run in parallel. It works around a framework limitation (only one active migration per table) by creating four database views, each covering a different range of data on the same table. This also changes the reported progress: the estimate should go from a single migration that completes in 7 months to completion in 4 months.
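As a rough illustration of the range split, the sketch below builds one CREATE VIEW statement per id range. The boundaries and view names are taken from the "After" output in this description. The helper name view_statements, the SELECT * body, and the choice of half-open ranges are assumptions made for illustration. The actual view definitions are in the MR diff and may differ.

```ruby
# Hypothetical helper, not the MR's implementation: builds one
# CREATE VIEW statement per id range.
SOURCE_TABLE = 'gitlab_partitions_dynamic.ci_builds'
VIEW_PREFIX  = 'gitlab_partitions_dynamic.ci_builds_views_100'

# Boundaries copied from min_value/max_value in the "After" output.
BOUNDARIES = [4, 1_500_384_395, 2_951_960_143, 4_355_055_910, 12_168_556_334].freeze

def view_statements(boundaries, source: SOURCE_TABLE, prefix: VIEW_PREFIX)
  last_index = boundaries.size - 2

  boundaries.each_cons(2).each_with_index.map do |(lower, upper), index|
    # Assumption: half-open ranges so adjacent views never share a row;
    # only the last view includes its upper bound.
    upper_op = index == last_index ? '<=' : '<'

    <<~SQL.strip
      CREATE VIEW #{prefix}_#{index + 1} AS
        SELECT * FROM #{source}
        WHERE id >= #{lower} AND id #{upper_op} #{upper};
    SQL
  end
end

puts view_statements(BOUNDARIES)
```

Because each view has its own table_name, the framework can treat each one as a separate migration target, which is what lets several migrations be active at once.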
Risks: Because we use views, the migration will not be throttled by the autovacuum signal, so it could cause bloat on the builds table. We have disabled the autovacuum signal on this table before, as a mitigation during a bigint column conversion, and it didn't cause any problems. The WAL signal is still running, so it will throttle the migration if needed.
Pros: Because we're not throttled by the vacuum, it should finish faster, but not in under 3 months.
We have 4 workers that are capable of running migrations and with this split we'll utilize all of them.
In the last week, 9% of the time was spent on hold: https://log.gprd.gitlab.net/app/r/s/Kk6zJ
References
The migration was added in !208674 (merged) and improved in !214248 (merged) to batch-insert job definitions. The sub batch size was bumped in !221934 (merged).
Screenshots or screen recordings
Before
gitlabhq_dblab=# select * from batched_background_migrations where id = 3000518;
-[ RECORD 1 ]------------+------------------------------------
id | 3000518
created_at | 2025-11-21 07:10:56.168544+00
updated_at | 2026-01-30 00:42:36.243259+00
min_value | 4
max_value | 12168556334
batch_size | 27167
sub_batch_size | 100
interval | 120
status | 1
job_class_name | MoveCiBuildsMetadata
batch_class_name | PrimaryKeyBatchingStrategy
table_name | gitlab_partitions_dynamic.ci_builds
column_name | id
job_arguments | ["partition_id", [100]]
total_tuple_count | 4774979600
pause_ms | 100
max_batch_size |
started_at | 2026-01-09 12:02:51.701006+00
on_hold_until | 2026-01-29 18:39:55.715401+00
gitlab_schema | gitlab_ci
finished_at |
queued_migration_version |
min_cursor |
max_cursor |
After
gitlabhq_dblab=# SELECT "batched_background_migrations".* FROM "batched_background_migrations" WHERE "batched_background_migrations"."table_name" IN ('gitlab_partitions_dynamic.ci_builds_views_100_1', 'gitlab_partitions_dynamic.ci_builds_views_100_2', 'gitlab_partitions_dynamic.ci_builds_views_100_3', 'gitlab_partitions_dynamic.ci_builds_views_100_4') ORDER BY "batched_background_migrations"."id" ASC;
id | created_at | updated_at | min_value | max_value | batch_size | sub_batch_size | interval | status | job_class_name | batch_class_name | table_name | column_name | job_arguments | total_tuple_count | pause_ms | max_batch_size | started_at | on_hold_until | gitlab_schema | finished_at | queued_migration_version | min_cursor | max_cursor
---------+-------------------------------+-------------------------------+------------+-------------+------------+----------------+----------+--------+----------------------+----------------------------+-------------------------------------------------+-------------+-------------------------+-------------------+----------+----------------+-------------------------------+-------------------------------+---------------+-------------+--------------------------+------------+------------
3000518 | 2025-11-21 07:10:56.168544+00 | 2026-02-06 08:42:28.6303+00 | 4 | 1500384395 | 25285 | 100 | 120 | 1 | MoveCiBuildsMetadata | PrimaryKeyBatchingStrategy | gitlab_partitions_dynamic.ci_builds_views_100_1 | id | ["partition_id", [100]] | 1193744900 | 100 | | 2026-01-09 12:02:51.701006+00 | 2026-02-06 08:52:28.581265+00 | gitlab_ci | | | |
3000559 | 2026-02-06 15:12:19.056818+00 | 2026-02-06 15:12:19.056818+00 | 1500384395 | 2951960143 | 1000 | 250 | 120 | 1 | MoveCiBuildsMetadata | PrimaryKeyBatchingStrategy | gitlab_partitions_dynamic.ci_builds_views_100_2 | id | ["partition_id", [100]] | 1193744900 | 100 | | 2026-02-06 15:12:18.856577+00 | | gitlab_ci | | | |
3000560 | 2026-02-06 15:12:20.493337+00 | 2026-02-06 15:12:20.493337+00 | 2951960143 | 4355055910 | 1000 | 250 | 120 | 1 | MoveCiBuildsMetadata | PrimaryKeyBatchingStrategy | gitlab_partitions_dynamic.ci_builds_views_100_3 | id | ["partition_id", [100]] | 1193744900 | 100 | | 2026-02-06 15:12:20.288676+00 | | gitlab_ci | | | |
3000561 | 2026-02-06 15:12:21.896789+00 | 2026-02-06 15:12:21.896789+00 | 4355055910 | 12168556334 | 1000 | 250 | 120 | 1 | MoveCiBuildsMetadata | PrimaryKeyBatchingStrategy | gitlab_partitions_dynamic.ci_builds_views_100_4 | id | ["partition_id", [100]] | 1193744900 | 100 | | 2026-02-06 15:12:21.702796+00 | | gitlab_ci | | | |
(4 rows)
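The numbers in the two outputs can be cross-checked with a little arithmetic. The sketch below uses only values shown above. The constant names are made up for illustration. It confirms that each view's total_tuple_count is exactly one quarter of the original, and it shows that the id ranges are far from equal in width, which suggests (an assumption, not something stated here) that the boundaries were picked to balance row counts rather than id spans.

```ruby
# Values copied from the "Before" and "After" outputs.
ORIGINAL_TUPLE_COUNT = 4_774_979_600

VIEW_RANGES = {
  'ci_builds_views_100_1' => [4, 1_500_384_395],
  'ci_builds_views_100_2' => [1_500_384_395, 2_951_960_143],
  'ci_builds_views_100_3' => [2_951_960_143, 4_355_055_910],
  'ci_builds_views_100_4' => [4_355_055_910, 12_168_556_334]
}.freeze

# The original estimate divided evenly across the four views.
PER_VIEW_TUPLES = ORIGINAL_TUPLE_COUNT / VIEW_RANGES.size # => 1193744900

# Width of each view's id range (max_value - min_value).
ID_SPANS = VIEW_RANGES.transform_values { |(min, max)| max - min }.freeze

puts "tuples per view: #{PER_VIEW_TUPLES}"
ID_SPANS.each { |view, span| puts "#{view}: id span #{span}" }
```

The fourth view covers about 7.8 billion ids while the other three cover roughly 1.4 to 1.5 billion each, even though all four report the same total_tuple_count.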
How to set up and validate locally
- bin/rails g post_deployment_migration FinalizeBuildsMetadataMig
- Add this as contents:
class FinalizeBuildsMetadataMig < Gitlab::Database::Migration[2.3]
  milestone '18.9'

  restrict_gitlab_migration gitlab_schema: :gitlab_ci
  disable_ddl_transaction!

  def up
    finalize_batched_background_migration(
      job_class_name: 'MoveCiBuildsMetadata',
      table_name: 'gitlab_partitions_dynamic.ci_builds_views_100_2',
      column_name: 'id',
      job_arguments: ['partition_id', [100]]
    )
  end

  def down
    # no-op
  end
end
- Run it
  GITLAB_SIMULATE_SAAS=1 DISABLE_POSTGRES_PARTITION_CREATION_ON_STARTUP=1 pgai use -o ci -- bin/rails db:migrate:ci
- Check the log files, the migration should be running
- Use ps ax | grep migrate to find the pid and kill -9 it to stop it, or delete the clone instead.
Chatops commands
For verifying status and progress: /chatops run batched_background_migrations list --job-class-name=MoveCiBuildsMetadata --database=ci
To pause a specific migration: /chatops run batched_background_migrations pause 3000519 --database=ci
To resume a specific migration: /chatops run batched_background_migrations resume 3000519 --database=ci
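With several migrations running in parallel, pausing or resuming them means repeating the command once per id. The sketch below generates those commands using the command format shown above. The helper name chatops_commands is hypothetical, and the ids are the ones from the "After" output on the dblab clone, so the ids in production may differ.

```ruby
# Ids copied from the "After" output (dblab clone); production ids may differ.
MIGRATION_IDS = [3000518, 3000559, 3000560, 3000561].freeze

# Hypothetical helper: returns one chatops command string per migration id.
def chatops_commands(action, ids, database: 'ci')
  unless %w[pause resume].include?(action)
    raise ArgumentError, "unsupported action: #{action}"
  end

  ids.map do |id|
    "/chatops run batched_background_migrations #{action} #{id} --database=#{database}"
  end
end

puts chatops_commands('pause', MIGRATION_IDS)
```

The output is meant to be pasted into chat one line at a time. The script itself does not talk to chatops.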
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.