Track execution of BackgroundMigration jobs

Patrick Bair requested to merge background-migration-tracking into master

What does this MR do?

Background

For context about partitioning migration helpers, the related issue #218428 (closed) contains an overview of the steps to partition a table.

For the partitioning MR that will be the first to use this tracking, see: !35201 (merged)

Overview

The partitioning migration helpers use a background migration to copy data from the source table to the partitioned table. However, because the Sidekiq jobs that power the migration are not fully reliable, a second migration needs to run after the background migration completes to clean up any missed data.

This cleanup migration is particularly difficult for partitioning migrations, as partitioning will be rolled out across some of the largest tables in the database, and it's often not performant to query for missing records.

A possible solution to this problem is to track the completion of each background migration job in the database. By doing this, the cleanup migration only has to clean up data for the jobs that were not marked complete, rather than scan the entire table looking for missing rows.

This MR implements an initial iteration of such a strategy, which could be expanded in the future to work seamlessly with all background migrations. For each partitioning migration, it creates a record in the background_migration_jobs table, which is set with a pending status. Once the corresponding job has finished, it updates the record to have a succeeded status. In the cleanup migration, the table can be queried with something like:

Gitlab::BackgroundMigrationJob.where(class_name: 'BackfillPartitionedTable').pending.each_batch do |incomplete_jobs|
  incomplete_jobs.each do |job|
    # run cleanup query for job
  end
end

The initial implementation is intentionally minimalist to fill the need for the partitioning migrations, but there is much room for improvement in future iterations. For now the behavior is opt-in since the tracking is not automated; in the future this would change to be fully integrated into all background migrations.
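
To make the lifecycle concrete, here is a minimal sketch in plain Ruby of the flow described above: a record is created as pending when a job is scheduled, flipped to succeeded when the job finishes, and the cleanup step only looks at what is still pending. The `TrackedJob` struct and `JobTracker` class are hypothetical in-memory stand-ins for the background_migration_jobs table, not the code in this MR, and the ID ranges are made-up example arguments.

```ruby
# Hypothetical in-memory stand-in for the background_migration_jobs table.
# These names are illustrative only and are not GitLab's actual API.
TrackedJob = Struct.new(:class_name, :arguments, :status)

class JobTracker
  def initialize
    @jobs = []
  end

  # Called when a job is scheduled: record it with a pending status.
  def track(class_name, arguments)
    job = TrackedJob.new(class_name, arguments, :pending)
    @jobs << job
    job
  end

  # Called by the job itself once its work has finished.
  def mark_succeeded(class_name, arguments)
    @jobs
      .select { |j| j.class_name == class_name && j.arguments == arguments }
      .each { |j| j.status = :succeeded }
  end

  # What a cleanup migration would iterate over.
  def pending(class_name)
    @jobs.select { |j| j.class_name == class_name && j.status == :pending }
  end
end

tracker = JobTracker.new

# Schedule three jobs, each covering an ID range of the source table.
[[1, 1000], [1001, 2000], [2001, 3000]].each do |range|
  tracker.track('BackfillPartitionedTable', range)
end

# Two jobs finish; the middle one is lost (for example, the worker died).
tracker.mark_succeeded('BackfillPartitionedTable', [1, 1000])
tracker.mark_succeeded('BackfillPartitionedTable', [2001, 3000])

# Cleanup only needs to re-copy the range of the job that is still pending.
missed_ranges = tracker.pending('BackfillPartitionedTable').map(&:arguments)
puts missed_ranges.inspect
```

The point of the sketch is that the cleanup cost is proportional to the number of jobs that never reported success, not to the size of the source table.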

Migration Output

up
rails db:migrate:up VERSION=20200701205710
== 20200701205710 CreateBackgroundMigrationJobs: migrating ====================
-- table_exists?(:background_migration_jobs)
   -> 0.0004s
-- create_table(:background_migration_jobs)
   -> 0.0087s
-- transaction_open?()
   -> 0.0000s
-- execute("ALTER TABLE background_migration_jobs\nADD CONSTRAINT check_b0de0a5852\nCHECK ( char_length(class_name) <= 200 )\nNOT VALID;\n")
   -> 0.0004s
-- execute("ALTER TABLE background_migration_jobs VALIDATE CONSTRAINT check_b0de0a5852;")
   -> 0.0004s
== 20200701205710 CreateBackgroundMigrationJobs: migrated (0.0146s) ===========
down
rails db:migrate:down VERSION=20200701205710
== 20200701205710 CreateBackgroundMigrationJobs: reverting ====================
-- drop_table(:background_migration_jobs)
   -> 0.0033s
== 20200701205710 CreateBackgroundMigrationJobs: reverted (0.0033s) ===========

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Mayra Cabrera
