Export Batched Background Migration info after sampling

This MR is required for "Estimate length of capped batched background migrations".

What does this MR do and why?

Some extra information needs to be exported to calculate how long a Batched Background Migration could take to run in a production environment.

Changes include:

  • Adds a new observer that exports batch details to tmp/migration-testing/:db/background_migrations/MigrationName/:batch/batch-details.json
  • Changes Gitlab::Database::Migrations::TestBatchedBackgroundRunner to export migration info to tmp/migration-testing/:db/background_migrations/MigrationName/details.json
  • Prevents meta info from being exported in Gitlab::Database::Migrations::Observation

tmp/migration-testing/:db/background_migrations/MigrationName/details.json will contain:

  • interval: Interval between batches;
  • total_tuple_count: Number of tuples in the table;
  • max_batch_size: Configured maximum batch size for the migration;
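
As a rough illustration of how these three fields could feed an estimate, the sketch below derives a lower bound on the run time. The JSON values are invented, and it assumes that interval is expressed in seconds, that max_batch_size is an upper bound on rows per batch, and that one batch is started per interval. None of this is confirmed by the MR itself.

```ruby
require 'json'

# Hypothetical contents of details.json; the values are invented for illustration.
details_json = '{"interval": 120, "total_tuple_count": 1000000, "max_batch_size": 50000}'
details = JSON.parse(details_json)

# Fewest batches possible if every batch ran at max_batch_size
# (integer ceiling division).
min_batches = (details['total_tuple_count'] + details['max_batch_size'] - 1) /
              details['max_batch_size']

# Assumption: interval is in seconds and one batch is started per interval.
min_duration_seconds = min_batches * details['interval']

puts "at least #{min_batches} batches, roughly #{min_duration_seconds / 60} minutes"
```

This is only a floor: real batches usually start well below max_batch_size, so the actual run would take longer.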

tmp/migration-testing/:db/background_migrations/MigrationName/:batch/batch-details.json will contain:

  • time_spent: Time required to process the batch;
  • min_value: The value in the column the batching will begin at;
  • max_value: The value in the column the batching will end at, defaults to SELECT MAX(batch_column);
  • batch_size: Maximum number of rows per batch;
  • sub_batch_size: Maximum number of rows processed per iteration within the batch;
  • pause_ms: Pause between sub-batches, in milliseconds;
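
One possible way to combine the sampled batches with the migration details is sketched below. The estimate_seconds helper and all values are hypothetical, and the formula is an assumption rather than the calculation this MR or its follow-up uses: it assumes time_spent and interval are both in seconds and that batches start no more often than once per interval.

```ruby
# Hypothetical sampled data; in practice these hashes would be parsed from
# details.json and from the per-batch batch-details.json files.
details = { 'interval' => 120, 'total_tuple_count' => 90_000 }
batches = [
  { 'time_spent' => 30.0, 'batch_size' => 500 },
  { 'time_spent' => 40.0, 'batch_size' => 500 },
  { 'time_spent' => 50.0, 'batch_size' => 500 }
]

# Hypothetical helper: extrapolates the sampled batches to the whole table.
def estimate_seconds(details, batches)
  avg_time = batches.sum { |b| b['time_spent'] } / batches.size
  avg_size = batches.sum { |b| b['batch_size'] }.to_f / batches.size

  # Number of batches needed to cover every tuple at the sampled batch size.
  batch_count = (details['total_tuple_count'] / avg_size).ceil

  # Assumption: each batch costs at least one interval, or its own run time
  # when that is longer.
  batch_count * [avg_time, details['interval']].max
end

puts "estimated #{estimate_seconds(details, batches) / 3600.0} hours"
```

The sketch ignores batch size optimization and pause_ms, both of which would change the result in a real run.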

How to set up and validate locally

  1. Create a new Batched Background Migration with bundle exec rails g post_deployment_migration test_sampling and fill in the generated file:
class TestSampling < Gitlab::Database::Migration[2.1]
  MIGRATION = 'TestSamplingProcessor'
  TABLE_NAME = :issues
  BATCH_COLUMN = :id
  BATCH_SIZE = 500
  SUB_BATCH_SIZE = 100

  restrict_gitlab_migration gitlab_schema: :gitlab_main

  def up
    queue_batched_background_migration(
      MIGRATION,
      TABLE_NAME,
      BATCH_COLUMN,
      batch_size: BATCH_SIZE,
      sub_batch_size: SUB_BATCH_SIZE,
      job_interval: 2.minutes
    )
  end

  def down
    delete_batched_background_migration(MIGRATION, TABLE_NAME, BATCH_COLUMN, [])
  end
end
  2. Create the TestSamplingProcessor class in lib/gitlab/background_migration/test_sampling_processor.rb:
module Gitlab
  module BackgroundMigration
    class TestSamplingProcessor < BatchedMigrationJob
      operation_name :update_all
      feature_category :database

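      def perform
        each_sub_batch do |_|
          # Simulate slow work: lock one row and sleep 0.3s per sub-batch.
          # Assumes an issue with id 1 exists in the local database.
          Issue.transaction do
            issue = Issue.lock.find(1)
            issue.connection.execute('SELECT * FROM pg_sleep(0.3);')
          end
        end
      end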
    end
  end
end
  3. Run rails db:migrate. This creates a new Gitlab::Database::BackgroundMigration::BatchedMigration record.
  4. Because we are in development, batched migrations are executed as part of rails db:migrate. To sample the migration, we need a small trick first: set the record status back to active and delete any jobs that were already generated.
rails c
migration = Gitlab::Database::BackgroundMigration::BatchedMigration.find_by(job_class_name: 'TestSamplingProcessor')

migration.update(status: 1) # active
migration.batched_jobs.delete_all # Delete all related jobs
  5. Execute the sampling in the same console session:
from_id = migration.id - 1 # use the previous id
connection = ActiveRecord::Base.connection
result_dir = Rails.root.join('tmp', 'migration-testing').join('main', 'background_migrations')

runner = Gitlab::Database::Migrations::TestBatchedBackgroundRunner.new(
  result_dir: result_dir, connection: connection, from_id: from_id
)
runner.run_jobs(for_duration: 30.minutes)
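
To check the exported files after the run, something like the following plain-Ruby sketch could be used. load_sampling_results is a hypothetical helper, not part of this MR. It assumes the layout described above: one details.json per migration directory and one batch-details.json per batch subdirectory.

```ruby
require 'json'
require 'pathname'

# Hypothetical helper: collects details.json and every batch-details.json
# found one level below the given migration directory.
def load_sampling_results(migration_dir)
  dir = Pathname.new(migration_dir)
  batch_files = Dir.glob(dir.join('*', 'batch-details.json').to_s).sort

  {
    details: JSON.parse(dir.join('details.json').read),
    batches: batch_files.map { |file| JSON.parse(File.read(file)) }
  }
end
```

In the console session above it could be called as load_sampling_results(result_dir.join('TestSamplingProcessor')), assuming the directory is named after the job class.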

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Leonardo da Rosa
