Add `restrict_gitlab_migration` to limit when a given migration is executed in the context of decomposed databases

Kamil Trzciński requested to merge restrict-migrations into master

What does this MR do and why?

This MR allows a migration to be limited to run only in a specific context. It is the first step towards consistently marking every migration with its purpose (whether it makes DDL or DML changes) and with how it is executed when running in production.

The idea is to extend the migration helpers with restrict_gitlab_migration gitlab_schema: :gitlab_main, which indicates the context in which a given migration should run.

This implements #342378 (closed).

What is done?

  • detect DDL and DML changes
  • implement support for restricting migrations
  • detect and skip migrations
  • map databases to schemas
  • cover all changes with specs

What is done, but will be in the next MR?

  • validate that migrations use proper migration helper
  • fail migrations that are not valid

What is left to be done?

  • actually skip migrations; today they raise an error
  • documentation

What does a migration look like?

This migration will run only on databases that store data for gitlab_main. It will be skipped everywhere else.

class ScheduleRemoveDuplicateVulnerabilitiesFindings3 < Gitlab::Database::Migration[1.0]
  include Gitlab::Database::MigrationHelpers::RestrictGitlabMigration

  disable_ddl_transaction!

  restrict_gitlab_migration gitlab_schema: :gitlab_main

  MIGRATION = 'RemoveDuplicateVulnerabilitiesFindings'
  DELAY_INTERVAL = 2.minutes.to_i
  BATCH_SIZE = 5_000

  def up
    queue_background_migration_jobs_by_range_at_intervals(
      define_batchable_model('vulnerability_occurrences'),
      MIGRATION,
      DELAY_INTERVAL,
      batch_size: BATCH_SIZE
    )
  end

  def down
    # no-op
  end
end
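For readers unfamiliar with class-level macros, here is a minimal plain-Ruby sketch of how a declaration such as restrict_gitlab_migration could record a schema on the migration class. It is not the code from this MR: the module name RestrictMigrationSketch, the reader allowed_gitlab_schema, and both example classes are hypothetical and exist only for illustration.

```ruby
# Hypothetical sketch, NOT GitLab's implementation: a module that adds a
# class-level `restrict_gitlab_migration` macro to any class that includes it.
module RestrictMigrationSketch
  def self.included(base)
    # Make the methods in ClassMethods available on the including class.
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Records the schema this migration is limited to.
    def restrict_gitlab_migration(gitlab_schema:)
      @allowed_gitlab_schema = gitlab_schema
    end

    # Returns nil when the migration declared no restriction.
    def allowed_gitlab_schema
      @allowed_gitlab_schema
    end
  end
end

# Hypothetical migration that declares a restriction.
class ExampleRestrictedMigration
  include RestrictMigrationSketch

  restrict_gitlab_migration gitlab_schema: :gitlab_main
end

# Hypothetical migration that declares nothing.
class ExampleUnrestrictedMigration
  include RestrictMigrationSketch
end
```

A migration runner could then read the declared schema from the class before deciding whether to execute the migration on a given connection.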

In the next MR this will be done using ::Migration[2.0], which enables the behavior by default:

class ScheduleRemoveDuplicateVulnerabilitiesFindings3 < Gitlab::Database::Migration[2.0]

  ...

Behavior

This describes the expected behavior. The relation between main:/ci: (as a way to define distinct logical databases) and gitlab_schema: is as follows:

  1. main: and ci: each describe a database connection to a logical database
  2. gitlab_schema: describes the affinity between tables
  3. Cross-joining between different gitlab_schema: values is forbidden
  4. Each database connection can access only specific schemas
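Point 4 can be pictured as a lookup from connection name to the schemas it may touch. The sketch below is an assumption built from the two-connection rules in this description (main: serves gitlab_main and gitlab_shared, ci: serves gitlab_ci and gitlab_shared). The constant SCHEMAS_BY_DATABASE and the method schema_accessible? are hypothetical names, not part of this MR.

```ruby
# Hypothetical mapping, derived from the two-connection rules in this
# description; the real mapping lives in GitLab and may differ.
SCHEMAS_BY_DATABASE = {
  main: %i[gitlab_main gitlab_shared],
  ci: %i[gitlab_ci gitlab_shared]
}.freeze

# Returns true when the given connection is allowed to access the schema.
# Raises KeyError for a connection name that is not in the mapping.
def schema_accessible?(database, gitlab_schema)
  SCHEMAS_BY_DATABASE.fetch(database).include?(gitlab_schema)
end
```

Using fetch rather than [] is a deliberate choice here: an unknown connection name fails loudly instead of silently reporting that no schema is accessible.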

1. Single database (current state of on-premise and GitLab.com)

production:
  main:
    host: host-a
  1. All migrations will be executed everywhere since we only have a single DB.

2. Two separate databases (expected end state for GitLab.com and later on-premise)

production:
  main:
    host: host-a
  ci:
    host: host-b
  1. On main: and ci: all migrations that do not define restrict_gitlab_migration will be executed
  2. On main: migrations that define restrict_gitlab_migration gitlab_schema: :gitlab_main or restrict_gitlab_migration gitlab_schema: :gitlab_shared will be executed
  3. On ci: migrations that define restrict_gitlab_migration gitlab_schema: :gitlab_ci or restrict_gitlab_migration gitlab_schema: :gitlab_shared will be executed
  4. All migrations that are not executed are skipped (likely not loaded at all [yet to be defined])

3. Two connections pointing to the same database (intermediate state)

production:
  main:
    host: host-a
  ci:
    host: host-a
  1. On main: and ci: all migrations that do not define restrict_gitlab_migration will be executed
  2. On main: migrations that define restrict_gitlab_migration gitlab_schema: :gitlab_main or restrict_gitlab_migration gitlab_schema: :gitlab_shared will be executed
  3. On ci: migrations that define restrict_gitlab_migration gitlab_schema: :gitlab_ci or restrict_gitlab_migration gitlab_schema: :gitlab_shared will be executed
  4. All migrations that are not executed are skipped (likely not loaded at all [yet to be defined])
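The three scenarios can be summarized as one decision function. The following is a rough sketch under the rules stated above, not the implementation in this MR: run_migration?, its configured: parameter, and SCHEMAS_BY_CONNECTION are hypothetical names, and the sketch only models the run-or-skip decision (it does not model the error that is raised today instead of skipping).

```ruby
# Hypothetical mapping taken from the rules listed for scenarios 2 and 3.
SCHEMAS_BY_CONNECTION = {
  main: %i[gitlab_main gitlab_shared],
  ci: %i[gitlab_ci gitlab_shared]
}.freeze

# Decides whether a migration should run on a connection.
#
# connection        - :main or :ci
# restricted_schema - schema passed to restrict_gitlab_migration, or nil
#                     when the migration declares no restriction
# configured        - list of connection names present in the configuration
def run_migration?(connection, restricted_schema, configured:)
  # Migrations without a restriction run on every connection.
  return true if restricted_schema.nil?

  # Scenario 1: with a single database, everything runs.
  return true if configured.size == 1

  # Scenarios 2 and 3: run only where the schema belongs to the connection.
  SCHEMAS_BY_CONNECTION.fetch(connection).include?(restricted_schema)
end

# Example: a gitlab_ci migration evaluated in each scenario.
single = run_migration?(:main, :gitlab_ci, configured: [:main])
on_main = run_migration?(:main, :gitlab_ci, configured: %i[main ci])
on_ci = run_migration?(:ci, :gitlab_ci, configured: %i[main ci])
```

Scenarios 2 and 3 share the same branch because the rules listed for them are identical; whether the two connections point at one host or two does not change the decision.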

How to set up and validate locally

  1. No feature flag is required yet, since the behavior is not enabled until a migration specifies restrict_gitlab_migration.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Kamil Trzciński
