Add `restrict_gitlab_migration` to limit when a given migration is executed in the context of decomposed databases
What does this MR do and why?
This MR makes it possible to restrict a migration so that it runs only in a specific context. It is the first step towards consistently marking every migration with its purpose (whether it makes DDL or DML changes) and with how it is executed when running on production.
The idea is to extend the migration helpers with `restrict_gitlab_migration gitlab_schema: :gitlab_main` as an indicator of the context in which a given migration should run.
This implements #342378 (closed).
What is done?
- detect DDL and DML changes
- implement support for restricting migration
- detect and skip migrations
- map databases to schemas
- cover all changes with specs
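As an illustration of the first bullet only, here is a naive sketch of telling DDL from DML by looking at the leading SQL keyword. The names `DDL_KEYWORDS`, `DML_KEYWORDS` and `classify_sql` are hypothetical and the keyword lists are deliberately incomplete; this is an assumption about one possible approach, not the detection logic this MR implements.

```ruby
# Hypothetical, simplified classifier: not the implementation from this MR.
DDL_KEYWORDS = %w[CREATE ALTER DROP].freeze
DML_KEYWORDS = %w[INSERT UPDATE DELETE].freeze

# Returns :ddl, :dml or :other based on the first keyword of the statement.
def classify_sql(sql)
  keyword = sql.strip.split(/\s+/).first.to_s.upcase

  return :ddl if DDL_KEYWORDS.include?(keyword)
  return :dml if DML_KEYWORDS.include?(keyword)

  :other
end

classify_sql('ALTER TABLE projects ADD COLUMN foo text') # => :ddl
classify_sql('update projects set name = NULL')          # => :dml
classify_sql('SELECT 1')                                 # => :other
```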
What is done, but will land in the next MR?
- validate that migrations use proper migration helper
- fail migrations that are not valid
What is left to be done?
- actually skip migrations; today they raise an error instead
- documentation
What does a migration look like?
This migration will run only on databases that store data for `gitlab_main`. It will be skipped everywhere else.
```ruby
class ScheduleRemoveDuplicateVulnerabilitiesFindings3 < Gitlab::Database::Migration[1.0]
  include Gitlab::Database::MigrationHelpers::RestrictGitlabMigration

  disable_ddl_transaction!

  # Run this migration only on databases that hold gitlab_main data
  restrict_gitlab_migration gitlab_schema: :gitlab_main

  MIGRATION = 'RemoveDuplicateVulnerabilitiesFindings'
  DELAY_INTERVAL = 2.minutes.to_i
  BATCH_SIZE = 5_000

  def up
    queue_background_migration_jobs_by_range_at_intervals(
      define_batchable_model('vulnerability_occurrences'),
      MIGRATION,
      DELAY_INTERVAL,
      batch_size: BATCH_SIZE
    )
  end

  def down
    # no-op
  end
end
```
In the next MR this will be done using `::Migration[2.0]` to enable the behavior by default:
```ruby
class ScheduleRemoveDuplicateVulnerabilitiesFindings3 < Gitlab::Database::Migration[2.0]
  ...
```
Behavior
This describes the expected behavior. The relation between `main:`/`ci:` (as a way to define distinct logical databases) and `gitlab_schema:` is as follows:
- `main:`/`ci:` describe a database connection to a logical database
- `gitlab_schema:` describes affinity between tables
- Cross-joining between different `gitlab_schema:` values is forbidden
- Each database connection can access only specific schemas
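The affinity and cross-join rules above can be pictured with a small lookup. This is only a sketch under stated assumptions: `TABLE_SCHEMAS` and `cross_schema_join?` are hypothetical names, and the table-to-schema assignments are illustrative examples rather than the mapping shipped in this MR.

```ruby
# Hypothetical table-to-schema affinity map (illustrative entries only).
TABLE_SCHEMAS = {
  'projects' => :gitlab_main,
  'vulnerability_occurrences' => :gitlab_main,
  'ci_builds' => :gitlab_ci
}.freeze

# Returns true when the given tables span more than one gitlab_schema,
# i.e. when joining them would be a forbidden cross-schema join.
def cross_schema_join?(tables)
  tables.map { |table| TABLE_SCHEMAS.fetch(table) }.uniq.size > 1
end

cross_schema_join?(%w[projects vulnerability_occurrences]) # => false
cross_schema_join?(%w[projects ci_builds])                 # => true
```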
1. Single database (current state of on-premise and GitLab.com)
```yaml
production:
  main:
    host: host-a
```
- All migrations will be executed everywhere since we only have a single DB.
2. Two separate databases (expected end state for GitLab.com and later on-premise)
```yaml
production:
  main:
    host: host-a
  ci:
    host: host-b
```
- On `main:` and `ci:`, all migrations that do not define `restrict_gitlab_migration` will be executed
- On `main:`, migrations that define `restrict_gitlab_migration gitlab_schema: :gitlab_main` or `restrict_gitlab_migration gitlab_schema: :gitlab_shared` will be executed
- On `ci:`, migrations that define `restrict_gitlab_migration gitlab_schema: :gitlab_ci` or `restrict_gitlab_migration gitlab_schema: :gitlab_shared` will be executed
- All migrations that are not executed are skipped (likely not loaded at all [yet to define])
3. Two connections pointing to the same database (intermediate state)
```yaml
production:
  main:
    host: host-a
  ci:
    host: host-a
```
- On `main:` and `ci:`, all migrations that do not define `restrict_gitlab_migration` will be executed
- On `main:`, migrations that define `restrict_gitlab_migration gitlab_schema: :gitlab_main` or `restrict_gitlab_migration gitlab_schema: :gitlab_shared` will be executed
- On `ci:`, migrations that define `restrict_gitlab_migration gitlab_schema: :gitlab_ci` or `restrict_gitlab_migration gitlab_schema: :gitlab_shared` will be executed
- All migrations that are not executed are skipped (likely not loaded at all [yet to define])
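The rules for scenarios 2 and 3 (two connections) can be summarized as a small decision function. This is a plain-Ruby sketch, not the code from this MR: `SCHEMAS_BY_DATABASE` and `run_migration?` are hypothetical names, and the single-database scenario, where everything runs, is intentionally left out.

```ruby
# Hypothetical mapping of each connection to the schemas it serves,
# taken from the bullets for scenarios 2 and 3.
SCHEMAS_BY_DATABASE = {
  main: %i[gitlab_main gitlab_shared],
  ci: %i[gitlab_ci gitlab_shared]
}.freeze

# Returns true when a migration should be executed on `database`.
# `restricted_schema` is nil for migrations that do not call
# restrict_gitlab_migration.
def run_migration?(database, restricted_schema)
  return true if restricted_schema.nil? # unrestricted: runs on every connection

  SCHEMAS_BY_DATABASE.fetch(database).include?(restricted_schema)
end

run_migration?(:main, nil)            # => true
run_migration?(:main, :gitlab_main)   # => true
run_migration?(:ci, :gitlab_main)     # => false (skipped)
run_migration?(:ci, :gitlab_shared)   # => true
```

Keeping the mapping in one constant means the run-or-skip decision is the same whether the two connections point at different hosts or at the same host.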
How to set up and validate locally
- This does not require feature flags yet, since the behavior is not enabled until `restrict_gitlab_migration` is specified.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.