Ensure half complete batched background migrations are correctly picked up after the CI decomposition failover

Read more about the failover process at https://about.gitlab.com/handbook/engineering/development/enablement/sharding/migrate-ci-tables-to-new-database-plan.html#rolling-back . In short, we're replicating the whole database to a separate CI database. At a certain point we'll stop the replication and split our reads and writes so that all CI-related tables, as defined by https://gitlab.com/gitlab-org/gitlab/blob/master/lib/gitlab/database/gitlab_schemas.yml, go to the new CI Patroni cluster. Because the background migration tracking tables are classified as gitlab_shared, this presents a tricky situation: a half-complete migration will effectively be split across 2 databases after the failover. We need to ensure:

We don't duplicate background migration work by wastefully running the same migration against the wrong database, where the data is stale
We don't miss a background migration after the failover
We don't run the background migration against stale data on the wrong database
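
To make the failure mode concrete, here is a minimal plain-Ruby sketch of which tables each connection may touch before and after the split. The `TABLE_SCHEMAS` excerpt and the `allowed_schemas` and `table_allowed?` helpers are hypothetical stand-ins written for this illustration, not GitLab's actual code; the real mapping lives in gitlab_schemas.yml, and the three entries shown are assumed for the example.

```ruby
# Hypothetical excerpt of the table-to-schema mapping (illustrative only;
# the authoritative list is gitlab_schemas.yml).
TABLE_SCHEMAS = {
  'projects' => :gitlab_main,
  'ci_builds' => :gitlab_ci,
  'batched_background_migrations' => :gitlab_shared
}.freeze

# Assumed rule: before the failover a single database serves every schema;
# afterwards each connection serves its own schema plus gitlab_shared.
def allowed_schemas(connection, decomposed:)
  return %i[gitlab_main gitlab_ci gitlab_shared] unless decomposed

  connection == :ci ? %i[gitlab_ci gitlab_shared] : %i[gitlab_main gitlab_shared]
end

# True when the given connection is allowed to read/write the given table.
def table_allowed?(table, connection, decomposed:)
  allowed_schemas(connection, decomposed: decomposed).include?(TABLE_SCHEMAS.fetch(table))
end
```

Under these assumptions the tracking table stays reachable from both connections after the split, while `ci_builds` is only valid on the CI connection. That mismatch is why a tracking row needs to say which database its migration belongs to.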

Proposal: Store the `allowed_schemas_for_connection` alongside the background migration data and ensure it's picked up by the correct worker before/after the migration

From #359951 (comment 928357451)

Since background migrations are always scheduled from a regular migration, with the new Migration[2.0] we know the schema from which each one is being scheduled (like gitlab_main):

We could store this information in the tracking table.
The worker would then only query background tasks with where(gitlab_schema: [allowed_schemas_for_connection]).
Since we will enqueue background migrations ahead of time with Migration[2.0], each record would hold the correct gitlab_schema, because we already know the schema that is used for querying data.
Once we split the databases, the new CI worker would only fetch the migrations with the correct schema.
One outcome would be some stale records with a schema outside of a given database, but these could be cleaned up in a follow-up phase when we clean up unrelated data.
Any background migrations that need to run on gitlab_shared tables will need to be scheduled twice (once for ci and once for main). Before CI decomposition, however, it may be simpler to forbid any migration that needs to backfill/update data on gitlab_shared tables. I assume this is a rare edge case that we probably won't need in the next couple of months, as there are only a few such tables.
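
The filtering step in the proposal can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: `TrackingRow`, `ALLOWED_SCHEMAS`, `runnable_migrations` and the two job class names are hypothetical, the `gitlab_schema` field is the proposed column rather than an existing one, and a real implementation would presumably use an ActiveRecord scope instead of an in-memory `select`.

```ruby
# Hypothetical tracking row; `gitlab_schema` is the proposed new column.
TrackingRow = Struct.new(:job_class, :table_name, :gitlab_schema, keyword_init: true)

# Assumed per-connection schema lists after the databases are split.
ALLOWED_SCHEMAS = {
  main: %i[gitlab_main gitlab_shared],
  ci: %i[gitlab_ci gitlab_shared]
}.freeze

# In-memory stand-in for where(gitlab_schema: allowed_schemas_for_connection).
def runnable_migrations(rows, connection)
  allowed = ALLOWED_SCHEMAS.fetch(connection)
  rows.select { |row| allowed.include?(row.gitlab_schema) }
end

# After replication both databases hold identical copies of the tracking table,
# so each worker sees both rows and must filter by schema.
rows = [
  TrackingRow.new(job_class: 'BackfillProjects', table_name: 'projects', gitlab_schema: :gitlab_main),
  TrackingRow.new(job_class: 'BackfillBuilds', table_name: 'ci_builds', gitlab_schema: :gitlab_ci)
]

main_jobs = runnable_migrations(rows, :main).map(&:job_class)
ci_jobs = runnable_migrations(rows, :ci).map(&:job_class)
```

With this filter each worker picks up only the migrations whose schema belongs to its own database. The rows it skips are the stale records mentioned above, which remain in place until the follow-up cleanup.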

Edited Apr 29, 2022 by Dylan Griffith
