Add sharding key for `pool_repositories`
Sharding keys need to be set for the tables: keys,

This involves choosing one of the following, based on the intended behaviour of the table:

* **The table is not cell-local**
  * Set `gitlab_schema` to `gitlab_main_clusterwide`.
* **The table is cell-local and requires a sharding key**
  * Set `gitlab_schema` to `gitlab_main_cell`.
  * Add a `sharding_key` or `desired_sharding_key` configuration. If the configuration is known but the chosen key doesn't yet meet not-null and foreign key requirements, you can add an exception to `allowed_to_be_missing_not_null` or `allowed_to_be_missing_foreign_key` to get the pipeline passing. Please link to a follow-up issue in a code comment next to the exception.
  * You may also need to set `allow_cross_joins`, `allow_cross_transactions` and `allow_cross_foreign_keys` if changing the schema causes pipeline failures. See [`db/docs/epics.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/epics.yml?ref_type=heads#L12-17) for an example.
* **The table is cell-local and does not require a sharding key**
  * Set `gitlab_schema` to `gitlab_main_cell_local`.
  * Ensure there are no foreign key references to/from organization tables.

### Documentation

* [Choosing either the gitlab_main_cell or gitlab_main_clusterwide schema](https://docs.gitlab.com/ee/development/database/multiple_databases.html#choose-either-the-gitlab_main_cell-or-gitlab_main_clusterwide-schema)
* [Defining a sharding key for all cell-local tables](https://docs.gitlab.com/ee/development/database/multiple_databases.html#defining-a-sharding-key-for-all-cell-local-tables)
* [Defining a desired_sharding_key to automatically backfill a sharding_key](https://docs.gitlab.com/ee/development/database/multiple_databases.html#define-a-desired_sharding_key-to-automatically-backfill-a-sharding_key)

</details>

### Summary

This issue has many comments and has changed direction a couple of times.
Here's my attempt at a summary:

- The work was started, ran into a few data migration issues, stopped, changed direction, and started again. That's what makes the threads below difficult to follow.
- We have ultimately decided that we do want to add an `organization_id` and backfill it.
- The work is picked up for 18.5 to 18.6 (it has to be split across two releases).
- We have orphaned data on the `pool_repositories` table where we don't know how to get the `organization_id`. We are OK with setting the orphaned data to `organization_id` 1 for now, and figuring out how we can trace back to a proper organization ID via Gitaly at a later date.
  - Gitaly Slack discussion: https://gitlab.slack.com/archives/C3ER3TQBT/p1759325108230759
  - Related issue for the Gitaly aspect: https://gitlab.com/gitlab-org/gitlab/-/issues/573591
  - Discussion about setting `organization_id` to 1: https://gitlab.slack.com/archives/C0609EXHX6F/p1759348241265409
- We discussed the possibility of deleting orphaned data, but we are not doing it at the moment. See the `organization_id` discussion for more details.
- There is existing work that can be leveraged for the migration: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/181158/diffs (thank you @olaoluro for this, it will be very useful).
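To make the backfill rule from the summary concrete, here is a minimal sketch using Python and an in-memory SQLite database. It is not the actual migration (that would be a batched background migration in the GitLab codebase). It assumes the organization can be looked up through `pool_repositories.source_project_id` and `projects.organization_id`. That lookup path, the simplified table definitions, and the helper names are assumptions for illustration only. Rows with no resolvable source project are treated as orphaned and fall back to `organization_id` 1.

```python
import sqlite3

# Fallback for orphaned rows, as agreed in the summary (organization_id 1 for now).
DEFAULT_ORGANIZATION_ID = 1


def backfill_organization_id(conn: sqlite3.Connection) -> int:
    """Fill pool_repositories.organization_id and return the number of rows updated.

    Assumption: the organization is reachable via source_project_id -> projects.
    """
    cur = conn.execute(
        """
        UPDATE pool_repositories
        SET organization_id = COALESCE(
            (SELECT projects.organization_id
             FROM projects
             WHERE projects.id = pool_repositories.source_project_id),
            ?
        )
        WHERE organization_id IS NULL
        """,
        (DEFAULT_ORGANIZATION_ID,),
    )
    conn.commit()
    return cur.rowcount


def demo() -> list:
    """Build a tiny, simplified schema and run the backfill against it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE projects (
            id INTEGER PRIMARY KEY,
            organization_id INTEGER NOT NULL
        );
        CREATE TABLE pool_repositories (
            id INTEGER PRIMARY KEY,
            source_project_id INTEGER,
            organization_id INTEGER
        );
        INSERT INTO projects VALUES (10, 7);
        INSERT INTO pool_repositories VALUES (1, 10, NULL);   -- source project exists
        INSERT INTO pool_repositories VALUES (2, NULL, NULL); -- orphaned: no source project
        INSERT INTO pool_repositories VALUES (3, 99, NULL);   -- orphaned: project is gone
        """
    )
    backfill_organization_id(conn)
    return conn.execute(
        "SELECT id, organization_id FROM pool_repositories ORDER BY id"
    ).fetchall()
```

Calling `demo()` returns one `(id, organization_id)` pair per pool repository. The row with a live source project picks up that project's organization, and both orphaned rows land on the fallback. Keeping the fallback in a single constant should make it easier to revisit once the Gitaly tracing question is resolved.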
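For reference, the schema options listed at the top of this issue can be read as a small set of rules. The sketch below encodes them in Python as a rough checklist. It is not GitLab's own validation (the real checks run in the pipeline), and the `TableEntry` class, the `references_organization_tables` flag, and the `schema_problems` helper are hypothetical names introduced here. The `sharding_key` shape (column name mapped to the referenced table) is also an assumption about the `db/docs` format.

```python
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    """Hypothetical, simplified view of a table's db/docs entry."""

    table_name: str
    gitlab_schema: str
    # Assumed shape: column name -> referenced table.
    sharding_key: dict = field(default_factory=dict)
    desired_sharding_key: dict = field(default_factory=dict)
    # Hypothetical flag: True if the table has foreign keys to/from organization tables.
    references_organization_tables: bool = False


def schema_problems(entry: TableEntry) -> list:
    """Return a list of rule violations, based only on the options in this issue."""
    problems = []
    if entry.gitlab_schema == "gitlab_main_clusterwide":
        # Not cell-local: no sharding key is needed.
        return problems
    if entry.gitlab_schema == "gitlab_main_cell":
        # Cell-local and sharded: one of the two key configurations must be present.
        if not (entry.sharding_key or entry.desired_sharding_key):
            problems.append(
                "gitlab_main_cell requires sharding_key or desired_sharding_key"
            )
    elif entry.gitlab_schema == "gitlab_main_cell_local":
        # Cell-local without a sharding key: no links to organization tables.
        if entry.references_organization_tables:
            problems.append(
                "gitlab_main_cell_local must not reference organization tables"
            )
    else:
        problems.append(f"unexpected gitlab_schema: {entry.gitlab_schema}")
    return problems
```

Under these assumptions, the direction chosen in the summary corresponds to `TableEntry("pool_repositories", "gitlab_main_cell", sharding_key={"organization_id": "organizations"})`, for which `schema_problems` returns an empty list.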