Add sharding key for pool_repositories

Sharding keys need to be set for the table pool_repositories. This involves choosing one of the following, based on the intended behaviour of the table:

  • The table is not cell-local
    • Set gitlab_schema to gitlab_main_clusterwide.
  • The table is cell-local and requires a sharding key
    • Set gitlab_schema to gitlab_main_cell.
    • Add a sharding_key or desired_sharding_key configuration. If the configuration is known but the chosen key doesn't yet meet not-null and foreign key requirements, you can add an exception to allowed_to_be_missing_not_null or allowed_to_be_missing_foreign_key to get the pipeline passing. Please link to a follow-up issue in a code comment next to the exception.
    • You may also need to set allow_cross_joins, allow_cross_transactions and allow_cross_foreign_keys if changing the schema causes pipeline failures. See db/docs/epics.yml for an example.
  • The table is cell-local and does not require a sharding key
    • Set gitlab_schema to gitlab_main_cell_local.
    • There must be no foreign key references to/from organization tables.
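For illustration, here is a rough sketch of what the second option (cell-local with a sharding key) might look like in db/docs/pool_repositories.yml. It assumes the key is an organization_id column referencing the organizations table; the file path and the exact key/value pair are assumptions based on the conventions of other db/docs files, not the final change. Only the fields relevant to sharding are shown.

```yaml
---
# Hypothetical excerpt of db/docs/pool_repositories.yml (sketch only).
# Existing fields such as classes, feature_categories and description
# are left out here and would stay unchanged.
table_name: pool_repositories
gitlab_schema: gitlab_main_cell
# Assumed mapping: sharding column name => table it references.
# The not-null requirement can only be met once the column is backfilled.
sharding_key:
  organization_id: organizations
```

Until the column exists and is backfilled, the not-null and foreign key exceptions described above would be the way to keep the pipeline passing.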

Documentation

Summary

This issue has many comments and has changed directions a couple of times. Here's my attempt at a summary:

  • The work was started, ran into a few data migration issues, stopped, changed direction, and started again. That's what makes following the threads below difficult.
  • We have ultimately decided we do want to add an organization_id and backfill it.
  • The work has been picked up for 18.5 to 18.6 (it has to be split across two releases).
  • We have orphaned data in the pool_repositories table for which we don't know how to determine the organization_id. We are OK with setting organization_id to 1 for the orphaned rows for now, and figuring out how to trace back to a proper organization_id via Gitaly at a later date.
  • There is existing work that can be leveraged for the migration: !181158 (diffs) (thank you @olaoluro for this, it will be very useful).
Edited by Hunter Stewart