Automatically classify all `gitlab_main` tables with a derivable `sharding_key` or `desired_sharding_key` as `gitlab_main_cell`
The objective here is to ensure that we don't have any remaining `gitlab_main` tables, as we cannot figure out whether they need a sharding key unless we know they are in `gitlab_main_cell` or `gitlab_main_clusterwide`.
We should be able to classify these using an approach like:
- If they have a `project_id` or `namespace_id` or `group_id` we assume they are `gitlab_main_cell`
- If they have an `issue_id` or `epic_id` or other known reference to a cell-local table then we assume they are also `gitlab_main_cell`. We'll iterate here on multiple heuristic approaches to cover the majority of tables
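The two heuristics above could be sketched roughly as follows. This is only an illustration: the function name `classify_table` and the two column tuples are hypothetical, and the real list of cell-local references would grow as the heuristics are iterated on.

```python
from typing import Iterable, Optional

# Hypothetical column lists for illustration only.
# Columns that point directly at a project, namespace or group.
DIRECT_KEYS = ("project_id", "namespace_id", "group_id")
# Columns that reference a table already known to be cell-local.
CELL_LOCAL_REFERENCES = ("issue_id", "epic_id")


def classify_table(columns: Iterable[str]) -> Optional[str]:
    """Return "gitlab_main_cell" if a heuristic matches, else None.

    None means the table could not be classified automatically and
    stays in gitlab_main until someone looks at it by hand.
    """
    cols = set(columns)
    # Heuristic 1: direct project/namespace/group reference.
    if cols.intersection(DIRECT_KEYS):
        return "gitlab_main_cell"
    # Heuristic 2: reference to a known cell-local table.
    if cols.intersection(CELL_LOCAL_REFERENCES):
        return "gitlab_main_cell"
    return None
```

A table whose columns match neither list returns `None`, which keeps the sketch conservative: nothing is classified as `gitlab_main_clusterwide` by guesswork.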
This should generate many small MRs, each assigned to the team that owns the table, using an approach defined in #428459 (closed)
Handling polymorphic tables
Some polymorphic tables like `members` have `source_id` and `source_type`, and these can be used to reference a project or namespace. For these tables we could add `project_id` and `namespace_id`, but it might be simpler to just add `namespace_id` and use the id for the `ProjectNamespace` of the project. It should still work for our constraint enforcement and would be in line with our longer term goal of project/namespace consolidation.
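As a rough sketch of the single-column option, the `namespace_id` for a polymorphic row could be derived like this. The function name, the `source_type` values `"Namespace"` and `"Project"`, and the `project_namespace_ids` lookup (project id to the id of its `ProjectNamespace`) are assumptions made for illustration, not the actual backfill logic.

```python
from typing import Mapping, Optional


def namespace_id_for(
    source_type: str,
    source_id: int,
    project_namespace_ids: Mapping[int, int],
) -> Optional[int]:
    """Derive a namespace_id sharding key from a polymorphic source.

    project_namespace_ids is a hypothetical lookup from a project id to
    the id of that project's ProjectNamespace.
    """
    if source_type == "Namespace":
        # The source already is a namespace, so its id is the key.
        return source_id
    if source_type == "Project":
        # Use the project's ProjectNamespace id instead of adding project_id.
        return project_namespace_ids.get(source_id)
    # Unknown source type: leave the key unset rather than guess.
    return None
```

With this shape, every row ends up with at most one sharding key column regardless of whether it points at a project or a namespace.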