Make sure newly created tables have either `gitlab_main_cell` or `gitlab_main_clusterwide` schema set
While we are working on Cells, we are migrating existing tables from the `gitlab_main` schema to either `gitlab_main_cell` or `gitlab_main_clusterwide` in each table's `db/docs/table_name.yml` file, based on its properties, so that we can detect cross-joins, cross-foreign keys and transactions.
Example: !127719 (diffs)
However, at the same time, other teams at GitLab may introduce new tables, and these tables' `db/docs/table_name.yml` files will have `gitlab_schema: gitlab_main` (unless it is a CI-related table).
We want to prevent this from happening. We want to force newly created tables to have either `gitlab_main_cell` or `gitlab_main_clusterwide` (unless it is a CI-related table) right from the beginning, so that we won't have to go back and correct these at a later stage.
Having the right schema from the beginning will also help those teams prevent cross-joins etc. right from the start, rather than having to fix them later.
How does this help Tenant Scale?
We can avoid the "moving goalpost" problem. If we do not prevent this from happening for newly created tables, there will always be a growing list of tables that we will have to fix at a later stage. Essentially, that list will never end if we aren't fast enough.
How to do this?
@OmarQunsulGitlab proposed that we can allow-list the current tables having the `gitlab_main` schema in this spec.
(Idea: This could probably be done by looking at the `milestone` in `table_name.yml`, and not allowing `gitlab_main` for tables introduced later than a certain milestone. This would probably be easier than allow-listing by a list of names?)
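The milestone idea could be sketched roughly as follows. This is only an illustration, not the actual spec: the cutoff value `16.4`, the constant name, the method name, and the example table are all assumptions made for this sketch. It assumes each `db/docs` file carries `milestone` and `gitlab_schema` keys, as described above.

```ruby
require 'rubygems' # Gem::Version, normally loaded by default
require 'yaml'

# Assumption: tables documented at or before this milestone may keep
# gitlab_main. The value is a placeholder, not a decided cutoff.
CUTOFF_MILESTONE = Gem::Version.new('16.4')

# Returns true when the documented schema is acceptable under the milestone
# rule: gitlab_main is tolerated only for tables introduced at or before
# CUTOFF_MILESTONE. Any other schema value is left to other checks.
def schema_allowed_by_milestone?(doc)
  return true unless doc['gitlab_schema'] == 'gitlab_main'

  # to_s avoids a type error if the milestone was written without quotes.
  Gem::Version.new(doc.fetch('milestone').to_s) <= CUTOFF_MILESTONE
end

# Hypothetical documentation file for a newly introduced table.
new_table_doc = YAML.safe_load(<<~YML)
  table_name: example_new_table
  milestone: '16.10'
  gitlab_schema: gitlab_main
YML

puts schema_allowed_by_milestone?(new_table_doc) # prints "false"
```

Comparing with `Gem::Version` rather than as strings or floats matters here, because `16.10` is later than `16.4`. The sketch also relies on the milestone being quoted in the YAML file; an unquoted `16.10` would be read as the number `16.1` before the check ever sees it.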
Either approach would make sure that only current tables are exempted temporarily. For any newly created table, having `gitlab_main` as the schema in `db/docs/table_name.yml` would fail this spec and force the developer to change it to either `gitlab_main_cell` or `gitlab_main_clusterwide`.
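For comparison, the allow-list-by-name variant might look something like the sketch below. The constant, the helper names, and the example table names are hypothetical and chosen only for illustration; a real allow-list would be generated from the tables that currently have `gitlab_main`.

```ruby
require 'set'
require 'yaml'

# Hypothetical allow-list: existing tables that may keep gitlab_main for now.
GRANDFATHERED_MAIN_TABLES = Set['example_existing_table'].freeze

# Loads every table documentation file found in the given directory.
def load_table_docs(dir)
  Dir.glob(File.join(dir, '*.yml')).sort.map do |path|
    YAML.safe_load(File.read(path))
  end
end

# Returns the names of tables that use gitlab_main without being allow-listed.
def tables_with_disallowed_main_schema(docs)
  docs
    .select { |doc| doc['gitlab_schema'] == 'gitlab_main' }
    .map { |doc| doc['table_name'] }
    .reject { |name| GRANDFATHERED_MAIN_TABLES.include?(name) }
end

# In-memory stand-ins for parsed db/docs files.
docs = [
  { 'table_name' => 'example_existing_table', 'gitlab_schema' => 'gitlab_main' },
  { 'table_name' => 'example_new_table', 'gitlab_schema' => 'gitlab_main' },
  { 'table_name' => 'example_cell_table', 'gitlab_schema' => 'gitlab_main_cell' }
]

offenders = tables_with_disallowed_main_schema(docs)
puts offenders.inspect # prints ["example_new_table"]
```

A spec built on this would presumably call `load_table_docs` on the `db/docs` directory and fail when the list of offenders is not empty, naming each table in the failure message so the developer knows which file to change.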
Communication:
After this spec is implemented, it is essential to communicate the change to all teams, because any team can add a new table, and we do not want them to be surprised by this failure or not know how to resolve it.