From 0063a8f8c6167a1fff4d910af2be767771be866c Mon Sep 17 00:00:00 2001
From: Thong Kuah <tkuah@gitlab.com>
Date: Thu, 26 Sep 2024 03:00:53 +0000
Subject: [PATCH 1/2] Clarify work needed for clusterwide tables

Marking a table as `gitlab_main_clusterwide` is just the first step. We will need to do more. Sketch what this additional work entails.
---
 doc/development/cells/index.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/doc/development/cells/index.md b/doc/development/cells/index.md
index c5fc80879eb81f..3ec32d6c716f48 100644
--- a/doc/development/cells/index.md
+++ b/doc/development/cells/index.md
@@ -12,11 +12,16 @@ For background of GitLab Cells, refer to the [design document](https://handbook.
 
 Depending on the use case, your feature may be [cell-local or clusterwide](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/cells/#how-do-i-decide-whether-to-move-my-feature-to-the-cluster-cell-or-organization-level) and hence the tables used for the feature should also use the appropriate schema.
 
-When you choose the appropriate schema for tables, consider the following guidelines as part of the [Cells](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/cells/) architecture:
+When you choose the appropriate [schema](../database/multiple_databases.md#gitlab-schema) for tables, consider the following guidelines as part of the [Cells](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/cells/) architecture:
 
 - Default to `gitlab_main_cell`: We expect most tables to be assigned to the `gitlab_main_cell` schema by default. Choose this schema if the data in the table is related to `projects` or `namespaces`.
 - Consult with the Tenant Scale group: If you believe that the `gitlab_main_clusterwide` schema is more suitable for a table, seek approval from the Tenant Scale group. This is crucial because it has scaling implications and may require reconsideration of the schema choice.
 
+Tables with `gitlab_main_clusterwide` schema will need additional work to be replicated to other / all cells.
+The replication strategy will likely be different for each case, but will involve internal APIs.
+The application may also need to be modified to restrict writes to prevent conflicts.
+We may also ask teams to update tables from `gitlab_main_clusterwide` to `gitlab_main_cell` as required.
+
 To understand how existing tables are classified, you can use [this dashboard](https://manojmj.gitlab.io/tenant-scale-schema-progress/).
 
 After a schema has been assigned, the merge request pipeline might fail due to one or more of the following reasons, which can be rectified by following the linked guidelines:
-- 
GitLab

From 772f2c520bf9f0ee1716da1c07aefa759a7b0de7 Mon Sep 17 00:00:00 2001
From: Thong Kuah <tkuah@gitlab.com>
Date: Thu, 26 Sep 2024 08:48:14 +0000
Subject: [PATCH 2/2] Apply reviewer feedback

---
 doc/development/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/development/cells/index.md b/doc/development/cells/index.md
index 3ec32d6c716f48..77dac8f737f6a8 100644
--- a/doc/development/cells/index.md
+++ b/doc/development/cells/index.md
@@ -20,7 +20,7 @@ When you choose the appropriate [schema](../database/multiple_databases.md#gitla
 Tables with `gitlab_main_clusterwide` schema will need additional work to be replicated to other / all cells.
 The replication strategy will likely be different for each case, but will involve internal APIs.
 The application may also need to be modified to restrict writes to prevent conflicts.
-We may also ask teams to update tables from `gitlab_main_clusterwide` to `gitlab_main_cell` as required.
+We may also ask teams to update tables from `gitlab_main_clusterwide` to `gitlab_main_cell` as required, which also might require adding sharding keys to these tables.
 
 To understand how existing tables are classified, you can use [this dashboard](https://manojmj.gitlab.io/tenant-scale-schema-progress/).
 
-- 
GitLab