From 162ca66aef3c3a273d071916186a02e49a9407d3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kamil=20Trzci=C5=84ski?= <ayufan@ayufan.eu> Date: Thu, 27 Apr 2023 19:35:09 +0200 Subject: [PATCH 01/16] Extend Cells blueprint with Work streams, Availability and Iterations - This adds Work streams that describe Key coherent changes - Availability what we expect at each level (Experiment, Beta and GA) - Iterations on what we want to pick at a given iteration --- doc/architecture/blueprints/cells/index.md | 279 ++++++++++++++++++++- 1 file changed, 272 insertions(+), 7 deletions(-) diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md index ab7d91f24e2a5c..20b5fef4ccb590 100644 --- a/doc/architecture/blueprints/cells/index.md +++ b/doc/architecture/blueprints/cells/index.md @@ -22,15 +22,280 @@ For more information about Cells, see also: - [Goals](goals.md) - [Cross-section impact](impact.md) +## Work streams + +We can't ship the entire Cells architecture in one go - it is too large. +Instead, we are defining key work streams required by the project. + +Not all objectives needs to be fullfiled to reach production readiness. +It is expected that some objectives will not be completed for GA, +but will be enough to run Cells in production. + +### 1. Data access layer + +Before Cells can be run in production we need to prepare codebase to accept Cells architecture. +This means: allowing data sharing between Cells, +updating tooling for discovering cross Cells data traversal, defining code practices +for cross Cells data traversal, analyzing data model to defining the data affinity. + +Under this objective the following steps are expected to happen: + +1. **Allow to share cluster-wide data with database-level data access layer.** + + Cells can connect to database containing shared data. Example: instance settings, users, routing information. + +1. 
**Evaluate efficiency of database-level access vs API-oriented access layer.** + + Re-consider consquences of database-level data access for data migration, resillency of updates and resillency of interconnected systems when we share only subset of data. + +1. **Cluster unique identifiers.** + + All objects have unique identifier that can be used to access data across cluster. The IDs for allocated projecs, issues and any oher objects are cluster unique. + +1. **Cluster wide-deletions.** + + Entities deleted in Cell 2, if they are cross-referenced are properly deleted or nullified across cluster. We will likely re-use existing [loose foreign keys](../../../development/database/loose_foreign_keys.md) to extend it with cross-cells data removal. + +1. **Data access layer.** + + Ensure that a stable data-access (versioned) layer is implemented that allows to share cluster-wide data. + +1. **Database migration.** + + Ensure that migrations can be run independently between Cells, and we safely handle migrations of shared data in a way that does not impact other Cells. + +### 2. Essential workflows + +To make Cells viable we require to define and support essential workflows before we can consider the Cells +to be of beta quality. Essential workflows are meant to cover majority of application functionality +that makes the product in major part useable, with possible caveats. + +The currently taken approach is to define workflows top to bottom. +The order defines presumed priority of the items. +This list is not exhaustive as we would be expecting +other teams to help and fix their workflows after +initial phase where we fix the most fundamental ones. + +It is expected that all features defined below are supported by Cells to consider project to Beta ready. +In below cases the workflows define a set of tables to be properly attributed by the feature. +For some it means having to break down table thats usage is ambigious. 
Example: `uploads` are used to store user avatars, and as well uploaded attachments +for comments. It would be expected that `uploads` is split into `uploads` +(as describing group/project-level attachments) and `global_uploads` (as describing ex. user avatars). + +1. **Instance-wide settings are shared across cluster.** + + The Admin Area section for most part is shared across cluster. + +1. **User accounts are shared across cluster.** + + The purpose is to make `users` cluster-wide. + +1. **User can create group.** + + The purpose is to perform targetted decomposition of `users` and `namespaces`. Since the `namespaces` will be stored cell-local. + +1. **User can create project.** + + The purpose is to perform targetted decomposition of `users` and `projects`. Since the `projects` will be stored cell-local. + +1. **User can change profile avatar that is shared in cluster.** + + The purpose is to fix global uploads that are shared in cluster. + +1. **User can push to Git repository.** + + The purpose is to ensure that essential joins from projects table are properly attributed to be + cell-local, and as result essential Git workflow is supported. + +1. **User can run CI pipeline.** + + The purpose is that `ci_pipelines` (like `ci_stages`, `ci_builds`, `ci_job_artifacts`) and adjacent tables are properly attribute to be cell-local. + +1. **User can create issue, merge request and merge it after it is green.** + + The purpose is to ensure that `issues` and `merge requests` are properly attributed to be `cell-local`. + +1. **User can manage group and project members.** + + The `members` table is properly attributed to be either `cell-local` and `cluster-wide`. + +1. **User can manage instance-wide runners.** + + The purpose is to scope all CI Runners to be cell-local. So, instance-wide runners in fact becomes cell-local runners. The expectation would be that we provide user interface to overview and manage all runners per-cell, instead of per-cluster. + +1. 
**User is part of organization and can only see information from the organization.** + + The purpose is to have many organizations per-cell, but never have a single organization spaning across many Cells. This is required to ensure that information shown within organization are isolated and does not require fetching information from another cells. + +### 3. Additional workflows + +Those are additional workflows, that some of them might need to be supported depending on decision. +This list is not exhaustive of work needed to be done. + +1. **User can use all group-level features.** +1. **User can use all project-level features.** +1. **User can share groups with other groups in organization.** +1. **User can create system web hook.** +1. **User can upload and manage packages.** +1. **User can manage security detection features.** +1. **User can manage Kubernetes integration.** +1. TBD + +### 4. Routing layer + +The routing layer is meant to offer consistent user experience where all Cells are presented +under a single domain (ex. `gitlab.com`) instead of +having to navigate to separate domains. + +User will able to use `https://gitlab.com` to access Cell-enabled GitLab. Depending +on URL access it will be transparently proxied to the correct Cell that can serve this particular +information. For example: + +- All requests going to `https://gitlab.com/users/sign_in` are randomly distributed to all Cells. +- All requests going to `https://gitlab.com/gitlab-org/gitlab/-/tree/master` are always directed to Cell 5. +- All requests going to `https://gitlab.com/my-username/my-project` are always directed to Cell 1. + +1. **Technology.** + + We make a decision in what technology the routing service is written. + The choice is dependent on best performing language, and expected way + and place of deployment of routing layer. If it is required to make + the service multi-cloud it might be required to deploy it to CDN provider. 
+ Then the service needs to be written using technology compatible with CDN provider. + +1. **Cell discovery.** + + Routing service needs to be able to discover and monitor the health of all Cells. + +1. **Router endpoints classification.** + + The stateless routing service will fetch and cache information about endpoints + from one of the Cells. We need to implement a protocol that will allow us to + accurately describe incoming request (it's fingerprint) so it can be classified + by one of the Cells, and results of that can be cached. We need to also implement + mechanism for negative cache and cache eviction. + +1. **GraphQL and other ambigious endpoints.** + + Most of endpoints have unique sharding key: the organization, which directly, + or indirectly (via group or project) can be used to classify endpoints. + Some endpoints are ambigious in its usage (they don't encode sharding key) + or the sharding key is stored deep in payload. In such cases we need to make + decision how to handle endpoints like `/api/graphql`. + +### 5. Cell deployment + +We will run many Cells. To ease management of them we need to have consistent +deployment procedures for Cells. Which includes a way to deploy, manage, migrate, +and monitor. + +We are very likely to use tooling made for [GitLab Dedicated](https://about.gitlab.com/dedicated/) +with its control planes. + +1. **Extend GitLab Dedicated to support GCP.** +1. TBD + +### 6. Migration + +Once we reach production and we are able to store new organizations on a new Cells we need +to be able to divide big cells into many smaller ones. + +1. **Use GitLab Geo to support Cells.** + + The purpose of that would be using GitLab Geo to clone cells. + +1. **Split Cells by cloning them.** + + Once Cell is cloned we change routing information for organizations. + Organization will encode `cell_id`. When we update `cell_id` it will automatically + make the given Cell to be authoritative to handle the traffic for given organization. + +1. 
**Recycle data from Cells.** + + Since the organization is now stored on many cells, once we change `cell_id` + we will have to remove data from all other cells based on `organization_id`. + +## Availability of the feature + +We are following the [Support for Experiment, Beta, and Generally Available features](https://docs.gitlab.com/ee/policy/alpha-beta-support.html). + +### 1. Experiment + +Expectations: + +- We can deploy Cell on staging by using separate domain (ex. `cell2.staging.gitlab.com`) + using [cell deployment](#5-cell-deployment) tooling. +- User can create organization, group and project, and run some of [essential workflows](#2-essential-workflows). +- It is not expected to be able to run router to serve all requests under single domain. +- We are ok with data loss from stored on additional provisioned Cells. +- We expect to tear-down and create many new Cells to validate tooling. + +### 2. Beta + +Expectations: + +- We can run many Cells under single domain (ex. `stating.gitlab.com`). +- All features defined in [essential workflows](#2-essential-workflows) are supported. +- Not all aspects of [Routing layer](#4-routing-layer) are finalized. +- We expect additional Cells to be stable with minimal data loss. + +### 3. GA + +Expectations: + +- We can run many Cells under single domain (ex. `stating.gitlab.com`). +- All features defined in [essential workflows](#2-essential-workflows) are supported. +- All features of [routing layer](#4-routing-layer) are supported. +- Most of [additional workflows](#3-additional-workflows) are supported. +- We don't expect to support any of [migration](#6-migration) aspects. + +### 4. Post GA + +Expectations: + +- We support all [additional workflows]. +- We can [migrate](#6-migration) existing organizations onto new Cells. + ## Iteration plan -We can't ship the entire Cells architecture in one go - it is too large. Instead, we are adopting an iteration plan that provides value along the way. 
+The iterations delivered will focus on solving particular steps of a given +key work streams. + +It is expected that initial iterations will rather +be slow since they require substantionally more +changes to prepare codebase for data split. + +The single iteration describe a single quarter worth of work. + +1. Iteration 1 - FY24Q1 + + - Data access layer: Initial Admin Area settings are shared across cluster. + - Essential workflows: Allow to share cluster-wide data with database-level data access layer + +1. Iteration 2 - FY24Q2 + + - Essential workflows: User accounts are shared across cluster. + - Essential workflows: User can create group. + +1. Iteration 3 - FY24Q3 + + - Essential workflows: User can create project. + - Essential workflows: User can push to Git repository. + - Cell deployment: Extend GitLab Dedicated to support GCP + - Routing: Technology. + +1. Iteration 4 - FY24Q4 + + - Essential workflows: User can run CI pipeline. + - Essential workflows: User can create issue, merge request and merge it after it is green. + - Data access layer: Evaluate efficiency of database-level access vs API-oriented access layer + - Data access layer: Cluster unique identifiers + - Routing: Cell discovery. + - Routing: Router endpoints classification. + +1. Iteration 5 - FY25Q1 -1. Introduce organizations -1. Migrate existing top-level groups to organizations -1. Create new organizations on `cell` -1. Migrate existing organizations from `cell` to `cell` -1. Add additional Cell capabilities (DR, Regions) + - TBD ## Technical Proposals @@ -46,7 +311,7 @@ This section links all different technical proposals that are being evaluated. The Cells architecture will impact many features requiring some of them to be rewritten, or changed significantly. This is the list of known affected features with the proposed solutions. 
-- [Cells: Git Access](cells-feature-git-access.md) +- [Cells: Git Access](cells-feature-Git-access.md) - [Cells: Data Migration](cells-feature-data-migration.md) - [Cells: Database Sequences](cells-feature-database-sequences.md) - [Cells: GraphQL](cells-feature-graphql.md) -- GitLab From 7153b53e05d7f069d6f201c852ea3fe44273c942 Mon Sep 17 00:00:00 2001 From: Lorena Ciutacu <lciutacu@gitlab.com> Date: Fri, 28 Apr 2023 09:11:43 +0000 Subject: [PATCH 02/16] Apply @lciutacu comments --- doc/architecture/blueprints/cells/index.md | 128 +++++++++++---------- 1 file changed, 65 insertions(+), 63 deletions(-) diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md index 20b5fef4ccb590..c3a5041b6b1380 100644 --- a/doc/architecture/blueprints/cells/index.md +++ b/doc/architecture/blueprints/cells/index.md @@ -33,54 +33,57 @@ but will be enough to run Cells in production. ### 1. Data access layer -Before Cells can be run in production we need to prepare codebase to accept Cells architecture. -This means: allowing data sharing between Cells, -updating tooling for discovering cross Cells data traversal, defining code practices -for cross Cells data traversal, analyzing data model to defining the data affinity. +Before Cells can be run in production we need to prepare the codebase to accept the Cells architecture. +This preparation involves: -Under this objective the following steps are expected to happen: +- Allowing data sharing between Cells. +- Updating the tooling for discovering cross Cells data traversal. +- Defining code practices for cross Cells data traversal. +- Analyzing the data model to define the data affinity. + +Under this objective the following steps are expected: 1. **Allow to share cluster-wide data with database-level data access layer.** - Cells can connect to database containing shared data. Example: instance settings, users, routing information. + Cells can connect to a database containing shared data. 
For example: instance settings, users, or routing information. -1. **Evaluate efficiency of database-level access vs API-oriented access layer.** +1. **Evaluate the efficiency of database-level access vs. API-oriented access layer.** - Re-consider consquences of database-level data access for data migration, resillency of updates and resillency of interconnected systems when we share only subset of data. + Reconsider the consequences of database-level data access for data migration, resiliency of updates and of interconnected systems when we share only subset of data. -1. **Cluster unique identifiers.** +1. **Cluster-unique identifiers** - All objects have unique identifier that can be used to access data across cluster. The IDs for allocated projecs, issues and any oher objects are cluster unique. + Every object has a unique identifier that can be used to access data across the cluster. The IDs for allocated projects, issues and any other objects are cluster-unique. -1. **Cluster wide-deletions.** +1. **Cluster-wide deletions** - Entities deleted in Cell 2, if they are cross-referenced are properly deleted or nullified across cluster. We will likely re-use existing [loose foreign keys](../../../development/database/loose_foreign_keys.md) to extend it with cross-cells data removal. + If entities deleted in Cell 2 are cross-referenced, they are properly deleted or nullified across clusters. We will likely re-use existing [loose foreign keys](../../../development/database/loose_foreign_keys.md) to extend it with cross-cells data removal. -1. **Data access layer.** +1. **Data access layer** - Ensure that a stable data-access (versioned) layer is implemented that allows to share cluster-wide data. + Ensure that a stable data-access (versioned) layer that allows to share cluster-wide data is implemented. -1. **Database migration.** +1. 
**Database migration** Ensure that migrations can be run independently between Cells, and we safely handle migrations of shared data in a way that does not impact other Cells. ### 2. Essential workflows To make Cells viable we require to define and support essential workflows before we can consider the Cells -to be of beta quality. Essential workflows are meant to cover majority of application functionality -that makes the product in major part useable, with possible caveats. +to be of Beta quality. Essential workflows are meant to cover the majority of application functionality +that makes the product mostly useable, but with some caveats. -The currently taken approach is to define workflows top to bottom. -The order defines presumed priority of the items. +The current approach is to define workflows from top to bottom. +The order defines the presumed priority of the items. This list is not exhaustive as we would be expecting other teams to help and fix their workflows after -initial phase where we fix the most fundamental ones. +the initial phase, in which we fix the fundamental ones. -It is expected that all features defined below are supported by Cells to consider project to Beta ready. -In below cases the workflows define a set of tables to be properly attributed by the feature. -For some it means having to break down table thats usage is ambigious. Example: `uploads` are used to store user avatars, and as well uploaded attachments +To consider a project ready for the Beta phase, it is expected that all features defined below are supported by Cells. +In the cases listed below, the workflows define a set of tables to be properly attributed to the feature. +In some cases, a table with an ambiguous usage has to be broken down. For example: `uploads` are used to store user avatars, as well uploaded attachments for comments. 
It would be expected that `uploads` is split into `uploads` -(as describing group/project-level attachments) and `global_uploads` (as describing ex. user avatars). +(describing group/project-level attachments) and `global_uploads` (describing, for example, user avatars). 1. **Instance-wide settings are shared across cluster.** @@ -92,11 +95,11 @@ for comments. It would be expected that `uploads` is split into `uploads` 1. **User can create group.** - The purpose is to perform targetted decomposition of `users` and `namespaces`. Since the `namespaces` will be stored cell-local. + The purpose is to perform a targeted decomposition of `users` and `namespaces`, because the `namespaces` will be stored locally in the cell. 1. **User can create project.** - The purpose is to perform targetted decomposition of `users` and `projects`. Since the `projects` will be stored cell-local. + The purpose is to perform a targeted decomposition of `users` and `projects`, because the `projects` will be stored locally in the cell. 1. **User can change profile avatar that is shared in cluster.** @@ -104,32 +107,32 @@ for comments. It would be expected that `uploads` is split into `uploads` 1. **User can push to Git repository.** - The purpose is to ensure that essential joins from projects table are properly attributed to be - cell-local, and as result essential Git workflow is supported. + The purpose is to ensure that essential joins from the projects table are properly attributed to be + cell-local, and as a result the essential Git workflow is supported. 1. **User can run CI pipeline.** - The purpose is that `ci_pipelines` (like `ci_stages`, `ci_builds`, `ci_job_artifacts`) and adjacent tables are properly attribute to be cell-local. + The purpose is that `ci_pipelines` (like `ci_stages`, `ci_builds`, `ci_job_artifacts`) and adjacent tables are properly attributed to be cell-local. -1. **User can create issue, merge request and merge it after it is green.** +1. 
**User can create issue, merge request, and merge it after it is green.** The purpose is to ensure that `issues` and `merge requests` are properly attributed to be `cell-local`. 1. **User can manage group and project members.** - The `members` table is properly attributed to be either `cell-local` and `cluster-wide`. + The `members` table is properly attributed to be either `cell-local` or `cluster-wide`. 1. **User can manage instance-wide runners.** - The purpose is to scope all CI Runners to be cell-local. So, instance-wide runners in fact becomes cell-local runners. The expectation would be that we provide user interface to overview and manage all runners per-cell, instead of per-cluster. + The purpose is to scope all CI Runners to be cell-local. Instance-wide runners in fact become cell-local runners. The expectation is to provide a user interface to overview and manage all runners per-cell, instead of per-cluster. 1. **User is part of organization and can only see information from the organization.** - The purpose is to have many organizations per-cell, but never have a single organization spaning across many Cells. This is required to ensure that information shown within organization are isolated and does not require fetching information from another cells. + The purpose is to have many organizations per-cell, but never have a single organization spanning across many Cells. This is required to ensure that information shown within an organization is isolated, and does not require fetching information from other cells. ### 3. Additional workflows -Those are additional workflows, that some of them might need to be supported depending on decision. +Some of these additional workflows might need to be supported, depending on the group decision. This list is not exhaustive of work needed to be done. 1. **User can use all group-level features.** @@ -143,12 +146,12 @@ This list is not exhaustive of work needed to be done. ### 4. 
Routing layer -The routing layer is meant to offer consistent user experience where all Cells are presented -under a single domain (ex. `gitlab.com`) instead of +The routing layer is meant to offer a consistent user experience where all Cells are presented +under a single domain (for example, `gitlab.com`), instead of having to navigate to separate domains. -User will able to use `https://gitlab.com` to access Cell-enabled GitLab. Depending -on URL access it will be transparently proxied to the correct Cell that can serve this particular +The user will able to use `https://gitlab.com` to access Cell-enabled GitLab. Depending +on the URL access, it will be transparently proxied to the correct Cell that can serve this particular information. For example: - All requests going to `https://gitlab.com/users/sign_in` are randomly distributed to all Cells. @@ -157,36 +160,35 @@ information. For example: 1. **Technology.** - We make a decision in what technology the routing service is written. + We decide what technology the routing service is written in. The choice is dependent on best performing language, and expected way and place of deployment of routing layer. If it is required to make the service multi-cloud it might be required to deploy it to CDN provider. - Then the service needs to be written using technology compatible with CDN provider. + Then the service needs to be written using a technology compatible with the CDN provider. 1. **Cell discovery.** - Routing service needs to be able to discover and monitor the health of all Cells. + The routing service needs to be able to discover and monitor the health of all Cells. 1. **Router endpoints classification.** The stateless routing service will fetch and cache information about endpoints from one of the Cells. We need to implement a protocol that will allow us to - accurately describe incoming request (it's fingerprint) so it can be classified - by one of the Cells, and results of that can be cached. 
We need to also implement - mechanism for negative cache and cache eviction. + accurately describe the incoming request (its fingerprint), so it can be classified + by one of the Cells, and the results of that can be cached. We also need to implement + a mechanism for negative cache and cache eviction. 1. **GraphQL and other ambigious endpoints.** - Most of endpoints have unique sharding key: the organization, which directly, + Most endpoints have a unique sharding key: the organization, which directly or indirectly (via group or project) can be used to classify endpoints. - Some endpoints are ambigious in its usage (they don't encode sharding key) - or the sharding key is stored deep in payload. In such cases we need to make - decision how to handle endpoints like `/api/graphql`. + Some endpoints are ambiguous in their usage (they don't encode the sharding key), + or the sharding key is stored deep in the payload. In these cases, we need to decide how to handle endpoints like `/api/graphql`. ### 5. Cell deployment -We will run many Cells. To ease management of them we need to have consistent -deployment procedures for Cells. Which includes a way to deploy, manage, migrate, +We will run many Cells. To manage them easier, we need to have consistent +deployment procedures for Cells, including a way to deploy, manage, migrate, and monitor. We are very likely to use tooling made for [GitLab Dedicated](https://about.gitlab.com/dedicated/) @@ -197,12 +199,12 @@ with its control planes. ### 6. Migration -Once we reach production and we are able to store new organizations on a new Cells we need +When we reach production and are able to store new organizations on new Cells, we need to be able to divide big cells into many smaller ones. 1. **Use GitLab Geo to support Cells.** - The purpose of that would be using GitLab Geo to clone cells. + The purpose is to use GitLab Geo to clone cells. 1. 
**Split Cells by cloning them.** @@ -217,7 +219,7 @@ to be able to divide big cells into many smaller ones. ## Availability of the feature -We are following the [Support for Experiment, Beta, and Generally Available features](https://docs.gitlab.com/ee/policy/alpha-beta-support.html). +We are following the [Support for Experiment, Beta, and Generally Available features](../../../policy/alpha-beta-support.md). ### 1. Experiment @@ -243,7 +245,7 @@ Expectations: Expectations: -- We can run many Cells under single domain (ex. `stating.gitlab.com`). +- We can run many Cells under a single domain (for example, `stating.gitlab.com`). - All features defined in [essential workflows](#2-essential-workflows) are supported. - All features of [routing layer](#4-routing-layer) are supported. - Most of [additional workflows](#3-additional-workflows) are supported. @@ -253,19 +255,19 @@ Expectations: Expectations: -- We support all [additional workflows]. +- We support all [additional workflows](#3-additional-workflows). - We can [migrate](#6-migration) existing organizations onto new Cells. ## Iteration plan -The iterations delivered will focus on solving particular steps of a given -key work streams. +The delivered iterations will focus on solving particular steps of a given +key work stream. It is expected that initial iterations will rather -be slow since they require substantionally more -changes to prepare codebase for data split. +be slow, because they require substantially more +changes to prepare the codebase for data split. -The single iteration describe a single quarter worth of work. +One iteration describes one quarter's worth of work. 1. Iteration 1 - FY24Q1 @@ -287,9 +289,9 @@ The single iteration describe a single quarter worth of work. 1. Iteration 4 - FY24Q4 - Essential workflows: User can run CI pipeline. - - Essential workflows: User can create issue, merge request and merge it after it is green. 
- - Data access layer: Evaluate efficiency of database-level access vs API-oriented access layer - - Data access layer: Cluster unique identifiers + - Essential workflows: User can create issue, merge request, and merge it after it is green. + - Data access layer: Evaluate the efficiency of database-level access vs. API-oriented access layer + - Data access layer: Cluster-unique identifiers. - Routing: Cell discovery. - Routing: Router endpoints classification. -- GitLab From 5d642a53a4db55859f845c74d885adfb52273c3d Mon Sep 17 00:00:00 2001 From: Lorena Ciutacu <lciutacu@gitlab.com> Date: Fri, 28 Apr 2023 09:18:38 +0000 Subject: [PATCH 03/16] Apply 1 suggestion(s) to 1 file(s) --- doc/architecture/blueprints/cells/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md index c3a5041b6b1380..f96157fd1c504d 100644 --- a/doc/architecture/blueprints/cells/index.md +++ b/doc/architecture/blueprints/cells/index.md @@ -27,7 +27,7 @@ For more information about Cells, see also: We can't ship the entire Cells architecture in one go - it is too large. Instead, we are defining key work streams required by the project. -Not all objectives needs to be fullfiled to reach production readiness. +Not all objectives need to be fulfilled to reach production readiness. It is expected that some objectives will not be completed for GA, but will be enough to run Cells in production. 
-- GitLab From 3ba1d51936ed881d34dd51e4ee39da4e8a780e98 Mon Sep 17 00:00:00 2001 From: Christina Lohr <clohr@gitlab.com> Date: Fri, 28 Apr 2023 12:02:10 +0000 Subject: [PATCH 04/16] Grammar correction --- doc/architecture/blueprints/cells/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md index f96157fd1c504d..18d6c97fcf8df9 100644 --- a/doc/architecture/blueprints/cells/index.md +++ b/doc/architecture/blueprints/cells/index.md @@ -49,7 +49,7 @@ Under this objective the following steps are expected: 1. **Evaluate the efficiency of database-level access vs. API-oriented access layer.** - Reconsider the consequences of database-level data access for data migration, resiliency of updates and of interconnected systems when we share only subset of data. + Reconsider the consequences of database-level data access for data migration, resiliency of updates and of interconnected systems when we share only a subset of data. 1. **Cluster-unique identifiers** -- GitLab From b65e09a21dc7b5af2bef2fcf7d9c7bb6cff423b7 Mon Sep 17 00:00:00 2001 From: Christina Lohr <clohr@gitlab.com> Date: Fri, 28 Apr 2023 12:03:47 +0000 Subject: [PATCH 05/16] Capitalization correction --- doc/architecture/blueprints/cells/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md index 18d6c97fcf8df9..1aeaca8d18c811 100644 --- a/doc/architecture/blueprints/cells/index.md +++ b/doc/architecture/blueprints/cells/index.md @@ -57,7 +57,7 @@ Under this objective the following steps are expected: 1. **Cluster-wide deletions** - If entities deleted in Cell 2 are cross-referenced, they are properly deleted or nullified across clusters. We will likely re-use existing [loose foreign keys](../../../development/database/loose_foreign_keys.md) to extend it with cross-cells data removal. 
+   If entities deleted in Cell 2 are cross-referenced, they are properly deleted or nullified across clusters. We will likely re-use existing [loose foreign keys](../../../development/database/loose_foreign_keys.md) to extend it with cross-Cells data removal.
 
 1. **Data access layer**
 
-- 
GitLab

From c2354da1638e689ddef96388ee67037e5037b55f Mon Sep 17 00:00:00 2001
From: Christina Lohr <clohr@gitlab.com>
Date: Fri, 28 Apr 2023 12:07:49 +0000
Subject: [PATCH 06/16] Grammar correction

---
 doc/architecture/blueprints/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 1aeaca8d18c811..68a5a21f384a90 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -81,7 +81,7 @@ the initial phase, in which we fix the fundamental ones.
 
 To consider a project ready for the Beta phase, it is expected that all features defined below are supported by Cells.
 In the cases listed below, the workflows define a set of tables to be properly attributed to the feature.
-In some cases, a table with an ambiguous usage has to be broken down. For example: `uploads` are used to store user avatars, as well uploaded attachments
+In some cases, a table with an ambiguous usage has to be broken down. For example: `uploads` are used to store user avatars, as well as uploaded attachments
 for comments. It would be expected that `uploads` is split into `uploads`
 (describing group/project-level attachments) and `global_uploads` (describing, for example, user avatars).
 
-- 
GitLab

From b79f8374474d1d2876b9d34b9d5c60a6f8568350 Mon Sep 17 00:00:00 2001
From: Christina Lohr <clohr@gitlab.com>
Date: Fri, 28 Apr 2023 12:08:27 +0000
Subject: [PATCH 07/16] Grammar correction

---
 doc/architecture/blueprints/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 68a5a21f384a90..bb8addcb6271d2 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -87,7 +87,7 @@ for comments. It would be expected that `uploads` is split into `uploads`
 
 1. **Instance-wide settings are shared across cluster.**
 
-   The Admin Area section for most part is shared across cluster.
+   The Admin Area section for most part is shared across a cluster.
 
 1. **User accounts are shared across cluster.**
 
-- 
GitLab

From b71a5b409e01dd908cb8183a6096326de8c2d53a Mon Sep 17 00:00:00 2001
From: Christina Lohr <clohr@gitlab.com>
Date: Fri, 28 Apr 2023 12:19:12 +0000
Subject: [PATCH 08/16] Grammar and capitalization updates

---
 doc/architecture/blueprints/cells/index.md | 46 +++++++++++-----------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index bb8addcb6271d2..35538ade72e756 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -95,11 +95,11 @@ for comments. It would be expected that `uploads` is split into `uploads`
 
 1. **User can create group.**
 
-   The purpose is to perform a targeted decomposition of `users` and `namespaces`, because the `namespaces` will be stored locally in the cell.
+   The purpose is to perform a targeted decomposition of `users` and `namespaces`, because the `namespaces` will be stored locally in the Cell.
 
 1. **User can create project.**
 
-   The purpose is to perform a targeted decomposition of `users` and `projects`, because the `projects` will be stored locally in the cell.
+   The purpose is to perform a targeted decomposition of `users` and `projects`, because the `projects` will be stored locally in the Cell.
 
 1. **User can change profile avatar that is shared in cluster.**
 
@@ -108,27 +108,27 @@ for comments. It would be expected that `uploads` is split into `uploads`
 1. **User can push to Git repository.**
 
    The purpose is to ensure that essential joins from the projects table are properly attributed to be
-   cell-local, and as a result the essential Git workflow is supported.
+   Cell-local, and as a result the essential Git workflow is supported.
 
 1. **User can run CI pipeline.**
 
-   The purpose is that `ci_pipelines` (like `ci_stages`, `ci_builds`, `ci_job_artifacts`) and adjacent tables are properly attributed to be cell-local.
+   The purpose is that `ci_pipelines` (like `ci_stages`, `ci_builds`, `ci_job_artifacts`) and adjacent tables are properly attributed to be Cell-local.
 
 1. **User can create issue, merge request, and merge it after it is green.**
 
-   The purpose is to ensure that `issues` and `merge requests` are properly attributed to be `cell-local`.
+   The purpose is to ensure that `issues` and `merge requests` are properly attributed to be `Cell-local`.
 
 1. **User can manage group and project members.**
 
-   The `members` table is properly attributed to be either `cell-local` or `cluster-wide`.
+   The `members` table is properly attributed to be either `Cell-local` or `cluster-wide`.
 
 1. **User can manage instance-wide runners.**
 
-   The purpose is to scope all CI Runners to be cell-local. Instance-wide runners in fact become cell-local runners. The expectation is to provide a user interface to overview and manage all runners per-cell, instead of per-cluster.
+   The purpose is to scope all CI Runners to be Cell-local. Instance-wide runners in fact become Cell-local runners. The expectation is to provide a user interface view and manage all runners per Cell, instead of per cluster.
 
 1. **User is part of organization and can only see information from the organization.**
 
-   The purpose is to have many organizations per-cell, but never have a single organization spanning across many Cells. This is required to ensure that information shown within an organization is isolated, and does not require fetching information from other cells.
+   The purpose is to have many organizations per Cell, but never have a single organization spanning across many Cells. This is required to ensure that information shown within an organization is isolated, and does not require fetching information from other Cells.
 
 ### 3. Additional workflows
 
@@ -137,8 +137,8 @@ This list is not exhaustive of work needed to be done.
 
 1. **User can use all group-level features.**
 1. **User can use all project-level features.**
-1. **User can share groups with other groups in organization.**
-1. **User can create system web hook.**
+1. **User can share groups with other groups in an organization.**
+1. **User can create system webhook.**
 1. **User can upload and manage packages.**
 1. **User can manage security detection features.**
 1. **User can manage Kubernetes integration.**
@@ -161,9 +161,9 @@ information. For example:
 1. **Technology.**
 
    We decide what technology the routing service is written in.
-   The choice is dependent on best performing language, and expected way
-   and place of deployment of routing layer. If it is required to make
-   the service multi-cloud it might be required to deploy it to CDN provider.
+   The choice is dependent on the best performing language, and the expected way
+   and place of deployment of the routing layer. If it is required to make
+   the service multi-cloud it might be required to deploy it to the CDN provider.
    Then the service needs to be written using a technology compatible with the CDN provider.
 
 1. **Cell discovery.**
@@ -181,7 +181,7 @@ information. For example:
 1. **GraphQL and other ambigious endpoints.**
 
    Most endpoints have a unique sharding key: the organization, which directly
-   or indirectly (via group or project) can be used to classify endpoints.
+   or indirectly (via a group or project) can be used to classify endpoints.
    Some endpoints are ambiguous in their usage (they don't encode the sharding key), or the sharding key is stored deep in the payload. In these cases, we need to decide how to handle endpoints like `/api/graphql`.
@@ -200,11 +200,11 @@ with its control planes.
 ### 6. Migration
 
 When we reach production and are able to store new organizations on new Cells, we need
-to be able to divide big cells into many smaller ones.
+to be able to divide big Cells into many smaller ones.
 
 1. **Use GitLab Geo to support Cells.**
 
-   The purpose is to use GitLab Geo to clone cells.
+   The purpose is to use GitLab Geo to clone Cells.
 
 1. **Split Cells by cloning them.**
 
@@ -214,8 +214,8 @@ to be able to divide big cells into many smaller ones.
 
 1. **Recycle data from Cells.**
 
-   Since the organization is now stored on many cells, once we change `cell_id`
-   we will have to remove data from all other cells based on `organization_id`.
+   Since the organization is now stored on many Cells, once we change `cell_id`
+   we will have to remove data from all other Cells based on `organization_id`.
 
 ## Availability of the feature
 
@@ -225,10 +225,10 @@ We are following the [Support for Experiment, Beta, and Generally Available feat
 
 Expectations:
 
-- We can deploy Cell on staging by using separate domain (ex. `cell2.staging.gitlab.com`)
-  using [cell deployment](#5-cell-deployment) tooling.
-- User can create organization, group and project, and run some of [essential workflows](#2-essential-workflows).
-- It is not expected to be able to run router to serve all requests under single domain.
+- We can deploy a Cell on staging by using a separate domain (ex. `cell2.staging.gitlab.com`)
+  using [Cell deployment](#5-cell-deployment) tooling.
+- User can create organization, group and project, and run some of the [essential workflows](#2-essential-workflows).
+- It is not expected to be able to run a router to serve all requests under a single domain.
 - We are ok with data loss from stored on additional provisioned Cells.
 - We expect to tear-down and create many new Cells to validate tooling.
 
@@ -236,7 +236,7 @@ Expectations:
 
 Expectations:
 
-- We can run many Cells under single domain (ex. `stating.gitlab.com`).
+- We can run many Cells under a single domain (ex. `stating.gitlab.com`).
 - All features defined in [essential workflows](#2-essential-workflows) are supported.
 - Not all aspects of [Routing layer](#4-routing-layer) are finalized.
 - We expect additional Cells to be stable with minimal data loss.
-- 
GitLab

From 7f33ac1498cb9d7f82e3bc8fcab536a91d3ed74e Mon Sep 17 00:00:00 2001
From: Christina Lohr <clohr@gitlab.com>
Date: Fri, 28 Apr 2023 12:24:57 +0000
Subject: [PATCH 09/16] Grammar correction

---
 doc/architecture/blueprints/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 35538ade72e756..9c4ea067013af2 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -210,7 +210,7 @@ to be able to divide big Cells into many smaller ones.
 
    Once Cell is cloned we change routing information for organizations.
    Organization will encode `cell_id`. When we update `cell_id` it will automatically
-   make the given Cell to be authoritative to handle the traffic for given organization.
+   make the given Cell to be authoritative to handle the traffic for the given organization.
 
 1. **Recycle data from Cells.**
 
-- 
GitLab

From 6a6077ee5deef223bf45038aac857964f24dac26 Mon Sep 17 00:00:00 2001
From: Arturo Herrero <arturo.herrero@gmail.com>
Date: Fri, 28 Apr 2023 12:28:38 +0000
Subject: [PATCH 10/16] Clarify abbreviation

---
 doc/architecture/blueprints/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 9c4ea067013af2..80052f571dac99 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -28,7 +28,7 @@ We can't ship the entire Cells architecture in one go - it is too large.
 Instead, we are defining key work streams required by the project.
 
 Not all objectives need to be fulfilled to reach production readiness.
-It is expected that some objectives will not be completed for GA,
+It is expected that some objectives will not be completed for General Availability (GA),
 but will be enough to run Cells in production.
 
 ### 1. Data access layer
-- 
GitLab

From c79b3003d45feede08fbadf1cf8d730123b985b5 Mon Sep 17 00:00:00 2001
From: Christina Lohr <clohr@gitlab.com>
Date: Fri, 28 Apr 2023 13:01:06 +0000
Subject: [PATCH 11/16] Apply 4 suggestion(s) to 1 file(s)

---
 doc/architecture/blueprints/cells/index.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 80052f571dac99..9da5e7857c71d1 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -37,8 +37,8 @@ Before Cells can be run in production we need to prepare the codebase to accept
 This preparation involves:
 
 - Allowing data sharing between Cells.
-- Updating the tooling for discovering cross Cells data traversal.
-- Defining code practices for cross Cells data traversal.
+- Updating the tooling for discovering cross-Cell data traversal.
+- Defining code practices for cross-Cell data traversal.
 - Analyzing the data model to define the data affinity.
 
 Under this objective the following steps are expected:
@@ -155,7 +155,7 @@ on the URL access, it will be transparently proxied to the correct Cell that can
 information. For example:
 
 - All requests going to `https://gitlab.com/users/sign_in` are randomly distributed to all Cells.
-- All requests going to `https://gitlab.com/gitlab-org/gitlab/-/tree/master` are always directed to Cell 5.
+- All requests going to `https://gitlab.com/gitlab-org/gitlab/-/tree/master` are always directed to Cell 5, for example.
 - All requests going to `https://gitlab.com/my-username/my-project` are always directed to Cell 1.
 
 1. **Technology.**
@@ -236,7 +236,7 @@ Expectations:
 
 Expectations:
 
-- We can run many Cells under a single domain (ex. `stating.gitlab.com`).
+- We can run many Cells under a single domain (ex. `staging.gitlab.com`).
 - All features defined in [essential workflows](#2-essential-workflows) are supported.
 - Not all aspects of [Routing layer](#4-routing-layer) are finalized.
 - We expect additional Cells to be stable with minimal data loss.
-- 
GitLab

From 52dbff1b5937a023a132d25b88f34780d21d95f9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kamil=20Trzci=C5=84ski?= <ayufan@ayufan.eu>
Date: Fri, 28 Apr 2023 13:02:42 +0000
Subject: [PATCH 12/16] Apply 1 suggestion(s) to 1 file(s)

---
 doc/architecture/blueprints/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 9da5e7857c71d1..e13e49b249e1b6 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -313,7 +313,7 @@ This section links all different technical proposals that are being evaluated.
 The Cells architecture will impact many features requiring some of them to be rewritten, or changed significantly.
 This is the list of known affected features with the proposed solutions.
 
-- [Cells: Git Access](cells-feature-Git-access.md)
+- [Cells: Git Access](cells-feature-git-access.md)
 - [Cells: Data Migration](cells-feature-data-migration.md)
 - [Cells: Database Sequences](cells-feature-database-sequences.md)
 - [Cells: GraphQL](cells-feature-graphql.md)
-- 
GitLab

From 44ec99506e4ad80e8a73d1b638af6a0b24232516 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kamil=20Trzci=C5=84ski?= <ayufan@ayufan.eu>
Date: Fri, 28 Apr 2023 13:10:40 +0000
Subject: [PATCH 13/16] Clarify sentence

---
 doc/architecture/blueprints/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index e13e49b249e1b6..20103e9b65973d 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -229,7 +229,7 @@ Expectations:
   using [Cell deployment](#5-cell-deployment) tooling.
 - User can create organization, group and project, and run some of the [essential workflows](#2-essential-workflows).
 - It is not expected to be able to run a router to serve all requests under a single domain.
-- We are ok with data loss from stored on additional provisioned Cells.
+- We expect data-loss of data stored on additional Cells.
 - We expect to tear-down and create many new Cells to validate tooling.
 
 ### 2. Beta
-- 
GitLab

From a9f65ad21c7c3edca77e3f9feff895f7c4b74217 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kamil=20Trzci=C5=84ski?= <ayufan@ayufan.eu>
Date: Thu, 4 May 2023 08:32:34 +0000
Subject: [PATCH 14/16] Apply 3 suggestion(s) to 1 file(s)

---
 doc/architecture/blueprints/cells/index.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 20103e9b65973d..54736ee21dc269 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -202,7 +202,7 @@ with its control planes.
 When we reach production and are able to store new organizations on new Cells, we need
 to be able to divide big Cells into many smaller ones.
 
-1. **Use GitLab Geo to support Cells.**
+1. **Use GitLab Geo to clone Cells.**
 
    The purpose is to use GitLab Geo to clone Cells.
 
@@ -212,7 +212,7 @@ to be able to divide big Cells into many smaller ones.
    Organization will encode `cell_id`. When we update `cell_id` it will automatically
    make the given Cell to be authoritative to handle the traffic for the given organization.
 
-1. **Recycle data from Cells.**
+1. **Delete redundant data from previous Cells.**
 
    Since the organization is now stored on many Cells, once we change `cell_id`
    we will have to remove data from all other Cells based on `organization_id`.
@@ -245,7 +245,7 @@ Expectations:
 
 Expectations:
 
-- We can run many Cells under a single domain (for example, `stating.gitlab.com`).
+- We can run many Cells under a single domain (for example, `staging.gitlab.com`).
 - All features defined in [essential workflows](#2-essential-workflows) are supported.
 - All features of [routing layer](#4-routing-layer) are supported.
 - Most of [additional workflows](#3-additional-workflows) are supported.
-- 
GitLab

From fbdc8c04ae5c5ce05cf3f6612586ffbcd485a522 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Kamil=20Trzci=C5=84ski?= <ayufan@ayufan.eu>
Date: Thu, 4 May 2023 10:45:44 +0200
Subject: [PATCH 15/16] More updates

---
 doc/architecture/blueprints/cells/index.md | 30 +++++++++++++++-------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 54736ee21dc269..961e0f2814ef97 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -45,7 +45,8 @@ Under this objective the following steps are expected:
 
 1. **Allow to share cluster-wide data with database-level data access layer.**
 
-   Cells can connect to a database containing shared data. For example: instance settings, users, or routing information.
+   Cells can connect to a database containing shared data. For example:
+   application settings, users, or routing information.
 
 1. **Evaluate the efficiency of database-level access vs. API-oriented access layer.**
 
@@ -69,8 +70,10 @@ Under this objective the following steps are expected:
 
 ### 2. Essential workflows
 
-To make Cells viable we require to define and support essential workflows before we can consider the Cells
-to be of Beta quality. Essential workflows are meant to cover the majority of application functionality
+To make Cells viable we require to define and support
+essential workflows before we can consider the Cells
+to be of Beta quality. Essential workflows are meant
+to cover the majority of application functionality
 that makes the product mostly useable, but with some caveats.
 
 The current approach is to define workflows from top to bottom.
@@ -79,11 +82,20 @@ This list is not exhaustive as we would be expecting
 other teams to help and fix their workflows after
 the initial phase, in which we fix the fundamental ones.
 
-To consider a project ready for the Beta phase, it is expected that all features defined below are supported by Cells.
-In the cases listed below, the workflows define a set of tables to be properly attributed to the feature.
-In some cases, a table with an ambiguous usage has to be broken down. For example: `uploads` are used to store user avatars, as well as uploaded attachments
-for comments. It would be expected that `uploads` is split into `uploads`
-(describing group/project-level attachments) and `global_uploads` (describing, for example, user avatars).
+To consider a project ready for the Beta phase, it is expected
+that all features defined below are supported by Cells.
+In the cases listed below, the workflows define a set of tables
+to be properly attributed to the feature. In some cases,
+a table with an ambiguous usage has to be broken down.
+For example: `uploads` are used to store user avatars,
+as well as uploaded attachments for comments. It would be expected
+that `uploads` is split into `uploads` (describing group/project-level attachments)
+and `global_uploads` (describing, for example, user avatars).
+
+Except for initial 2-3 quarters this work is highly parallel.
+It would be expected that **group::tenant scale** would help other
+teams to fix their feature set to work with Cells. The first 2-3 quarters
+would be required to define a general split of data and build required tooling.
 
 1. **Instance-wide settings are shared across cluster.**
 
@@ -225,7 +237,7 @@ We are following the [Support for Experiment, Beta, and Generally Available feat
 
 Expectations:
 
-- We can deploy a Cell on staging by using a separate domain (ex. `cell2.staging.gitlab.com`)
+- We can deploy a Cell on staging or another testing environment by using a separate domain (ex. `cell2.staging.gitlab.com`)
   using [Cell deployment](#5-cell-deployment) tooling.
 - User can create organization, group and project, and run some of the [essential workflows](#2-essential-workflows).
 - It is not expected to be able to run a router to serve all requests under a single domain.
-- 
GitLab

From 1876f5ef131b38bccad55d7473f93773ae0d337d Mon Sep 17 00:00:00 2001
From: Christina Lohr <clohr@gitlab.com>
Date: Thu, 4 May 2023 11:25:20 +0000
Subject: [PATCH 16/16] Grammar correction

---
 doc/architecture/blueprints/cells/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/cells/index.md b/doc/architecture/blueprints/cells/index.md
index 961e0f2814ef97..9938875adb68b9 100644
--- a/doc/architecture/blueprints/cells/index.md
+++ b/doc/architecture/blueprints/cells/index.md
@@ -242,7 +242,7 @@ Expectations:
 - User can create organization, group and project, and run some of the [essential workflows](#2-essential-workflows).
 - It is not expected to be able to run a router to serve all requests under a single domain.
 - We expect data-loss of data stored on additional Cells.
-- We expect to tear-down and create many new Cells to validate tooling.
+- We expect to tear down and create many new Cells to validate tooling.
 
 ### 2. Beta
 
-- 
GitLab
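
The routing behaviour that the patched text describes can be illustrated with a minimal Python sketch: sign-in requests are spread across all Cells, organization-scoped paths go to the Cell recorded for the organization, and ambiguous endpoints such as `/api/graphql` stay unclassified. This is not part of the patch series or of GitLab. `pick_cell`, `org_to_cell`, and `CLUSTER_WIDE_PREFIXES` are hypothetical names, and treating the first path segment as the sharding key is an assumption made only for illustration.

```python
import random
from typing import Dict, Optional, Sequence

# Hypothetical: path prefixes that carry no sharding key and may be served by any Cell.
CLUSTER_WIDE_PREFIXES = ("/users/sign_in",)


def pick_cell(
    path: str,
    org_to_cell: Dict[str, int],
    all_cells: Sequence[int],
    rng=random,
) -> Optional[int]:
    """Return the cell_id that should serve `path`, or None if it cannot be classified."""
    # Cluster-wide endpoints are distributed randomly across all Cells.
    if path.startswith(CLUSTER_WIDE_PREFIXES):
        return rng.choice(list(all_cells))

    # Assumption: the first path segment names the top-level namespace,
    # and the routing table maps it to the Cell that owns it.
    segments = [segment for segment in path.split("/") if segment]
    if segments and segments[0] in org_to_cell:
        return org_to_cell[segments[0]]

    # Ambiguous endpoints (for example /api/graphql) do not encode the
    # sharding key in the URL; the text leaves their handling undecided.
    return None


# Hypothetical routing table mirroring the examples in the patched text.
routing_table = {"gitlab-org": 5, "my-username": 1}
cells = [1, 5]

print(pick_cell("/gitlab-org/gitlab/-/tree/master", routing_table, cells))
print(pick_cell("/my-username/my-project", routing_table, cells))
print(pick_cell("/api/graphql", routing_table, cells))
```

Under these assumptions, changing the `cell_id` stored for an organization in `org_to_cell` is enough to redirect its traffic, which matches the "Split Cells by cloning them" step. A real router would also need to inspect headers or payloads for endpoints that return `None` here.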