Discover Cells 1.0 impact for Group IDE (investigation)
Description
Hello team, as many of you already know, the Cells project is one of the top priorities for FY2025, with the goal of providing additional scalability for GitLab.com. It is a very large and highly complex project that will take multiple iterations. As the first iteration, the Cells project team proposed the design of Cells 1.0, with a focus on new enterprise customers. This iteration will allow us to get hands-on experience deploying and operating Cells sooner and will be a big step towards our ultimate goal of alleviating scalability pressure on GitLab.com SaaS.
The Cells 1.0 design and execution plan were discussed in the Engineering FY23Q4 Offsite and received approval. Our plan is to have it in production by mid-FY2025.
Cells 1.0 requires the whole company to participate in this project and make their respective features and workflows compatible with Cells 1.0.
Below are the details that are helpful for all engineering groups to start looking at their technical domains.
Web IDE (investigation complete)
- Web IDE editing works
- Web IDE commit to main branch works
- Web IDE commit to a new main branch and jump to an MR works
- Web IDE behavior with multiple URLs (if cells requires it) does not work
- Web IDE behavior with code suggestions untested; handled here Discover Cells 1.0 impact for code_creation (#434978 - closed)
Remote development (blocked on KAS installation)
- KAS setup does not work on the POC &12474 (comment 1762648334)
- Test workspace creation and access, change
- Test workspaces pull and push code commit, create branch etc
- Test workspace - other operations
- Test accessing a workspace through its URL (over HTTPS)
- Test accessing a workspace through SSH
Discoveries
- KAS does not work; see Team environment discussions #434990 (comment 1751605838)
What we are asking for
We are asking all product teams to take the following steps:
1. Follow the guidelines in this Discovery Epic to see how the Cells 1.0 architecture affects your domain:
   - Review the Cells 1.0 design, and identify the workflows in your product area that would be affected by Cells 1.0.
   - Test your workflows against the Cells 1.0 POC.
   - There is an issue for each team in the Discovery Epic. You can use this issue to discuss the implications of Cells 1.0 with your group, or involve Tenant Scale engineers if you need more clarification.
   - If you discover workflows that need to be fixed to be compatible with Cells 1.0, open a sub-epic for the workflows in the Cells 1.0 Epic and break down the effort into issues.
   - Estimate efforts (T-shirt sizing; we are NOT asking for a detailed design and work estimate), and provide a high-level timeline (taking resource allocation into account) in the sub-epic created above. We know there are a lot of unknowns, so try your best to give a rough initial estimation. The goal here is to allow us to identify risk areas and address them early.
2. Add sharding keys to all your cell-local tables; see the documentation and progress dashboard.
   - All tables with the following gitlab_schema are considered “cell-local”: gitlab_main_cell, gitlab_ci
3. Classify your tables under the different database schemas (gitlab_main_cell or gitlab_main_clusterwide); see the documentation and progress dashboard.
We are working to automate adding the sharding keys (2) and classifying the tables (3), but we won’t be able to automatically classify everything. If you do this work proactively, we won't need to contact you again for the cases we can't automate.
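As a rough illustration of steps (2) and (3), the sketch below flags cell-local tables that still lack a sharding key. It only encodes the rule stated above (tables whose gitlab_schema is gitlab_main_cell or gitlab_ci are cell-local). The dictionary layout, the `table_name` and `sharding_key` field names, and every table name in the example are assumptions made for illustration, not the actual tooling or metadata format.

```python
# Schemas this issue lists as "cell-local".
CELL_LOCAL_SCHEMAS = {"gitlab_main_cell", "gitlab_ci"}


def is_cell_local(table: dict) -> bool:
    """Return True when the table's gitlab_schema is a cell-local schema."""
    return table.get("gitlab_schema") in CELL_LOCAL_SCHEMAS


def tables_missing_sharding_key(tables: list[dict]) -> list[str]:
    """Return sorted names of cell-local tables with no sharding key recorded."""
    return sorted(
        t["table_name"]
        for t in tables
        if is_cell_local(t) and not t.get("sharding_key")
    )


# Hypothetical metadata: table names and sharding keys are made up.
EXAMPLE_TABLES = [
    {
        "table_name": "example_workspaces",
        "gitlab_schema": "gitlab_main_cell",
        "sharding_key": {"project_id": "projects"},
    },
    {"table_name": "example_ide_settings", "gitlab_schema": "gitlab_main_cell"},
    {"table_name": "example_build_logs", "gitlab_schema": "gitlab_ci"},
    {"table_name": "example_plans", "gitlab_schema": "gitlab_main_clusterwide"},
]

if __name__ == "__main__":
    for name in tables_missing_sharding_key(EXAMPLE_TABLES):
        print(f"needs sharding key: {name}")
```

A group could feed its own table list into `tables_missing_sharding_key` to get a quick to-do list before the automated pass reaches it.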
The Cells project team will work out an overall project timeline with a target production date of end of July. Please take resource allocation into account when doing your timeline estimation.
We are also preparing an AMA to explain the details of the ask.
If your team already completed a Cells discovery issue before this, we still request that you re-evaluate your workflows with the Cells 1.0 scope in mind.
Tips for Discovery
Read the Cells 1.0 design for the plan of the first iteration. Ask the following questions and use the answers as clues to discover incompatibilities in your features:
- Does your team use or require admin accounts on GitLab.com?
- Does your team require direct database access in the form of a Rails console or a database replica?
- Does your team have a service that it manages that is not part of Omnibus or Cloud Native deployment?
- Does your team provide instance-wide features in the product? Ex.: shared runners, GitLab Pages, etc.
- Does your team use or implement authentication features? Ex.: login via access token, Deploy Tokens, Kubernetes Agent connecting to GitLab.
Relevant links
- Cells 1.0 Executive Summary
- Cells 1.0 Architecture Design
- Discovery Epic - follow the instructions in it to discover whether any work is needed to make your product compatible with Cells 1.0
- Cells 1.0 Epic - create sub-epics and issues to track work you identified
- Cells 1.0 workstreams slides and video walkthrough
Timeline of Immediate Tasks
- 2023-01-31 - Two AMAs (APAC/EMEA and AMER/EMEA friendly) are held and the PoC is sent out to teams.
- 2023-02-06 - Due date for all engineering groups (development, infrastructure, test platform) to finish the breakdown and work estimate.
- 2023-02-09 - First version of the Cells 1.0 schedule finished and communicated, with the understanding that we may have more iterations in the future.
Status
- 2024-05-16: Local dev Cells environment is set up and running, but we have questions about how to do local testing with a non-default IP. Asked on Slack: https://gitlab.slack.com/archives/C0609EXHX6F/p1715920142688619