Add GitLab.com strategy

Merged Sam Wiskow requested to merge dotcom-strategy into master
@@ -33,13 +33,32 @@ GitLab.com is our multi-tenant SaaS platform, where we are able to offer the mos
### Increase Platform Availability & Reliability to Best in Class
GitLab currently has an availability target (SLO) of two 9s (99.80%). However, based on [3 years of historical data](https://handbook.gitlab.com/handbook/engineering/monitoring/#historical-service-level-availability) from August 2021 to August 2024, we achieve an average of three 9s (99.94%) in our availability SLI. Three 9s is the high end of availability targets set for complex SaaS applications across the industry including applications like GitHub. In order to win against GitHub in the long term, we need to increase our availability to be best in class across the industry.
Best in class means putting in place an SLO of four 9s as a clear differentiator. This will require a shared effort between many groups within SaaS Platforms, along with some foundational investments and rearchitecture of existing systems.
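To make the gap between these targets concrete, the sketch below converts an availability percentage into a downtime budget. The 30-day window and the use of 99.99% as the value for "four 9s" are assumptions for illustration; the handbook's actual SLO window and calculation method may differ.

```python
def allowed_downtime_minutes(availability_pct: float, window_days: int = 30) -> float:
    """Return the downtime budget, in minutes, for an availability target.

    window_days is an assumed measurement window, not a documented GitLab value.
    """
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    targets = [
        ("current SLO", 99.80),
        ("measured SLI", 99.94),
        ("four 9s (assumed 99.99%)", 99.99),
    ]
    for label, pct in targets:
        minutes = allowed_downtime_minutes(pct)
        print(f"{label}: {pct}% -> {minutes:.1f} minutes per 30 days")
```

Under these assumptions, the budget shrinks from roughly 86 minutes per 30 days at 99.80% to about 4 minutes at 99.99%, which is why the move is expected to need foundational investment rather than incremental tuning.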
### Example Subsection
### Improve the Quality of our Product Development Practice through Stewardship
Example Challenges subsection
As stewards of GitLab.com, Delivery, Scalability & Production Engineering have a responsibility to ensure that GitLab.com remains operational and reliable every minute of every day. Shifting more things left in the SDLC is a commonly accepted practice in software development and the most efficient way to deliver higher quality production systems.
Globally improving our product and software development practice will result in a higher quality product, with fewer errors and incidents. That will enable teams to be more efficient with their time and bias work toward preventative measures rather than reactive measures post incident. In order to do this, we essentially have two levers:
- Manual engagement and approval with all teams building features for GitLab - Not scalable
- Build tools and frameworks to make the right thing the easy thing to do - Scalable
In FY 25, much of our activity as stewards requires ICs to engage with teams and onboard onto a problem in order to drive towards an optimal outcome. This has two major drawbacks. The first is that this process is slow and places a high burden on all team members involved to build context and understanding. The second is that, since this process is not scalable, many items that should have had review from our stewards slip through and eventually impact customers, in some cases in the same way as previous incidents ([INC-18003](https://gitlab.com/gitlab-com/Product/-/issues/13406#note_1936034718), [INC-18548](https://gitlab.com/gitlab-com/gl-infra/production/-/issues/18548)).
In order to reduce the number and impact of incidents and increase the quality of our product offerings at the same time, we have to increase the level of investment in shared tools that make it easy for teams at GitLab to build quality in from the first iteration. Since taking on full responsibility for the operations of GitLab's SaaS Platforms, SaaS Platforms has gained full accountability for much of GitLab's infrastructure including provisioning, deploying, operating, logging, metrics, observability, maintenance and disaster recovery. Many of these domains, such as logging and rate limiting, require tight integration and collaboration with teams developing GitLab features. For example, rate limits are easy to implement as a feature is introduced, but become much harder to introduce once a feature gains adoption and existing users depend on unthrottled behavior.
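As an illustration of why a rate limit is cheap to add at introduction time, the sketch below shows a generic in-memory token bucket. It is not how GitLab implements rate limits; the class name, parameters, and injectable clock are all hypothetical and chosen only to keep the example self-contained.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token-bucket limiter (illustrative, not GitLab's implementation)."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the request fits the budget."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=5, refill_per_second=1.0)
    print([bucket.allow() for _ in range(7)])
```

A feature that ships with a limiter like this from its first iteration sets client expectations early; adding one later means picking a threshold that does not break callers who already rely on higher request volumes.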
Improvements in this area will likely be driven by having [availability metrics better reflect the user experience](https://about.gitlab.com/direction/saas-platforms/scalability/#availability-metrics-better-reflect-the-user-experience), [enabling experimental deployments](https://about.gitlab.com/direction/saas-platforms/delivery/#enable-experimental-deployments) and [release channels](https://about.gitlab.com/direction/saas-platforms/delivery/#release-channels-on-com) as well as [increasing the number of paved roads](https://about.gitlab.com/direction/saas-platforms/scalability/#paved-roads-are-the-default-for-all-team-members) for team members to traverse.
### Unlock Faster Innovation through Novel Software Delivery Models
Runway introduced a new paradigm for delivering software to customers. In FY 25 Runway supported [multi region deployments](https://docs.runway.gitlab.com/guides/multi-region/) in many locations across the world for customers that wanted to use our hosted AI services that power Duo. In FY 26, Runway will GA a new runtime, aligned with our long-term vision, that will allow easy delivery of services built on Runway to Self Managed customers and expand to support more workloads across GitLab.
We expect this new paradigm to unlock new possibilities for stage teams and accelerate our time to value for new and innovative products like [GitLab Secrets Manager](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/secret_manager/). This will also drive opportunities to "Land" in categories that previously we looked to "Expand".
Along with this, GitLab.com teams will be responsible for operating and orchestrating [Cells](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/cells/infrastructure/), which presents an opportunity to further innovate on our overall multi-tenant product offering. Growth in this area will likely take the form of product offerings like per cell/customer Geo, exclusive customer cells, private connections and/or private runners. Cells will be the foundation on which these new product offerings are integrated into GitLab.com, and SaaS Platforms teams must design future solutions with this in mind.
### What we're not doing
@@ -52,5 +71,10 @@ Example of what we are not doing:
For a look at the work in progress and what's next, you can stay up to date with our in-progress epics and our roadmap epics:
- Thing 1
- Thing 2
- Delivery
  - [Releases](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1276)
  - [Deployments](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1277)
- Scalability
  - [Runway](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/969)
  - [Practices](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1202)
  - [Observability](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/1295)