Product discovery for GitLab PaaS
Problem to solve
Remove the complexity of provisioning and maintaining the infrastructure necessary for developing and launching an application.
See https://gitlab.com/gitlab-org/gitlab-ce/issues/32731
We want to provide compute resources for a project in a "just-in-time" fashion. As soon as we see the project would benefit from using this, we spin it up for the user and manage it for them.
As part of this discovery we want to determine which solution is best for providing PaaS to GitLab users.
Target audience
Developers
Further details
Kubernetes multi-tenancy has built-in benefits that can be used for this.
Knative also has built-in idling and ephemeral compute that can be used for this purpose.
Proposal
- Use a shared cluster with a namespace per tenant (a 1:1 mapping between project and tenant) to provide compute for multiple, unrelated projects. Use the k8s multi-tenant primitives to enforce security, such as:
  - PodSecurityPolicy (beta)
  - NetworkPolicy
  - PolicySpec
  - PolicyEnforcer
  - ResourceRequests
  - Limit
  - ResourceQuota
  - SchedulingPolicy
  - SecurityProfile
- Use Knative deployments for each tenant to use resources efficiently.
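
The namespace-per-tenant idea above can be sketched as a small generator for the per-project objects. This is only an illustration under stated assumptions: the `project-<id>` naming scheme, the quota values, and the helper name `tenant_manifests` are hypothetical and not part of the proposal. Only three of the listed primitives (Namespace, ResourceQuota, NetworkPolicy) are shown.

```python
import json


def tenant_manifests(project_id, cpu="2", memory="4Gi"):
    """Build the Kubernetes objects for one tenant (one GitLab project).

    The naming scheme and default quota values are assumptions made for
    this sketch, not values taken from the proposal.
    """
    ns = f"project-{project_id}"  # hypothetical naming scheme
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": ns, "labels": {"tenant": ns}},
    }
    # Cap the total CPU and memory the tenant can request in its namespace.
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": ns},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
            }
        },
    }
    # Select every pod in the namespace and allow ingress only from pods in
    # the same namespace. A real setup would also need to admit traffic from
    # the cluster's ingress gateway, which this sketch leaves out.
    network_policy = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "same-namespace-only", "namespace": ns},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress"],
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }
    return [namespace, quota, network_policy]


if __name__ == "__main__":
    # Wrap the objects in a v1 List so they can be applied in one step.
    bundle = {"apiVersion": "v1", "kind": "List", "items": tenant_manifests(42)}
    print(json.dumps(bundle, indent=2))
```

The JSON output could be piped to `kubectl apply -f -`. A NetworkPolicy only takes effect if the cluster's network plugin enforces it.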
What does success look like, and how can we measure that?
Tasks
- Investigate feasibility of large, shared, multi-tenancy cluster (>1000 nodes)
  - Scalability
  - Performance
  - Cost
- Investigate security of multi-tenancy cluster
  - Investigate gVisor to isolate containers
  - Investigate in-cluster policies (see https://gitlab.com/tkuah/k8s-mt-test for experiments)
  - Disable Kubernetes API access from tenants
  - Disable Kubernetes API access from pods
- Work out how to deploy to shared cluster (cli?)
  - Consider Service Catalog https://kubernetes.io/docs/concepts/extend-kubernetes/service-catalog/
  - Consider Knative https://github.com/knative/docs/tree/master/serving
  - Ability to add/define shared cluster to GitLab
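
To make the Knative option concrete, here is a minimal sketch of the object a deploy step might create for a tenant. It assumes the `serving.knative.dev/v1` API and the `min-scale` / `max-scale` autoscaling annotations used by recent Knative Serving releases; older releases used different API versions and annotation spellings. The helper name, namespace, and image are hypothetical.

```python
import json


def knative_service(namespace, name, image, max_scale=5):
    """Build a Knative Service that may idle down to zero replicas.

    The annotation keys follow recent Knative Serving releases; treat them
    as an assumption to verify against the version actually installed.
    """
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Allow scale-to-zero when the app receives no traffic.
                        "autoscaling.knative.dev/min-scale": "0",
                        # Bound how far a single tenant can scale out.
                        "autoscaling.knative.dev/max-scale": str(max_scale),
                    }
                },
                "spec": {"containers": [{"image": image}]},
            }
        },
    }


if __name__ == "__main__":
    svc = knative_service("project-42", "web", "registry.example.com/group/app:latest")
    print(json.dumps(svc, indent=2))
```

Scale-to-zero is what would let many mostly idle projects share one cluster, at the cost of a cold start on the first request after idling.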
Links / references
K8S multi-tenancy https://docs.google.com/document/d/1mNL5oCIqtVwXI9piTPMuGArdZH8CA2UFaxHtM5Myp6M/edit# https://docs.google.com/presentation/d/1dsAsVm8kCA9Dx9_gMEYeJL7pduAbnfnxT9lhbyCvHDg/edit#slide=id.p1
Internal discovery doc https://docs.google.com/document/d/1cSsXaGG6vg1_VSnxheoOTHx8UzTtCr2Yzhdhcpyj6ys/edit
Conclusions
To use a shared, multi-tenant cluster we must first solve the scalability and security challenges. To tackle those we've decided to:
- Use Knative and its native scaling features to use resources efficiently.
- Isolate containers from one another with gVisor for increased security
- Provide each tenant with an isolated namespace
- Enforce namespace separation
- Disable Kubernetes API access from tenants
- Disable Kubernetes API access from pods
- Provide a built-in PaaS deployment definition to `gitlab-ci.yml` so users can quickly get up and running
- Extend Auto DevOps so that it makes use of a Dockerfile as a deployment artifact (in addition to the current Helm chart approach)
- Offer databases only via a managed service, because the complexity of providing an in-cluster DB would delay the MVC
- Provide data storage outside of the cluster, via a managed service
- Allow self-managed instances to specify more than one shared cluster for redundancy
- Carry out a POC internally only, to evaluate our assumptions
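
Two of the decisions above (gVisor isolation and no Kubernetes API access from pods) map onto pod-level settings, sketched below. The sketch assumes the cluster defines a RuntimeClass named `gvisor` backed by the `runsc` handler; that name and the helper `hardened_pod` are hypothetical.

```python
import json


def hardened_pod(namespace, name, image, runtime_class="gvisor"):
    """Build a Pod that runs under a sandboxed runtime without API credentials.

    Assumes a RuntimeClass called `runtime_class` already exists in the
    cluster; the default name "gvisor" is an assumption of this sketch.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Run the containers in the gVisor sandbox instead of the default runtime.
            "runtimeClassName": runtime_class,
            # Do not mount a service account token, so the workload has no
            # credentials for the Kubernetes API.
            "automountServiceAccountToken": False,
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    "securityContext": {
                        "allowPrivilegeEscalation": False,
                        "runAsNonRoot": True,
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = hardened_pod("project-42", "web", "registry.example.com/group/app:latest")
    print(json.dumps(pod, indent=2))
```

Withholding the token removes credentials but does not stop a pod from reaching the API server over the network. Blocking that as well would likely need an egress NetworkPolicy or equivalent.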