Extensible API to provision managed Kubernetes Clusters
Problem to solve
Cloud providers like Google and Digital Ocean (and probably others) would like to make it easy for GitLab users to provision managed Kubernetes clusters through GitLab's interface. Our current approach is for all cloud-provider-specific implementation to live inside GitLab's own codebase. This may seem straightforward, but we have the opportunity to take an API-driven approach that could offer the following benefits:
- More cloud providers could easily add this extension with minimal support from GitLab
- Cloud providers can deliver their integration faster since they don't need to rely on getting up to speed with GitLab's codebase
- Cloud providers, rather than GitLab, can own the code for their integration, which means they can iterate as much as they need to support new features they want to add (e.g. a serverless plugin for Google, and many others)
- Cloud-provider-specific details can be implemented more easily by the cloud provider and kept up to date (e.g. we don't yet support Google network/subnetwork https://gitlab.com/gitlab-org/gitlab-ce/issues/53226 and each vendor will have a different implementation here)
- GitLab (Configure team) does not need to maintain vendor-specific implementations for every cloud provider we support. Changes to the integration on our side (such as group clusters, RBAC and instance clusters) then require much less effort from us, and in general this will reduce the cost of many K8s features we plan to build soon
- We don't need to build an authentication integration with every cloud provider we connect with. For example, does Digital Ocean have OAuth? Do we want to build and maintain another OAuth integration for Digital Ocean?
Further details
We currently maintain over 1600 lines of vendor-specific code for our GCP integration:
$ find . -path '**/google_api/**' -type f |xargs cat| wc -l
623
$ find . -path '**/gcp/**' -type f |xargs cat| wc -l
990
Just from this one recent MR ( https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/22011/diffs ) I can see a total of +580, -251 lines of diff that relate only to the Google-specific changes (gcp directory) needed to support RBAC, which appears to be about 66% of the overall diff. This diff is just one of many RBAC-related changes that have touched Google-related code. We expect that group cluster work will also touch Google-related code, which will increase the complexity of that work.
From this I estimate that if we had also implemented support for Digital Ocean, this kind of work would have taken roughly 66% more effort, assuming Digital Ocean needed a similar amount of vendor-specific code. This is not sustainable, so we need a better way.
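The estimate above can be reproduced with a short calculation. This is only a rough sketch: it assumes every vendor needs about the same amount of vendor-specific code as GCP did in the RBAC MR, and the function name is made up for illustration.

```python
def relative_effort(vendor_share: float, vendors: int) -> float:
    """Effort for a change, relative to supporting a single vendor.

    vendor_share is the fraction of the change that is vendor specific
    (about 0.66 for the RBAC MR). Assumption: each vendor needs a similar
    amount of vendor-specific code, which is a simplification.
    """
    shared = 1.0 - vendor_share  # code that is written once for all vendors
    return shared + vendor_share * vendors


# Two vendors (e.g. GCP plus Digital Ocean) at a 66% vendor-specific share.
extra = relative_effort(0.66, 2) - 1.0
print(f"{extra:.0%} more work")
```

Under the same assumption, each additional in-tree vendor adds roughly another 66% of the original effort, which is the growth the proposal below tries to avoid.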
Proposal
I propose we build an extensible API where vendors can easily integrate their cluster creation forms with GitLab. I imagine the following approach would provide the flexibility each provider will need:
- We configure the cloud provider (eg. Digital Ocean) as an OAuth application in instance admin settings
- API support https://gitlab.com/gitlab-org/gitlab-ce/issues/40473
- UI that shows each provider we support in the cluster creation page
- User clicks the provider they want
- We send the user over to the provider which then triggers an OAuth flow in which the cloud provider (eg. Digital Ocean) gets the OAuth token to access the user's GitLab account
- Now the user lands on a page owned by the cloud provider where they can enter the relevant details about their cluster and sign in etc as necessary
- Once the cluster has finished creating, the cloud provider will call GitLab's API at the URL provided by the gitlab_k8_api param to create the cluster on GitLab's side
- The cloud provider will get back the URL for the newly created cluster and redirect the user back to this page
- The user lands on GitLab's page looking at the cluster page for their newly created cluster
In order for cloud providers to integrate with GitLab in future the only thing we'll need to do is configure them as an OAuth application and get the URL to link users to (and maybe a logo).
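The last steps of the flow, where the cloud provider registers the finished cluster with GitLab and reads back the page to redirect to, might look roughly like this from the provider's side. This is a sketch under assumptions: the cluster API is still to be built in https://gitlab.com/gitlab-org/gitlab-ce/issues/40473, so the payload fields (name, api_url, token, ca_cert) and the web_url response field are hypothetical. Only the gitlab_k8_api param and the use of the user's OAuth token come from the proposal.

```python
import json
import urllib.request


def build_create_cluster_request(gitlab_k8_api: str, oauth_token: str,
                                 name: str, api_url: str,
                                 k8s_token: str, ca_cert: str) -> urllib.request.Request:
    """Build (but do not send) the request that registers a cluster in GitLab.

    gitlab_k8_api is the URL GitLab passed to the provider. The payload
    field names are hypothetical until the cluster API exists.
    """
    payload = {
        "name": name,
        "api_url": api_url,    # Kubernetes API endpoint of the new cluster
        "token": k8s_token,    # service account token GitLab will use
        "ca_cert": ca_cert,    # cluster CA certificate (PEM)
    }
    return urllib.request.Request(
        gitlab_k8_api,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # OAuth access token obtained earlier in the flow.
            "Authorization": f"Bearer {oauth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def cluster_page_url(response_body: bytes) -> str:
    """Return the cluster page to redirect the user to.

    The 'web_url' field name is an assumption, not a confirmed API field.
    """
    return json.loads(response_body)["web_url"]
```

The provider would send the request with urllib.request.urlopen (or any HTTP client) and then redirect the user's browser to the URL returned by cluster_page_url.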
Implementation Steps
- We need API support first https://gitlab.com/gitlab-org/gitlab-ce/issues/40473
- At this point Digital Ocean should actually be able to get started with the work because we already support GitLab as an OAuth provider https://docs.gitlab.com/ee/integration/oauth_provider.html#oauth-applications-in-the-admin-area
- After they're done all we need to do is add a link in our UI to send clients to the correct page in Digital Ocean to kick off the whole process. Note this step is optional since people would be able to do this through Digital Ocean's website even before this.
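Because GitLab already acts as an OAuth provider, the cloud provider can start the flow by sending the user to GitLab's /oauth/authorize endpoint. A minimal sketch of building that URL follows. The host names, application ID and the choice of the api scope are placeholder assumptions, and the function name is made up for illustration.

```python
import secrets
from urllib.parse import urlencode


def build_authorize_url(gitlab_base_url: str, client_id: str,
                        redirect_uri: str, scope: str = "api") -> tuple[str, str]:
    """Return (authorize_url, state) for GitLab's authorization code flow.

    client_id is the Application ID of the OAuth application that the
    GitLab admin configured for the cloud provider. The provider should
    store state and compare it when the user returns to redirect_uri.
    """
    state = secrets.token_urlsafe(16)  # random value to guard against CSRF
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "state": state,
        "scope": scope,
    })
    return f"{gitlab_base_url.rstrip('/')}/oauth/authorize?{query}", state


url, state = build_authorize_url(
    "https://gitlab.example.com",          # placeholder GitLab instance
    "APP_ID",                              # placeholder application ID
    "https://cloud.example.com/callback",  # placeholder provider callback
)
print(url)
```

After the user approves, GitLab redirects back to redirect_uri with a code parameter, which the provider exchanges for an access token at /oauth/token.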
What does success look like, and how can we measure that?
(If no way to measure success, link to an issue that will implement a way to measure this)