Invert the model Gitlab.com uses for Kubernetes integration by leveraging long-lived reverse tunnels
Problem to solve
Currently the Kubernetes integration for Gitlab, both for on-premise installations and for Gitlab.com, works by having the Gitlab installation and the Gitlab runners connect directly to the Kubernetes cluster you wish to manage via Gitlab, using credentials stored inside the Gitlab installation (in the database).
For on-premise installations this simple model works relatively well: typically the Kubernetes installation and the Gitlab instance are in the same or similar security domains, and connectivity between the two is relatively straightforward.
For Gitlab.com, this model has a number of drawbacks. It means that on-premise users must expose their Kubernetes cluster masters to the internet, and, depending on which Gitlab features they use, it can also mean handing Gitlab.com what are essentially full root privileges on their Kubernetes cluster, for Gitlab.com to store and use.
For some users this risk profile is acceptable, but for others, in more highly secure environments or with clusters in large multi-tenant environments where they do not get full privileges on the cluster, this can be a blocker to adoption of Gitlab.com Kubernetes features.
We would like to come up with an alternative solution where Kubernetes cluster masters do not have to be exposed to the internet, and ideally where Gitlab.com does not have to store, or even know, the credential itself.
Intended users
- Delaney (Development Team Lead)
- Sasha (Software Developer)
- Devon (DevOps Engineer)
- Sidney (Systems Administrator)
Further details
Proposal
In order to provide a more secure model for Gitlab.com integration with Kubernetes installations in firewalled environments, we will look to have Gitlab.com deploy what is essentially a reverse tunnel service. Users wishing to leverage it would deploy a pod inside their Kubernetes cluster with a service account attached to it, and that pod would connect to the Gitlab.com service in order to establish a tunnel that Gitlab.com can use to connect to the cluster's API.
This relies on leveraging a few key features of a default Kubernetes cluster:

- The Kubernetes master API is exposed internally to all Kubernetes pods at the address `kubernetes.default.svc.cluster.local` on port 443.
- The `kubectl` tool has a `proxy` function that allows it to expose a local port as an HTTP endpoint to the kube-apiserver, wrapping calls to that port with the permissions used to invoke `kubectl proxy` (more documentation here). A small sketch of what this looks like from inside a pod follows this list.
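For illustration, here is a minimal sketch of what `kubectl proxy` provides from the point of view of another process in the same pod. It assumes the proxy has been started with `kubectl proxy --port=8001`; the port and the specific API call are illustrative choices, not part of any existing Gitlab component.

```python
# Minimal sketch: query the Kubernetes API through a local `kubectl proxy`
# listening on port 8001. No token handling is needed here, because the
# proxy wraps each request with the credentials of whoever invoked
# `kubectl proxy` (the pod's service account) before forwarding it.
import json
import urllib.request

PROXY_URL = "http://localhost:8001"  # where `kubectl proxy --port=8001` listens


def list_namespaces():
    """Return the names of namespaces visible to the pod's service account."""
    with urllib.request.urlopen(f"{PROXY_URL}/api/v1/namespaces") as resp:
        body = json.load(resp)
    return [item["metadata"]["name"] for item in body.get("items", [])]


if __name__ == "__main__":
    print(list_namespaces())
```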
The harder part of this is developing a solution that allows us to provide some kind of tunnel from a pod running inside the user's Kubernetes cluster to some address inside Gitlab.com, in a safe and scalable pattern.
There are a few open source solutions in this space we could leverage, including:

- localtunnel
- tunnel-tool
- openssh, which could be used by leveraging the `ssh -R` functionality of the `ssh` client. Gitlab would just need to provide some kind of SSH server that people could authenticate to, and map users' connections via SSH to Kubernetes clusters in Gitlab (a rough sketch of this option follows this list).
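As a rough illustration of the openssh option, the sketch below opens a reverse tunnel with `ssh -R` from inside the cluster, asking a hypothetical Gitlab.com SSH endpoint (`tunnel.gitlab.example`) to forward a remote port back to the local `kubectl proxy` on port 8001. The hostname, user, port numbers, and key path are all placeholders invented for illustration.

```python
# Sketch: keep a long-lived reverse tunnel open so that port 10443 on the
# (hypothetical) Gitlab.com tunnel server forwards to the local
# `kubectl proxy` on port 8001. `ssh -N -R` opens the tunnel without
# running any remote command.
import subprocess

TUNNEL_HOST = "tunnel.gitlab.example"   # hypothetical Gitlab.com SSH endpoint
REMOTE_PORT = 10443                     # port Gitlab.com would connect to
LOCAL_PROXY = "localhost:8001"          # where `kubectl proxy` listens in the pod
SSH_KEY = "/etc/gitlab-connector/ssh-key"  # credential mounted from a secret


def open_reverse_tunnel():
    """Run ssh in the foreground, keeping the reverse tunnel alive."""
    cmd = [
        "ssh",
        "-N",                                  # no remote command, tunnel only
        "-o", "ExitOnForwardFailure=yes",      # fail fast if the remote port is taken
        "-o", "ServerAliveInterval=30",        # keep the long-lived tunnel healthy
        "-i", SSH_KEY,
        "-R", f"{REMOTE_PORT}:{LOCAL_PROXY}",  # remote port -> local kubectl proxy
        f"tunnel@{TUNNEL_HOST}",
    ]
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    open_reverse_tunnel()
```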
Combining all this, it would look something like the following (using localtunnel as an example):
- Gitlab.com runs a localtunnel service as part of its infrastructure. This service would be configured in such a way that authentication and authorization map into the Gitlab.com permissions structure.
- The user wishing to connect their cluster to Gitlab would run a pod called `gitlab-connector-pod`. They would bind a service account to the pod with the permissions they would like Gitlab.com to have over their infrastructure (at whatever level that may be). A sketch of such a pod definition follows this list.
- There would be two containers in the `gitlab-connector-pod`. One (`kubectl_container`) would run `kubectl proxy`, which would use the service account of the pod to open port 8001 inside the pod as a proxy to the Kubernetes master at `kubernetes.default.svc.cluster.local` (the cluster's Kubernetes API server). The other container (`localtunnel_client`) would run the `localtunnel` client software to connect to the Gitlab.com `localtunnel` service. There would be a Kubernetes `secret` object with some credentials from Gitlab.com (API token?) to allow the client to authenticate to the server, and for Gitlab.com to formally map the tunnel to a particular user/group. The `localtunnel` client would then map the connection from the Gitlab.com localtunnel server to port 8001 on localhost (which is where `kubectl proxy` would be listening).
- When Gitlab.com wishes to talk to the Kubernetes API for a user/group, in order to obtain information or perform an action, it would instead use the localtunnel endpoint it is aware of locally for that cluster, talking to the localtunnel service, which would then proxy the request to the localtunnel client. The localtunnel client would proxy the request to `kubectl`, which would proxy it to the Kubernetes API server itself (with credentials wrapped).
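To make the pod layout concrete, here is a minimal sketch of what `gitlab-connector-pod` might look like, written with the Kubernetes Python client. The container images, namespace, service account name, and secret name are placeholders invented for illustration (and the container names use hyphens, since Kubernetes does not allow underscores in container names); nothing here is an existing Gitlab artifact.

```python
# Sketch: create a two-container connector pod, assuming the `kubernetes`
# Python client is installed and a kubeconfig is available. One container
# runs `kubectl proxy`; the other runs a tunnel client that authenticates
# to Gitlab.com with a token stored in a Kubernetes secret.
from kubernetes import client, config

config.load_kube_config()  # run from the cluster admin's workstation

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gitlab-connector-pod"),
    spec=client.V1PodSpec(
        # Service account granting whatever permissions Gitlab.com should have.
        service_account_name="gitlab-connector",
        containers=[
            client.V1Container(
                name="kubectl-container",
                image="bitnami/kubectl:latest",  # placeholder image
                command=["kubectl"],
                # Expose the API server on localhost:8001, wrapped with the
                # pod's service account credentials.
                args=["proxy", "--port=8001", "--address=127.0.0.1"],
            ),
            client.V1Container(
                name="localtunnel-client",
                image="registry.example/gitlab-connector/tunnel-client:latest",  # placeholder
                env=[
                    # Token issued by Gitlab.com, used to authenticate the tunnel
                    # and map it to a particular user/group.
                    client.V1EnvVar(
                        name="GITLAB_TUNNEL_TOKEN",
                        value_from=client.V1EnvVarSource(
                            secret_key_ref=client.V1SecretKeySelector(
                                name="gitlab-connector-token", key="token"
                            )
                        ),
                    ),
                ],
            ),
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="gitlab-connector", body=pod)
```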
With appropriate security considerations at the Gitlab.com end, Gitlab.com can now seamlessly integrate with a Kubernetes cluster that is not exposed to the internet, and has no knowledge of the actual security credential that is used to talk to Kubernetes.