Invert the model Gitlab.com uses for Kubernetes integration by leveraging long-lived reverse tunnels
Problem to solve
Currently the Kubernetes integration for Gitlab, both for on-premise installations and for Gitlab.com, works by having the Gitlab installation and the Gitlab runners connect directly to the Kubernetes cluster you wish to manage via Gitlab, using credentials stored inside the Gitlab installation (in the database).
For on-premise installations this simple model works relatively well: typically the Kubernetes installation and the Gitlab instance are in the same or similar security domains, and connectivity between the two is relatively straightforward.
For Gitlab.com, this model has a number of drawbacks. It means that on-premise users must expose their Kubernetes cluster masters to the internet, and, depending on which Gitlab features they use, it can also mean handing Gitlab.com what are essentially full root privileges on their Kubernetes cluster, for Gitlab.com to store and use.
For some users this risk profile is acceptable, but for others, in more highly secure environments or with clusters in large multi-tenant environments where they do not get full privileges on the cluster, this can be a blocker to adoption of Gitlab.com Kubernetes features.
We would like to come up with an alternative solution where Kubernetes cluster masters do not have to be exposed to the internet, and ideally where Gitlab.com does not have to store, or even know, the credential itself.
Intended users
- Delaney (Development Team Lead)
- Sasha (Software Developer)
- Devon (DevOps Engineer)
- Sidney (Systems Administrator)
Further details
Proposal
In order to provide a more secure model for Gitlab.com integration with Kubernetes installations in firewalled environments, we will look to have Gitlab.com deploy what is essentially a reverse tunnel service. Users wishing to leverage it would deploy a pod inside their Kubernetes cluster with a service account attached to it, and that pod would connect to the Gitlab.com service in order to establish a tunnel that Gitlab.com can use to connect to the cluster's API.
This relies on leveraging a few key features of a default Kubernetes cluster:

- The Kubernetes master API is exposed internally to all Kubernetes pods at the address `kubernetes.default.svc.cluster.local` on port 443.
- The `kubectl` tool has a `proxy` function that allows it to expose a local port as an HTTP endpoint to the kube-apiserver, wrapping calls to that port with the permissions used to invoke `kubectl proxy` (more documentation here). A small sketch of what this looks like from inside a pod follows this list.
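For illustration, here is a minimal sketch of what `kubectl proxy` provides from the point of view of another process in the same pod. It assumes the proxy has been started with `kubectl proxy --port=8001`; the port and the specific API call are illustrative choices, not part of any existing Gitlab component.

```python
# Minimal sketch: query the Kubernetes API through a local `kubectl proxy`
# listening on port 8001. No token handling is needed here, because the
# proxy wraps each request with the credentials of whoever invoked
# `kubectl proxy` (the pod's service account) before forwarding it.
import json
import urllib.request

PROXY_URL = "http://localhost:8001"  # where `kubectl proxy --port=8001` listens


def list_namespaces():
    """Return the names of namespaces visible to the pod's service account."""
    with urllib.request.urlopen(f"{PROXY_URL}/api/v1/namespaces") as resp:
        body = json.load(resp)
    return [item["metadata"]["name"] for item in body.get("items", [])]


if __name__ == "__main__":
    print(list_namespaces())
```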
The harder part of this is developing a solution that allows us to provide some kind of tunnel from a pod running inside the user's Kubernetes cluster to some address inside Gitlab.com, in a safe and scalable pattern.
There are a few open source solutions in this space we could leverage, including:

- localtunnel
- tunnel-tool
- openssh, which could be used by leveraging the `ssh -R` functionality of the `ssh` client. Gitlab would just need to provide some kind of SSH server that people could authenticate to, and map users' connections via SSH to Kubernetes clusters in Gitlab (a rough sketch of this option follows this list).
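As a rough illustration of the openssh option, the sketch below opens a reverse tunnel with `ssh -R` from inside the cluster, asking a hypothetical Gitlab.com SSH endpoint (`tunnel.gitlab.example`) to forward a remote port back to the local `kubectl proxy` on port 8001. The hostname, user, port numbers, and key path are all placeholders invented for illustration.

```python
# Sketch: keep a long-lived reverse tunnel open so that port 10443 on the
# (hypothetical) Gitlab.com tunnel server forwards to the local
# `kubectl proxy` on port 8001. `ssh -N -R` opens the tunnel without
# running any remote command.
import subprocess

TUNNEL_HOST = "tunnel.gitlab.example"   # hypothetical Gitlab.com SSH endpoint
REMOTE_PORT = 10443                     # port Gitlab.com would connect to
LOCAL_PROXY = "localhost:8001"          # where `kubectl proxy` listens in the pod
SSH_KEY = "/etc/gitlab-connector/ssh-key"  # credential mounted from a secret


def open_reverse_tunnel():
    """Run ssh in the foreground, keeping the reverse tunnel alive."""
    cmd = [
        "ssh",
        "-N",                                  # no remote command, tunnel only
        "-o", "ExitOnForwardFailure=yes",      # fail fast if the remote port is taken
        "-o", "ServerAliveInterval=30",        # keep the long-lived tunnel healthy
        "-i", SSH_KEY,
        "-R", f"{REMOTE_PORT}:{LOCAL_PROXY}",  # remote port -> local kubectl proxy
        f"tunnel@{TUNNEL_HOST}",
    ]
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    open_reverse_tunnel()
```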
Combining all this, it would look something like the following (using localtunnel as an example):
- Gitlab.com runs a localtunnel service as part of its infrastructure. This service would be configured in such a way that authentication and authorization map into the Gitlab.com permissions structure.
- The user wishing to connect their cluster to Gitlab would run a pod called `gitlab-connector-pod`. They would bind a service account to the pod with the permissions they would like Gitlab.com to have over their infrastructure (at whatever level that may be). A sketch of such a pod definition follows this list.
- There would be two containers in the `gitlab-connector-pod`. One (`kubectl_container`) would run `kubectl proxy`, which would use the service account of the pod to open port 8001 inside the pod as a proxy to the Kubernetes master at `kubernetes.default.svc.cluster.local` (the cluster's Kubernetes API server). The other container (`localtunnel_client`) would run the `localtunnel` client software to connect to the Gitlab.com `localtunnel` service. There would be a Kubernetes `secret` object with some credentials from Gitlab.com (API token?) to allow the client to authenticate to the server, and for Gitlab.com to formally map the tunnel to a particular user/group. The `localtunnel` client would then map the connection from the Gitlab.com localtunnel server to port 8001 on localhost (which is where `kubectl proxy` would be listening).
- When Gitlab.com wishes to talk to the Kubernetes API for a user/group, in order to obtain information or perform an action, it would instead use the localtunnel endpoint it is aware of locally for that cluster, talking to the localtunnel service, which would then proxy the request to the localtunnel client. The localtunnel client would proxy the request to `kubectl`, which would proxy it to the Kubernetes API server itself (with credentials wrapped).
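To make the pod layout concrete, here is a minimal sketch of what `gitlab-connector-pod` might look like, written with the Kubernetes Python client. The container images, namespace, service account name, and secret name are placeholders invented for illustration (and the container names use hyphens, since Kubernetes does not allow underscores in container names); nothing here is an existing Gitlab artifact.

```python
# Sketch: create a two-container connector pod, assuming the `kubernetes`
# Python client is installed and a kubeconfig is available. One container
# runs `kubectl proxy`; the other runs a tunnel client that authenticates
# to Gitlab.com with a token stored in a Kubernetes secret.
from kubernetes import client, config

config.load_kube_config()  # run from the cluster admin's workstation

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gitlab-connector-pod"),
    spec=client.V1PodSpec(
        # Service account granting whatever permissions Gitlab.com should have.
        service_account_name="gitlab-connector",
        containers=[
            client.V1Container(
                name="kubectl-container",
                image="bitnami/kubectl:latest",  # placeholder image
                command=["kubectl"],
                # Expose the API server on localhost:8001, wrapped with the
                # pod's service account credentials.
                args=["proxy", "--port=8001", "--address=127.0.0.1"],
            ),
            client.V1Container(
                name="localtunnel-client",
                image="registry.example/gitlab-connector/tunnel-client:latest",  # placeholder
                env=[
                    # Token issued by Gitlab.com, used to authenticate the tunnel
                    # and map it to a particular user/group.
                    client.V1EnvVar(
                        name="GITLAB_TUNNEL_TOKEN",
                        value_from=client.V1EnvVarSource(
                            secret_key_ref=client.V1SecretKeySelector(
                                name="gitlab-connector-token", key="token"
                            )
                        ),
                    ),
                ],
            ),
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="gitlab-connector", body=pod)
```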
With appropriate security considerations at the Gitlab.com end, Gitlab.com can now seamlessly integrate with a Kubernetes cluster that is not exposed to the internet, and has no knowledge of the actual security credential that is used to talk to Kubernetes.