Determine how to migrate GitLab Shell (Implementation Plan)

Problem Statement

We need to determine how to migrate from the original OpenSSH daemon to the gitlab-ssh daemon. GitLab Shell is currently deployed as a single deployment in each of our zonal Kubernetes clusters. Enabling the gitlab-ssh daemon is controlled by a single feature flag implemented in gitlab/charts. The Helm chart has no capability to roll this change out gradually, nor to target a subset of Pods with a different configuration.

How do we slowly roll this out into production safely?

Solutions

Option 1

Create a second gitlab release in each of our zonal clusters with only gitlab-shell enabled, differing from the existing release in that the gitlab-ssh daemon is enabled. This lets us add a second entry to our HAProxy configuration and use weights to shift traffic to the newly configured gitlab-shell gradually. The difficulty is that this is a massive undertaking: we must ensure the new release has appropriate observability, and we must modify auto-deploy so that it keeps the new release updated.
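
To make the weighted rollout idea concrete, here is a minimal Python sketch of how relative weights translate into an expected traffic split. It is a model for reasoning about the ramp, not HAProxy configuration. The backend names, the `rollout_steps` helper, and the example percentages are all hypothetical. It assumes connections are distributed in proportion to weight, and the scale of 100 is chosen to stay within HAProxy's 0-256 server weight range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str    # hypothetical backend name, for illustration only
    weight: int  # relative weight, analogous to an HAProxy server weight


def traffic_share(backends):
    """Return each backend's expected fraction of new connections,
    assuming connections are distributed in proportion to weight."""
    total = sum(b.weight for b in backends)
    if total <= 0:
        raise ValueError("at least one backend must have a positive weight")
    return {b.name: b.weight / total for b in backends}


def rollout_steps(percentages, scale=100):
    """Yield one pair of backends per rollout step.

    `percentages` is the share of traffic intended for the new
    gitlab-ssh daemon backend at each step (hypothetical schedule).
    """
    for pct in percentages:
        if not 0 <= pct <= 100:
            raise ValueError(f"percentage out of range: {pct}")
        new_weight = pct * scale // 100
        yield [
            Backend("gitlab-shell-openssh", scale - new_weight),
            Backend("gitlab-shell-sshd", new_weight),
        ]


if __name__ == "__main__":
    # Example ramp; the actual steps would be agreed on in the rollout issues.
    for step in rollout_steps([1, 5, 25, 50, 100]):
        print(traffic_share(step))
```

Keeping the two weights summing to a fixed scale means each step changes only the ratio, so a rollback amounts to reapplying the previous step's weights.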

Option 2

Leverage our regional cluster: we could re-enable gitlab-shell on the regional cluster with the gitlab-ssh daemon enabled. As in Option 1, we can create a second entry in HAProxy and slowly send traffic to this cluster. This avoids the auto-deploy and observability headaches of Option 1, at the expense of temporarily driving up our cloud bill (until the migration is complete). We will need to figure out how to configure GitLab Shell to talk to the API; one possibility is to send that traffic through HAProxy's internal API FQDN.

Option 3

Make changes to our Helm chart to provide a second GitLab Shell deployment. This still has all the observability headaches of Option 1, and it additionally changes how our Helm chart deploys this object, a change the Distribution team would need to support temporarily.

Option 4

...

Milestones

  • List all possible solutions
  • Discuss/Choose a proposed solution
  • Create the necessary issues to execute the move in staging and production
Edited by John Skarbek