GitLab.com on Kubernetes
This epic is a collection of epics that track the progress of migrating individual GitLab components to the Kubernetes platform. To learn more about the motivation for migrating to Kubernetes and the business impact of this initiative, please see the [handbook page](https://about.gitlab.com/handbook/engineering/infrastructure/production/kubernetes/gitlab-com/).
The [Kubernetes migration overview dashboard](https://dashboards.gitlab.net/d/delivery-k8s_migration_overview/delivery-kubernetes-migration-overview?orgId=1&refresh=5m) gives an overview of the available services and compares the virtual machines to the Kubernetes cluster. As the migration continues, we'll see the number of VMs decrease as the number of pods increases.
### Status 2022-12-12
The work of migrating GitLab components and services to run on Kubernetes has concluded.
The list of services migrated is available [here](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/112#timeline-of-migrated-services).
On `2022-11-14`, the [Kubernetes Migration WG](https://docs.google.com/document/d/1dbJZNAiTVvwJ9ICu10FpxP9AaAVDXDVkATmpzSONztE/edit#bookmark=id.ptm4neleb7q8) decided not to migrate Gitaly and Praefect in their current form, in light of the new [Gitaly Cluster architectural direction](https://docs.google.com/document/d/13dTh0AGCHjM9BSf80koqtUSLELUiZfLd7NWrx7m6NOE/edit#heading=h.rkparaok4fk3). This decision will be revisited at the end of Q2 (July 2023).
Note: the PostgreSQL infrastructure (including Patroni, PGBouncer, etc.) was not part of the scope of this migration.
In addition to the migrated services, we completed [several supporting epics and issues](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/112#completed-epicsissues) to enable the migrations.
With the above achieved, we are closing this epic and considering this effort complete. Thanks, everybody, for the hard work over the last three years to accomplish this.
### :white_check_mark: Completed
#### Timeline of migrated services
| Service | Change Issue | Completed | Summary |
|-----------------------|-----------------------------------------------------------------|------------|---------|
| Migrate sidekiq queues | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/89 | | See https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/447 |
| registry | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/1074 | 2019-08-30 | |
| PlantUML | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/1198 | 2019-09-27 | |
| mailroom | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/1383 | 2019-11-21 | |
| memory-bound | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2058 | 2020-05-30 | |
| elasticsearch | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2144 | 2020-05-15 | |
| low-urgency-cpu-bound | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2244 | 2020-06-09 | |
| urgent-other | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2254 | 2020-06-16 | |
| database-throttled | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2378 | 2020-07-07 | |
| gitaly-throttled | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2378 | 2020-07-07 | |
| urgent-cpu-bound | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2360 | 2020-07-14 | Currently running 30 pods with the ability to scale up to 84 and seeing similar performance to the VMs. Full analysis post-migration https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/262#note_379202184 |
| websockets | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2702 | 2020-08-15 | |
| Git https | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/2818 | 2020-10-26 | |
| Git ssh | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/229 | 2020-12-01 | |
| Websockets (interactive terminal and Actioncable) | | 2021-02-24 | |
| API Service | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/4577 | 2021-06-28 | |
| Observability and troubleshooting to support web migration | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/340 | 2021-07-22 | |
| Web traffic on Kubernetes | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/272 | 2021-09-23 | Migration of the web-fleet into Kubernetes |
| Pages | https://gitlab.com/gitlab-com/gl-infra/production/-/issues/5915 | 2021-12-01 | |
| Camoproxy | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/90 | 2022-08-05 | |
#### Completed Epics/Issues
|Topic| Epics/Issues | Note |
|-----|--------------|------|
|Migrate project_export Sidekiq queue | &143 | |
| Enable auto-deploy | &125 & &149 | |
| Mailroom Migration | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/95 | |
| Container Registry Migration | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/70 | |
|Pre-Migration tech debt cleanup| https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/79 | |
| Secrets SSOT between Kubernetes and Chef| https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/167 | |
| Stabilize auto-deploy for Kubernetes | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/217 | [Board](https://gitlab.com/gitlab-com/gl-infra/delivery/-/boards/1696436) |
| Git https nodes on Kubernetes | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/228 | |
| Helm 3 Upgrade | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/370 | |
| Enable GitLab Kubernetes Agents (KAS) in Production | https://gitlab.com/gitlab-org/gitlab/-/issues/249596 | |
| Websockets (interactive terminal and Actioncable) | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/355 | |
| API traffic on Kubernetes | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/271 | |
| Reduce the risk of rolling out K8s-workloads config changes| https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/538 | |
| Web traffic on Kubernetes | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/272 | |
| Upgrade to Helm 3.7 | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/595 | |
| Pages | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/273 | |
| Remove All external data sources from gitlab-com (excluding auto-deploy versions) | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/674 | |
| Modify Kubernetes deployment workflow | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/582 | |
| Running kubectl and helm against production systems | https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/263 | |
### :anchor: Other services/components on VMs not part of the migration
| Service/Component | Notes |
|-------------------|-------|
| Redis | Provisioning new deployments happening on Kubernetes. |
| HAProxy | |
| VMs Runner Manager | |
| Praefect | |
| Patroni | |
| PGBouncer | |
| Postgres Database Servers | |
| Monitoring Infrastructure | Only a part of it |
| Consul | We have client side agents on Kubernetes, but not the Consul cluster servers |
| Console nodes | |
| Deploy nodes | |
### :books: Important links and Migration Demos
* [k8s-workloads/gitlab-com](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com): Contains the GitLab.com configuration for the [GitLab Helm chart](https://gitlab.com/gitlab-org/charts/gitlab).
* [k8s-workloads/gitlab-helmfiles](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-helmfiles/): Contains the configuration for all namespaces outside of GitLab, including logging, monitoring, etc.
* [gitlab-com-infrastructure](https://gitlab.com/gitlab-com/gitlab-com-infrastructure/): Terraform configuration for the cluster. All resources necessary to run the cluster are configured here, including the cluster, node pools, service accounts, and IP address reservations.
* [Kubernetes migration overview dashboard](https://dashboards.gitlab.net/d/delivery-k8s_migration_overview/delivery-kubernetes-migration-overview?orgId=1&refresh=5m) gives an overview of the available services and compares the virtual machines to the Kubernetes cluster.
* [GitLab.com migration to k8s demos YouTube playlist](https://www.youtube.com/playlist?list=PL05JrBw4t0KoSVfGsIL3sES4QjQQTpc63)