Workstream: Cloud Native-compatible GitLab application
Label: ~"Workstream:RequiredAppChanges"
Epic: gitlab-org&12
The GCP Migration project has a secondary goal of transforming the architecture of GitLab.com from its current form (Omnibus installed on individually managed virtual machines) to cloud native (Helm orchestrating Kubernetes running in a Google Cloud-managed GKE instance).
Much of the cloud-native effort is being driven through the Helm Charts project but is a requirement for this project too.
There are, however, a number of changes that need to be made to the GitLab application code in order to enable this.
Making the GitLab application cloud-native is crucial to the success of the Helm Charts project and the GCP Migration project.
Additionally, proper cloud-native support and good Helm chart support are becoming key features for some prospective clients. (source: @pidge)
The changes required can be broken down into the following areas:
CI/Platform/Discussion: Direct to Object Storage
GitLab currently relies on shared filesystem access in order to operate in a cluster. On GitLab.com, we use Network File System (NFS) mounts to share volumes between servers. This has some issues:
- NFS does not scale
- NFS is fragile and leads to outages
- NFS is difficult to manage through Helm and precludes the use of GKE for GitLab.com or any of our clients.
Over the past year, several efforts have been made to allow GitLab EEP to support object storage for some file types.
Several other efforts are also underway. Unfortunately most of these efforts currently rely on what could be called "indirect" object storage, which works as follows:
- File is uploaded to GitLab (via API, LFS, etc) and written to a shared volume
- File is served from the shared volume
- A cron job periodically moves all files from the shared volume to object storage and then updates all references to point to the new location in object storage.
This approach is insufficient as it still relies on a shared volume, even if only temporarily.
We need to move to a "Direct to Object Storage" approach, which involves all file uploads being sent directly to object storage to avoid shared volumes.
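The difference between the two flows can be sketched as follows. This is a minimal illustration in Python, not GitLab's implementation: `ObjectStore`, `upload_indirect`, `migrate_cron`, and `upload_direct` are hypothetical names, and an in-memory dictionary stands in for a real object storage bucket.

```python
import tempfile
from pathlib import Path


class ObjectStore:
    """In-memory stand-in for an object storage bucket (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data
        return f"object://{key}"


def upload_indirect(shared_dir, name, data, references):
    """Indirect flow, step 1: write to the shared volume and reference it there."""
    path = Path(shared_dir) / name
    path.write_bytes(data)
    references[name] = f"file://{path}"


def migrate_cron(shared_dir, store, references):
    """Indirect flow, step 2: a periodic job moves files and rewrites references."""
    for path in sorted(Path(shared_dir).iterdir()):
        if not path.is_file():
            continue
        references[path.name] = store.put(path.name, path.read_bytes())
        path.unlink()


def upload_direct(store, name, data, references):
    """Direct flow: the upload goes straight to object storage, no shared volume."""
    references[name] = store.put(name, data)


if __name__ == "__main__":
    store, refs = ObjectStore(), {}
    with tempfile.TemporaryDirectory() as shared:
        upload_indirect(shared, "a.zip", b"artifact", refs)
        print(refs["a.zip"])  # still points at the shared volume
        migrate_cron(shared, store, refs)
        print(refs["a.zip"])  # now points at object storage
    upload_direct(store, "b.zip", b"artifact", refs)
    print(refs["b.zip"])
```

In the indirect flow, every node that may serve the file before the migration job runs needs access to the shared directory. The direct flow never touches it, which is the property the Helm-based deployment needs.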
Related Issues
Platform
Lead: @DouweM, PM: @mydigitalself / @jramsay
- LFS Object Storage support
Discussion
Lead: @smcgivern, PM: @victorwu
- Move Attachments to Object Storage
  - https://gitlab.com/gitlab-org/gitlab-ee/issues/3905
  - This may not be sufficient, as the current implementation writes to local disk first, then uses a cron job to move the assets to object storage
CI
Lead: @ayufan, PM: @bikebilly
- Artifacts Direct to CI
- Build Logs on Object Storage
- @ayufan wrote: The plan for Artifacts and logs is as follows:
  - Add multiple artifacts per-build: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/14367
  - Async upload artifacts to object storage: https://gitlab.com/gitlab-org/gitlab-ee/issues/3346
  - Migrate traces to be artifacts: https://gitlab.com/gitlab-org/gitlab-ce/issues/34317#note_38120715
  - Make Runner upload artifacts directly to Object Storage: https://gitlab.com/gitlab-org/gitlab-ee/issues/2348
Platform: Improved Asset Management
As part of the asset pipeline, send assets to an object storage bucket, to be served by a CDN.
Currently we work around this by using sticky HTTP sessions, so that assets are served by the same host as the application.
This will no longer be possible in the k8s world, so we need to come up with a solution.
This can be done early on and will allow us to re-enable the CDN, meaning improved performance on GitLab.com and lower egress fees.
- Improved Asset Management
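One way to picture the publishing step is sketched below. This is an assumption-laden sketch, not the planned implementation: it assumes assets are stored under keys that embed a digest of their contents (the source does not specify a naming scheme), `CDN_HOST` is a made-up hostname, `fingerprinted_key` and `publish_assets` are hypothetical helpers, and a plain dictionary stands in for the bucket.

```python
import hashlib
from pathlib import PurePosixPath

CDN_HOST = "https://assets.example.com"  # hypothetical CDN hostname


def fingerprinted_key(logical_path, content):
    """Build a bucket key that embeds a digest of the file contents."""
    digest = hashlib.sha256(content).hexdigest()[:16]
    path = PurePosixPath(logical_path)
    return str(path.with_name(f"{path.stem}-{digest}{path.suffix}"))


def publish_assets(assets, bucket):
    """Upload compiled assets to the bucket and return a manifest.

    `assets` maps logical paths to bytes; `bucket` is a dict standing in
    for an object storage bucket. The manifest maps logical paths to CDN URLs.
    """
    manifest = {}
    for logical_path, content in assets.items():
        key = fingerprinted_key(logical_path, content)
        bucket[key] = content
        manifest[logical_path] = f"{CDN_HOST}/{key}"
    return manifest


if __name__ == "__main__":
    bucket = {}
    print(publish_assets({"assets/application.js": b"release 1"}, bucket))
    print(publish_assets({"assets/application.js": b"release 2"}, bucket))
    print(sorted(bucket))  # both releases coexist under different keys
```

Under this scheme, two releases of the same file get different keys and coexist in the bucket, so a page rendered by any application host can load its matching assets without relying on sticky sessions.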
CI/CD: Pages
GitLab Pages also uses shared file systems in order to communicate between components, although the mechanism is quite different from file uploads.
This issue is being driven through #29 (closed).
@nick.thomas wrote a description of the problem here: #31 (comment 47023983)
the only interaction point between gitlab-rails and gitlab-pages-the-daemon right now is the filesystem mount: shared/pages. gitlab-rails runs sidekiq jobs that run unzip processes that change the contents of that directory, and then writes junk to shared/pages/.update. gitlab-pages polls that file and re-scans shared/pages in the background when it changes.
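The marker-file mechanism described in the quote can be illustrated with a small sketch. This is not the gitlab-pages code: `PagesWatcher` and `deploy_site` are hypothetical names, the "re-scan" simply lists top-level directories, and a real daemon would call `poll_once` on a timer.

```python
import os
import tempfile
from pathlib import Path


class PagesWatcher:
    """Re-scans the pages root whenever the marker file's contents change."""

    def __init__(self, pages_root):
        self.pages_root = Path(pages_root)
        self.marker = self.pages_root / ".update"
        self.last_seen = None
        self.sites = []

    def poll_once(self):
        """Return True if this poll triggered a re-scan."""
        try:
            current = self.marker.read_bytes()
        except FileNotFoundError:
            return False
        if current == self.last_seen:
            return False
        self.last_seen = current
        # Re-scan: list top-level directories as a stand-in for site discovery.
        self.sites = sorted(p.name for p in self.pages_root.iterdir() if p.is_dir())
        return True


def deploy_site(pages_root, name, token):
    """Stand-in for the Rails side: change the directory, then rewrite the marker."""
    root = Path(pages_root)
    (root / name).mkdir(parents=True, exist_ok=True)
    (root / ".update").write_bytes(token)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        watcher = PagesWatcher(root)
        print(watcher.poll_once())  # no marker file yet
        deploy_site(root, "group-a", os.urandom(16))  # random bytes as the "junk"
        print(watcher.poll_once(), watcher.sites)
        print(watcher.poll_once())  # marker unchanged since the last poll
```

Both sides must see the same directory for this to work, which is why Pages depends on a shared filesystem even though no upload is involved.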
Owner Unknown: Authorized Keys Command
- GitLab-Shell Authorized Keys EE-to-CE port: relevant because the Helm Charts support CE installs. This is not so much of an issue for the GCP Migration, as we only use EE.
# Assumptions around Helm Charts for CE
Helm Charts are more difficult to develop for GitLab CE primarily because object storage is an EEP feature.
However, the GCP Migration Project does not require CE support and adding it would push the deadlines back substantially.
Current plan: https://gitlab.com/charts/helm.gitlab.io/issues/17, as discussed with @marin and @joshlambert:
Object Storage remains an EE feature; GitLab CE uses an as-yet unspecified file-system sharing solution for Kubernetes (possibly Rook, minfs, glusterfs?) in order to containerise the GitLab monolith.