Storage on Kubernetes

Cloud native storage

As we move toward a more cloud native architecture, one hurdle stands out: most types of content still need common shared storage. For example, an API node and a Unicorn node may require access to the same build log or artifact.

This raises the challenge of supporting a shared storage service within Kubernetes, which is not available out of the box (OOTB). To guide our decision, we have a few main goals:

  • Deploying GitLab should continue to be as easy as helm install gitlab. In other words, it should be deployable to Kubernetes without external requirements.
    • For example, we shouldn't require an external PostgreSQL database or some other external storage solution.
  • Running a turnkey GitLab HA cluster should continue to require EEP, as HA has historically been one of the major value drivers for this edition.

Product capabilities

There are a few things to keep in mind about the product as well:

  • Object storage is desirable for large-scale deployments, and it brings additional benefits for Geo
  • Object storage is EEP only
  • At present, object storage still requires local shared storage, because a background job lazily moves the content up to object storage later.

Because of the above, we will go with object storage for EEP deployments. This is also what GitLab.com will be running on.

  • This does mean a third-party object storage system becomes a requirement for EEP, unless we package our own via Minio or Rook.

This leaves us with a decision on what to do for CE and EES.

Considerations

This introduces a few key questions:

  • How do we handle CE with a shared storage requirement?
    • Running HA is getting easier and easier, especially with managed services like Aurora and ElastiCache. Shared storage has been the main outlier in the past.
  • If not object storage, then what should be used?
    • Neither object storage nor NFS ships with Kubernetes (k8s)
    • Each needs to be configured early in the configuration phase
  • What does the "upgrade" path look like from CE/EES to EEP?
  • What do we do to eventually provide our own object storage for EEP?

Options

NFS-like option for CE/EES

For CE and EES instances (which do not have object storage support), we can consider using an NFS-like solution within Kubernetes:

  • For OOTB support, one solution that can provision shared volumes with vanilla Kubernetes (no modification of the cluster) is Rook. It is still early in development, but remains an option.
  • We can then also support externalizing an NFS share, for someone who wants to bring their own.

This may be enough to provide a workable solution for smaller-scale GitLab installations. However, we would still need to "own" this solution and support it, which could require considerable effort depending on its stability. The Rook project does seem quite active, though, with thousands of lines of code added every week: https://github.com/rook/rook/graphs/code-frequency.
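To make the shared volume idea concrete: in Kubernetes terms, "shared storage" means a volume claim with the ReadWriteMany access mode, so that pods on different nodes can mount it at the same time. The sketch below builds such a claim as a plain dictionary and prints it as JSON, which kubectl accepts as a manifest. The claim name, size, and storage class name are assumptions made for illustration; they are not taken from any existing chart, and the storage class would have to be backed by whichever provisioner is eventually chosen.

```python
import json


def shared_volume_claim(name, size, storage_class):
    """Build a PersistentVolumeClaim manifest for a volume that pods on
    many nodes can mount read-write at the same time."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany is the access mode that makes the volume "shared".
            "accessModes": ["ReadWriteMany"],
            # Hypothetical class name; the chosen provisioner must back it.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # All three arguments are illustrative values, not chart defaults.
    claim = shared_volume_claim("gitlab-shared", "50Gi", "shared-nfs")
    print(json.dumps(claim, indent=2))
```

Someone bringing their own NFS share would supply a matching volume instead of relying on dynamic provisioning; the claim itself would look the same.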

Move object storage to CE

Another option is to simply move object storage support to CE. However, this would make it very easy to run HA clusters with CE, and would erode a key value proposition of EEP: high availability.

This also means we would need to provide an OOTB solution for object storage within the k8s cluster, without relying on external services.

Separate charts for CE/EES and EEP

Another option is to restrict the full cloud native chart to EEP users only. CE and EES users can continue to use the existing monolithic container/chart.

  • Does not push CE towards easy HA
  • Whether this is extra work is debatable, as maintaining a Rook-like NFS service is also significant overhead
  • Means we would continue to maintain the current chart.
  • Upgrading to EEP/CN requires a reinstall and migration.

Suggested interim solution - cloud native chart EEP only (for now)

While we work on a plan for what to do with CE and EES, I'd recommend we push forward with our plans to leverage object storage with EEP+ customers. This is what we need for GitLab.com, and we want to get this chart out in a usable form for testing ASAP.

We will keep the surrounding needs of CE/EES in mind as we continue to make decisions, but won't actively work on a CE-oriented solution at this time.

As we make progress here, we can keep an eye on projects like Rook and Minio:

  • Waiting gives these projects an extra couple of months to stabilize further
  • Rook in particular, with its NFS-like storage, is an interesting candidate

If Rook, or something like it, provides a stable enough NFS service, it can then act as the baseline storage solution for all editions. This means that object storage becomes optional (but strongly recommended) for EEP, and can be easily enabled at any point in time. It also means that object storage is no longer a hard requirement for the initial chart.

Current plan for Q4 beta MVP

  • All content except Git repos lives on object storage
  • Repos continue to live on locally attached disks with Gitaly
  • An EEP license must be passed to Helm at installation, which removes the need for any shared storage. (GitLab.com will not have NFS.)
  • Packaged NFS/object storage via Rook or Minio is left for future work. The initial release requires a third-party object storage service.
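
The placement rule in the plan above can be sketched as a small lookup: Git repositories go to Gitaly's locally attached disks, and everything else goes to object storage. This is only an illustration of the rule, not GitLab code; the content type labels and the names Backend and backend_for are hypothetical.

```python
from enum import Enum


class Backend(Enum):
    """Storage backends named in the Q4 beta plan."""
    OBJECT_STORAGE = "object_storage"
    GITALY_LOCAL_DISK = "gitaly_local_disk"


# Hypothetical label for repository content; not an identifier from GitLab.
GIT_REPOSITORY = "git_repository"


def backend_for(content_type: str) -> Backend:
    """Return where a given type of content lives under the plan."""
    if content_type == GIT_REPOSITORY:
        # Repos stay on locally attached disks served by Gitaly.
        return Backend.GITALY_LOCAL_DISK
    # Everything else (build logs, artifacts, ...) goes to object storage.
    return Backend.OBJECT_STORAGE


if __name__ == "__main__":
    for kind in ("git_repository", "build_log", "artifact"):
        print(kind, "->", backend_for(kind).value)
```

Because no content type maps to a shared filesystem, nothing in this layout needs NFS.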