[META] Move to object storage instead of direct file access and shared filesystems
Since we want to move to a cloud-native architecture, we need to seriously reconsider our use of shared filesystems such as NFS or CephFS.
Currently we use them both for Git access and for shared files such as CI build logs, uploads, etc.
As of today, our only NFS file storage server is taking a lot of write load and stores circa 18 TB of data. We have no tooling to shard it beyond creating more NFS servers and splitting by root folder (uploads, logs, etc.).
Our main access pattern is a really high write rate, and we have no way to track or manage which elements of our storage make sense to keep.
It would be easier to scale our storage if the application did not deal with files directly, but instead talked to a service that supports the basic primitives of get, put, and append.
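To make the idea of a storage service with three primitives concrete, here is a minimal Python sketch. The names `ObjectStore`, `InMemoryStore`, and `write_build_log`, and the `logs/<job_id>` key layout, are hypothetical and only for illustration; a real backend would wrap an actual object storage service rather than a dict.

```python
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Hypothetical interface exposing only get, put, and append."""

    def get(self, key: str) -> bytes: ...
    def put(self, key: str, data: bytes) -> None: ...
    def append(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Toy backend for illustration only; keeps every object in a dict."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        # Raises KeyError when the object does not exist.
        return self._objects[key]

    def put(self, key: str, data: bytes) -> None:
        # Overwrites any existing object stored under the same key.
        self._objects[key] = data

    def append(self, key: str, data: bytes) -> None:
        # Creates the object if it is missing, otherwise extends it.
        self._objects[key] = self._objects.get(key, b"") + data


def write_build_log(store: ObjectStore, job_id: int, chunk: bytes) -> None:
    """Application code depends on the interface, not on a filesystem path."""
    store.append(f"logs/{job_id}", chunk)
```

Because callers only see the three methods, the backend could be swapped or sharded by key without touching application code.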
Additionally, by moving to object storage we can code exponential backoffs and/or timeouts when reading these resources, to prevent storage access from taking the application down when we have a provider problem, as has already happened multiple times with NFS.
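A rough sketch of the exponential backoff mentioned above, in Python. `StorageUnavailable` and `with_backoff` are hypothetical names, and the retry count and delay values are arbitrary assumptions; per-request timeouts would be configured in whatever storage client is used and are not shown here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class StorageUnavailable(Exception):
    """Hypothetical error raised when the storage service cannot be reached."""


def with_backoff(
    operation: Callable[[], T],
    retries: int = 4,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on StorageUnavailable with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return operation()
        except StorageUnavailable:
            if attempt == retries:
                # Out of retries: re-raise so the caller can degrade gracefully
                # instead of blocking on a broken storage provider.
                raise
            # Delay doubles on each attempt and is capped at max_delay.
            sleep(min(base_delay * (2 ** attempt), max_delay))
    raise AssertionError("unreachable")
```

A caller would wrap a single read, for example `with_backoff(lambda: store.get("logs/1"))`, and handle the final `StorageUnavailable` by showing a degraded page rather than hanging.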
On top of this, we are currently discussing implementing file replication for disaster recovery (DR) on Geo, so this could be the right opportunity to make this change and detach from a shared POSIX filesystem. That would simplify containerizing the application and let us scale storage separately from the application.
cc/ @yorickpeterse @smcgivern @DouweM @rspeicher @stanhu
Related Issues

- Make CI artifacts to be stored on object storage https://gitlab.com/gitlab-org/gitlab-ee/issues/2965
- Store Git LFS files in object storage https://gitlab.com/gitlab-org/gitlab-ee/issues/2841
- Store issue attachments in object storage
- Store logs in object storage

