Provide multiple git mount points so we can split NFS drives
Proposal
Proposal by @jacobvosmaer-gitlab:
- CE has all the 'shard lookup' code
- CE creates at least one non-null shard by default
- CE randomly puts new projects in the null shard (/home/git/repositories) or a non-null shard (/home/git/repositories/-shard1). This forces the CE code to correctly handle 'legacy projects' (null shard) and projects stored in a shard.
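A minimal sketch of what the 'shard lookup' and random assignment could look like, written in Python for illustration only. The names `SHARDS`, `pick_shard`, and `repository_path` are hypothetical and do not come from the GitLab codebase; only the two directory paths are taken from the proposal above.

```python
import posixpath
import random

# Hypothetical shard table: shard name -> mount point.
# The key None stands for the null shard that holds legacy projects.
SHARDS = {
    None: "/home/git/repositories",
    "shard1": "/home/git/repositories/-shard1",
}


def pick_shard(rng=random):
    """Randomly choose a shard for a new project (null or non-null)."""
    return rng.choice(list(SHARDS))


def repository_path(path_with_namespace, shard=None):
    """Resolve the on-disk repository path for a project in the given shard."""
    return posixpath.join(SHARDS[shard], path_with_namespace + ".git")
```

Because every path goes through one lookup function, a legacy project (shard `None`) and a sharded project are handled by the same code, which is the property the random assignment is meant to exercise.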
Then EE adds:
- tooling to move projects between shards
- smart (non-random) shard assignment for new projects
- etc.
Original issue
We just discussed this today in the infrastructure channel: https://chat.gitlab.com/channel/infrastructure?msg=5WKxXPJGS8LMYoX6Q
Being able to specify where some of the repositories are stored is also beneficial, because we can slowly migrate all repositories to a new storage, and possibly migrate away from this storage.
The idea would be to have multiple mount points (which could be NFS-based for now) and the ability to assign every project to a specific share, move projects between shares using a rake task (or from the GUI), or even configure the application to use a particular share for every new project, so we stop growing the same NFS server all the time.
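As a rough illustration of the idea, the sketch below models a per-project share, a configurable default share for new projects, and a move between shares. It is written in Python and every name in it (`Project`, `repo_dir`, `create_project`, `move_project`, the `shares` mapping) is hypothetical; it assumes nothing writes to the repository while it is being moved, which a real implementation would have to enforce.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Project:
    path_with_namespace: str
    share: str  # name of the share (mount point) holding the repository


def repo_dir(shares, project, share=None):
    """Return the repository directory for a project on a given share."""
    root = shares[share if share is not None else project.share]
    return Path(root) / (project.path_with_namespace + ".git")


def create_project(path_with_namespace, default_share):
    """New projects land on whichever share is configured as the default."""
    return Project(path_with_namespace, default_share)


def move_project(shares, project, target_share):
    """Copy the repository to the target share, repoint the project, then clean up."""
    if target_share == project.share:
        return
    source = repo_dir(shares, project)
    target = repo_dir(shares, project, target_share)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, target)  # copy first so a failure leaves the source intact
    project.share = target_share     # switch the lookup to the new location
    shutil.rmtree(source)            # only now remove the old copy
```

Copying before deleting means an interrupted move leaves the original repository in place, at the cost of temporarily needing space on both shares.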
This way we could avoid having just one SPOF (and have many, heh), but more importantly we could distribute the storage of repos easily while the application keeps running. We could split the heavy load we currently push onto one server across, say, 4 of them, reducing this crazy FS load and improving availability.
This is not a final solution for the storage problem, but it certainly looks like low-hanging fruit that will buy us a lot of time.
After a quick dive in the code by @ayufan it looks like it would not be extremely hard to do it.
Can we spend a bit of time poking holes in this idea to see where it will fail?
cc/ @ayufan @jnijhof @yorickpeterse @northrup @jacobvosmaer-gitlab @DouweM