Assign a tag to Gitaly storage
Problem to solve
On GitLab.com we want to keep certain projects isolated from others to meet SLAs, and to move low-activity projects to less powerful nodes. Currently this has to be handled manually. Automating it requires a basic way for GitLab to represent these storage distinctions.
Further details
Some customers may have additional requirements for where a repository may reside (i.e. which storage location). These requirements may involve complex rules, such as:
- Repo1's primary replica cannot live on the same storage as Repo2's primary replica
- All replicas of Repo1 should reside on SSD
- By default, all repos should always reside on cheap storage
Ideally, we don't want to open up the Praefect internals to admins any more than necessary. In this case, the undesired internals are the actual storage locations that Praefect manages. Allowing a customer to specify the storage locations for a repo's replicas undermines Praefect's ability to swap out storage locations to satisfy high availability requirements.
One alternative strategy would be to allow infrastructure teams to label/tag the storage nodes they configure for HA. For example, a storage node could have 0 or more tags like: [SSD, cheap, expensive, fast, slow]
Repos could have corresponding rules that allow Praefect to make decisions about which storage locations are appropriate for each repo.
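As a minimal sketch of how such rules might be evaluated, assume each storage node carries a set of tags and each repo lists the tags its replicas require, plus any storages it must avoid. The names StorageNode, RepoRule, and eligible_storages, as well as the storage names, are hypothetical and are not part of Praefect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageNode:
    """Hypothetical storage node with zero or more admin-assigned tags."""
    name: str
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class RepoRule:
    """Hypothetical placement rule for one repo."""
    required_tags: frozenset = frozenset()
    # Storages to avoid, e.g. the one holding another repo's primary replica.
    excluded_storages: frozenset = frozenset()


def eligible_storages(nodes, rule):
    """Return names of nodes that carry every required tag and are not excluded."""
    return [
        node.name
        for node in nodes
        if rule.required_tags <= node.tags
        and node.name not in rule.excluded_storages
    ]


nodes = [
    StorageNode("storage-1", frozenset({"SSD", "fast", "expensive"})),
    StorageNode("storage-2", frozenset({"cheap", "slow"})),
    StorageNode("storage-3", frozenset({"SSD", "cheap"})),
]

# "All replicas of Repo1 should reside on SSD"
repo1_rule = RepoRule(required_tags=frozenset({"SSD"}))
print(eligible_storages(nodes, repo1_rule))  # ['storage-1', 'storage-3']
```

With this shape, the admin only ever names tags; which concrete storage is picked from the eligible list stays a Praefect decision.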
Proposal
A GitLab administrator should be able to configure the storage tag for Gitaly nodes from /etc/gitlab/gitlab.rb
- The list of valid storage tags will also need to be made available to the GitLab interface, to allow gitlab#216221
The purpose of the storage tag will be to ensure that when repositories are moved from shard to shard, the target shard has a matching tag.
For example, projects with a marquee tag can only live on a shard that also has the marquee tag. This prevents them from being moved to one of the shards for stale or archived projects.
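The check described above could be sketched roughly as follows. The function name can_move and the shard names are hypothetical, and the handling of untagged projects (allowed on any shard) is an assumption that this proposal does not specify.

```python
def can_move(project_tag, target_shard_tags):
    """Allow a move only if the target shard carries the project's tag.

    Assumption: a project with no tag (None) may move to any shard.
    """
    if project_tag is None:
        return True
    return project_tag in target_shard_tags


# Hypothetical shards and their configured storage tags.
shard_tags = {
    "shard-marquee-01": {"marquee"},
    "shard-archive-01": {"archived"},
}

print(can_move("marquee", shard_tags["shard-marquee-01"]))  # True
print(can_move("marquee", shard_tags["shard-archive-01"]))  # False
```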