Stable disk paths for repositories in GitLab
The way we currently store repositories on a GitLab server is:
- there are storage shards, each of which has a path: /home/git/repositories
- repositories have a user-provided namespace, which may contain slashes (slashes are a 9.0 feature): foo/bar
- repositories have a user-provided name: baz.git
- the total path looks like: /home/git/repositories/foo/bar/baz.git
This has a number of problems.
- if a user creates project 1 with name baz, renames it to qux, and creates project 2 with name baz, then there are failure scenarios where the repository on disk at baz.git still exists and the creation of the second project fails
- some filesystems impose a maximum total length on paths such as /home/git/repositories/foo/bar/baz.git, and because we create paths based on user-provided strings we cannot defend well against exceeding that maximum total path length: the path to a GitLab repository has an unbounded length
- to avoid very wide directory trees (10,000 namespaces? your /home/git/repositories will have 10,000 entries) you are forced to use 'storage shards'
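To make the first two problems concrete, here is a minimal sketch of how the current path is put together. The function name `current_repo_path` is hypothetical and only mirrors the description above; it is not GitLab's actual code.

```python
import posixpath


def current_repo_path(shard: str, namespace: str, name: str) -> str:
    """Hypothetical helper: join shard path, user-provided namespace and name."""
    return posixpath.join(shard, namespace, name + ".git")


shard = "/home/git/repositories"

# The path is derived from the project name, so a rename changes it.
before = current_repo_path(shard, "foo/bar", "baz")
after = current_repo_path(shard, "foo/bar", "qux")

# Nothing bounds the length: every user-provided segment makes the path longer.
deep_namespace = "/".join(["group"] * 100)
long_path = current_repo_path(shard, deep_namespace, "baz")
```

Because `before` and `after` differ, a rename that fails halfway can leave `baz.git` behind on disk, and `long_path` grows with whatever namespace the user supplies.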
I propose that we move to disk paths that:
- are stable over the lifetime of a project (they never change)
- have a more or less bounded length
- naturally create a reasonably balanced directory tree
Proposal 1
The storage shard path is /home/git/repositories. Paths are based on the (numeric) database ID.
- project 1234 gets stored in /home/git/repositories/12/34/1234.git
- project 123 -> /home/git/repositories/01/23/0123.git
- project 12345 -> /home/git/repositories/23/45/12345.git
- project 123456 -> /home/git/repositories/34/56/123456.git
We subdivide based on the last four digits of the ID (IDs shorter than four digits are zero-padded, as with 0123 above) to get even directory tree balancing over time. We can tweak the magic number 4 in 'last four digits' to target a different long-term number of projects.
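The mapping above can be sketched roughly as follows. This is an illustration of the proposal, not an implementation. The name `stable_repo_path` and the `digits` parameter are hypothetical, and splitting the digits into two-digit directory levels for values other than 4 is an assumption extrapolated from the four examples.

```python
import posixpath


def stable_repo_path(project_id: int,
                     shard: str = "/home/git/repositories",
                     digits: int = 4) -> str:
    """Hypothetical sketch of Proposal 1: derive a disk path from the project ID."""
    if project_id < 0:
        raise ValueError("project_id must be non-negative")
    if digits < 2 or digits % 2 != 0:
        raise ValueError("digits must be a positive even number")
    # Zero-pad short IDs so there are always at least `digits` digits (123 -> "0123").
    padded = str(project_id).zfill(digits)
    # Only the last `digits` digits decide the directory levels.
    tail = padded[-digits:]
    # Split the tail into two-digit directory names: "1234" -> ["12", "34"].
    levels = [tail[i:i + 2] for i in range(0, digits, 2)]
    return posixpath.join(shard, *levels, padded + ".git")


print(stable_repo_path(1234))
print(stable_repo_path(123))
print(stable_repo_path(12345))
print(stable_repo_path(123456))
```

With the default of four digits there are 100 x 100 = 10,000 leaf directories, so N projects with sequential IDs put roughly N / 10,000 repositories in each leaf. The path depends only on the ID, so renaming a project or namespace leaves it unchanged.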
Questions
- do we want this?
- what should it look like?
- how do we handle the migration?