Some RPCs routed through Praefect always fetch the start commit from a remote repository
Some Gitaly RPCs accept a repository from which to fetch a commit to base the operation on. Two examples are UserCommitFiles and ResolveConflicts, but there seem to be others as well. In UserCommitFiles, this parameter appears to always be set, and the source and target repositories are usually the same.
The original Ruby implementation checks whether the storage name and the relative path match (https://gitlab.com/gitlab-org/gitaly/-/blob/a51521d682c0a424d22c4adba027030d7a8ccb4d/ruby/lib/gitlab/git/remote_repository.rb#L33) and, if so, uses the local repository instead (https://gitlab.com/gitlab-org/gitaly/-/blob/5facf071a4b3c152e6202a0752e7c1b017dcccfe/ruby/lib/gitlab/git/repository.rb#L503). This check does not work in conjunction with Praefect, because Praefect does not rewrite the storage name of the base commit's source repository. In an RPC, Praefect may rewrite the target repository's storage to praefect-gitaly-0 while leaving the source repository's storage as default, or whatever is configured as Praefect's storage name in GitLab. The same-repository check then fails even though both point at the same repository, and Gitaly makes an additional RPC call to itself through Praefect.
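To make the failure mode concrete, here is a minimal sketch of the comparison described above. It is written in Python for illustration only and is not the actual Ruby code; the `Repository` class, the `same_repository` helper, and the shortened relative path are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    """Hypothetical stand-in for the repository message carried in the RPC."""
    storage_name: str
    relative_path: str


def same_repository(target: Repository, source: Repository) -> bool:
    """Treat two repositories as identical only if both fields match,
    which is the behaviour the issue attributes to the Ruby check."""
    return (
        target.storage_name == source.storage_name
        and target.relative_path == source.relative_path
    )


# Placeholder path, not a real hashed repository path.
path = "@hashed/aa/bb/example.git"

# Request sent directly to Gitaly: neither storage name is rewritten.
direct_target = Repository("default", path)
direct_source = Repository("default", path)

# Request sent through Praefect: only the target's storage name is rewritten.
praefect_target = Repository("praefect-gitaly-0", path)
praefect_source = Repository("default", path)

# When the check fails, the start commit is fetched via an extra RPC.
extra_rpc_direct = not same_repository(direct_target, direct_source)
extra_rpc_praefect = not same_repository(praefect_target, praefect_source)

print(f"direct: extra RPC needed = {extra_rpc_direct}")
print(f"via Praefect: extra RPC needed = {extra_rpc_praefect}")
```

In this sketch the relative paths are identical in both cases; only the storage name differs in the Praefect case, which is enough to make the comparison report a different repository.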
Example of a captured request with one storage rewritten by Praefect but the other one not, leading to an extra RPC:
{
  "grpc.meta.client_name": "gitlab-web",
  "grpc.request.deadline": "2020-11-13T11:39:48+01:00",
  "grpc.request.fullMethod": "/gitaly.OperationService/UserCommitFiles",
  "request": {
    "repository": {
      "storage_name": "praefect-internal-0",
      "relative_path": "@hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git",
      "gl_repository": "project-13",
      "gl_project_path": "root/repository"
    },
    "branch_name": "master",
    "start_branch_name": "master",
    "start_repository": {
      "storage_name": "default",
      "relative_path": "@hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git",
      "gl_repository": "project-13",
      "gl_project_path": "root/repository"
    }
  }
}
I figured I'd check how this affects performance. Interestingly, there seems to be a big performance disparity between the RPC when it is run through Praefect and when it is sent directly to a Gitaly node. The difference looks too big to be caused by looping the RPC one extra time. It might be related to the Praefect shard serving some of the big repositories. The graphs below are from UserCommitFiles on gprd.