Question: Behaviour of /gitaly.RepositoryService/RepositoryExists RPC in Gitaly cluster
Support Request for the Gitaly Team
The goal is to keep these requests public. However, if customer information is required for the support request, please be sure to mark this issue as confidential.
This request template is part of Gitaly Team's intake process.
Customer Information
https://gitlab.my.salesforce.com/00161000003RIHPAA4
https://gitlab.zendesk.com/agent/tickets/294970
Installation Size:
Architecture Information:
Slack Channel: https://app.slack.com/client/T02592416/C03JHTSTYKE/thread/CCBJYEWAW-1657634248.263949
Additional Information:
Support Request
Severity
Problem Description
Customer is exercising their test environment using GPT. They ran into issues creating the "wide" project test set, which involves creating thousands of projects.
The issue looked like gitlab#334440 (closed) / gitlab#300974 (closed)
"api_error":["{\"message\":{\"base\":[\"There is already a repository with that name on disk\",\"uncaught throw :abort\"]}}"]
Test environments tend to be subject to wholesale changes that production does not see, such as volume testing followed by a complete restore from backup.
As a result, it's common and expected to find that the highest projects.id value in the Rails database is lower than the project IDs for which repositories already exist in Gitaly.
As there is no "drop all repositories" feature, analogous to the way a restore drops all elements from the Rails database, a full GitLab restore can leave repositories behind that conflict with projects created afterwards.
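To illustrate why a reused projects.id collides with a leftover repository, here is a minimal sketch of how a hashed storage path is derived from the project ID. It assumes the projects use hashed storage, where the relative path is built from the SHA256 of the ID; the function name hashed_repo_path is hypothetical and only for illustration.

```python
import hashlib


def hashed_repo_path(project_id: int) -> str:
    """Return the hashed-storage relative path for a project ID.

    Assumption: hashed storage is in use, so the path is
    @hashed/<first 2 hex chars>/<next 2 hex chars>/<full sha256>.git
    where the digest is the SHA256 of the decimal project ID.
    """
    digest = hashlib.sha256(str(project_id).encode("utf-8")).hexdigest()
    return f"@hashed/{digest[:2]}/{digest[2:4]}/{digest}.git"


if __name__ == "__main__":
    # The same ID always maps to the same path, so a project re-created
    # with an ID that was used before the restore points at the old path.
    for pid in (1, 2, 3):
        print(pid, hashed_repo_path(pid))
```

Because the path depends only on the ID, a repository created before the restore for a given ID sits at exactly the path that a new project with the same ID will try to use.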
Troubleshooting Performed
We advised them to perform a full backup, clean up their Gitaly storage, and then do a restore. This would sync the Gitaly storage with the Rails database (i.e. it ensures that a project with the next projects.id value can be created).
However, this still failed.
What specifically do you need from the Gitaly team
Log analysis suggests that /gitaly.RepositoryService/RepositoryExists does not look on disk, despite Rails returning "There is already a repository with that name on disk". The correlation IDs appear in the Praefect logs but not in the Gitaly logs.
Please can you confirm that this RPC is purely a Praefect database check when running Gitaly Cluster?
This would be a variation of the issues we currently have ensuring that the database and the Praefect nodes are in sync, e.g. &7677
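The log comparison described above can be sketched roughly as follows. This is not the exact procedure we used; it assumes that both Praefect and Gitaly write JSON-structured log lines carrying a correlation_id field, and the sample log lines and the helper name entries_for_correlation_id are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List


def entries_for_correlation_id(
    lines: Iterable[str], correlation_id: str
) -> List[Dict[str, Any]]:
    """Return parsed JSON log entries whose correlation_id matches.

    Assumption: each log line is a JSON object with a "correlation_id"
    key. Lines that are not valid JSON objects are skipped.
    """
    matches = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(entry, dict) and entry.get("correlation_id") == correlation_id:
            matches.append(entry)
    return matches


if __name__ == "__main__":
    # Hypothetical sample lines standing in for the real log files.
    praefect_log = [
        '{"correlation_id": "abc123", "grpc.method": "RepositoryExists"}',
        '{"correlation_id": "zzz999", "grpc.method": "FindCommit"}',
    ]
    gitaly_log = [
        '{"correlation_id": "zzz999", "grpc.method": "FindCommit"}',
        "not a json line",
    ]
    cid = "abc123"
    print("praefect:", len(entries_for_correlation_id(praefect_log, cid)))
    print("gitaly:", len(entries_for_correlation_id(gitaly_log, cid)))
```

A correlation ID that matches entries in the Praefect log but none in the Gitaly log is the pattern we observed for RepositoryExists.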
Author Checklist
- Customer information provided
- Severity realistically set
- Clearly articulated what is needed from the Gitaly team to support your request by filling out the "What specifically do you need from the Gitaly team" section