Question: Behaviour of /gitaly.RepositoryService/RepositoryExists RPC in Gitaly cluster

Support Request for the Gitaly Team

The goal is to keep these requests public. However, if customer information is required for the support request, please be sure to mark this issue as confidential.

This request template is part of the Gitaly Team's intake process.

Customer Information

https://gitlab.my.salesforce.com/00161000003RIHPAA4

https://gitlab.zendesk.com/agent/tickets/294970

Installation Size:

Architecture Information:

Slack Channel: https://app.slack.com/client/T02592416/C03JHTSTYKE/thread/CCBJYEWAW-1657634248.263949

Additional Information:

Support Request

Severity

severity3

Problem Description

The customer is exercising their test environment using GPT (GitLab Performance Tool). They ran into issues creating the "wide" project test set, which involves creating thousands of projects.

The issue looked like gitlab#334440 (closed) / gitlab#300974 (closed)

"api_error":["{\"message\":{\"base\":[\"There is already a repository with that name on disk\",\"uncaught throw :abort\"]}}"]

Test environments tend to be subject to wholesale changes that production does not see, such as volume testing followed by a complete restore from backup.

As a result, it's common and expected to find that the next projects.id value in Rails is lower than the IDs of projects whose repositories were already created in Gitaly.

Because there is no "drop all repositories" feature analogous to the way a restore drops all elements from the Rails database, a full GitLab restore can leave repositories in place that the restored database does not know about.
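
To illustrate why a reused projects.id can collide with an existing repository: assuming hashed storage (the default repository layout in current GitLab versions), Rails derives a repository's relative path from the SHA-256 of the project ID. A minimal sketch in Python follows; the function name is hypothetical and this is not GitLab's own code.

```python
import hashlib


def hashed_relative_path(project_id: int) -> str:
    """Relative path under hashed storage: @hashed/<aa>/<bb>/<sha256>.git."""
    digest = hashlib.sha256(str(project_id).encode("utf-8")).hexdigest()
    return f"@hashed/{digest[:2]}/{digest[2:4]}/{digest}.git"


# The same project ID always maps to the same relative path, so a project ID
# that is handed out again after a restore points at whatever repository was
# previously registered under that path.
print(hashed_relative_path(1))
```

Under this assumption, any project ID that is allocated a second time after a restore resolves to a path that the storage layer may already consider taken.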

Troubleshooting Performed

We advised them to perform a full backup, clean up their Gitaly storage, and then do a restore. This would sync the Gitaly storage with the Rails database (i.e. it would ensure that the project with the next projects.id value can be created).

However, this still failed.

What specifically do you need from the Gitaly team

Log analysis suggests that /gitaly.RepositoryService/RepositoryExists does not look on disk, despite Rails returning "There is already a repository with that name on disk". The correlation IDs appear in the Praefect logs but not in the Gitaly logs.
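
The log comparison described above can be sketched roughly as follows. This assumes JSON-formatted Praefect and Gitaly logs that carry `correlation_id` and `grpc.method` fields; the function names and the sample lines are made up for illustration and are not taken from the customer's logs.

```python
import json
from typing import Iterable, Set

# Assumed value of the grpc.method field for this RPC.
METHOD = "RepositoryExists"


def correlation_ids(lines: Iterable[str], method: str = METHOD) -> Set[str]:
    """Collect correlation IDs of log entries for the given gRPC method."""
    ids: Set[str] = set()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        if entry.get("grpc.method") == method and entry.get("correlation_id"):
            ids.add(entry["correlation_id"])
    return ids


def praefect_only(praefect_lines: Iterable[str], gitaly_lines: Iterable[str]) -> Set[str]:
    """Correlation IDs seen for the method in Praefect logs but not in Gitaly logs."""
    return correlation_ids(praefect_lines) - correlation_ids(gitaly_lines)


# Illustrative, made-up log lines.
praefect_sample = [
    '{"correlation_id": "abc123", "grpc.method": "RepositoryExists"}',
    '{"correlation_id": "def456", "grpc.method": "CreateRepository"}',
]
gitaly_sample = [
    '{"correlation_id": "def456", "grpc.method": "CreateRepository"}',
]

print(sorted(praefect_only(praefect_sample, gitaly_sample)))
```

A non-empty result would be consistent with Praefect answering the call itself rather than forwarding it to a Gitaly node, which is the behaviour we are asking the Gitaly team to confirm.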

Can you please confirm that, when running Gitaly Cluster, this RPC is purely a Praefect database check?

This would be a variation of the issues we currently have ensuring the database and Praefect nodes are in sync, e.g. &7677.

Author Checklist

  • Customer information provided
  • Severity realistically set
  • Clearly articulated what is needed from the Gitaly team to support your request by filling out the "What specifically do you need from the Gitaly team" section

/cc @mjwood @andrashorvath