Gitaly Cluster: Outdated storages not getting synchronized
Support Request for the Gitaly Team
The goal is to keep these requests public. However, if customer information is required for the support request, please be sure to mark this issue as confidential.
This request template is part of Gitaly Team's intake process.
Customer Information
Salesforce Link: https://gitlab.my.salesforce.com/0014M00001gAMkF?srPos=0&srKp=006
Zendesk Ticket:
Installation Size:
Architecture Information:
Slack Channel: Thread in Gitaly channel
Additional Information:
Support Request
Severity
severity2, as the customer instance is not down, but these repos are unavailable and blocking work.
Problem Description
After clearing up some NTP-related issues, we are seeing ~24 repos reported as 'missing valid primary' by the dataloss command and praefect check. Could this be because they didn't stop the service while taking the backup they restored from?
Troubleshooting Performed
Standard troubleshooting documentation available online and log spelunking. Nothing outside the box yet.
What specifically do you need from the Gitaly team
Asks are:
- Is the theory around backup/restore reasonable?
- Should we expect it to rectify on its own, or are there commands we can run to force reconciliation?
- Are there additional checks/logs we can look into outside the standard debug process?
Author Checklist
- Customer information provided
- Severity realistically set
- Clearly articulated what is needed from the Gitaly team to support your request by filling out the "What specifically do you need from the Gitaly team" section
/cc @mjwood @andrashorvath @jcaigitlab @john.mcdonnell @gerardo