Clarification on the correct way to reset Gitaly Cluster on a Geo Secondary (or generally)

Support Request for the Gitaly Team

The goal is to keep these requests public. However, if customer information is required for the support request, please be sure to mark this issue as confidential.

This request template is part of Gitaly Team's intake process.

Customer Information

Salesforce Link:

Zendesk Ticket: https://gitlab.zendesk.com/agent/tickets/486509

Installation Size: ~ 3000 seats

Architecture Information:

Primary in US, secondary in EU

PostgreSQL replicated via RDS

Object storage replicated via provider

1 Praefect instance and 5 Gitaly nodes in both locations

Slack Channel:

Additional Information:

Support Request

Severity

severity3

Problem Description

The customer performed a restore on the primary in their sandbox environment, and this has broken Geo replication for repositories.

The suspected cause is that Gitaly on the secondary was not reset or cleared down in any way.

Troubleshooting Performed

n/a

What specifically do you need from the Gitaly team

At the moment we are seeking:

  • A way to call the RemoveAll RPC on the secondary
  • The correct method to reset Gitaly Cluster regardless of Geo status (by hand or using the RPC above)
    • The procedure for resetting Geo secondary site replication does not appear to account for Gitaly Cluster
    • Customers often set up environments to experiment with and then want to reset them without rebuilding nodes or doing a restore, so having this documented somewhere would be useful

There is a related issue, gitlab#355535, but the Slack thread is gone and I can't see precise details in the referenced issue(s) on what was done.

Author Checklist

  • Customer information provided
  • Severity realistically set
  • Clearly articulated what is needed from the Gitaly team to support your request by filling out the "What specifically do you need from the Gitaly team" section

/cc @mjwood @andrashorvath @jcaigitlab @john.mcdonnell @gerardo