Server-side backup questions and how to restore
Support Request for the Gitaly Team
Customer Information
Zendesk Ticket: internal link
Architecture Information: The customer is currently building a new instance following the 5k Cloud Native Hybrid reference architecture
Support Request
Problem Description
The customer has a couple of questions about the new server-side backup:
- Does this work on a Gitaly Cluster? The documentation suggests it does.
- Does it need separate S3 buckets per cluster node?
- Does the server-side backup store everything three times, or is the data stored only once?
- Is the server-side backup lighter on resources? The customer is worried that, on a large GitLab instance, running the backup from the toolbox pod might lead to issues.
- The restore process seems to be the same as for any regular backup. Does the server-side backup create one big archive file like the regular backup, or does it skip the tar step? (If the data is streamed from each Gitaly node to S3, I don't see how it could be merged into one big tar.gz file.)
- How does a restore of the Gitaly Cluster work? Is restoring a server-side backup different from restoring a regular backup? Quoting the customer's question here:
  So if I `gitlab-backup restore` from a server-side backup on the VM. What exactly happens? I’d assume it downloads all the assets to the local disk (which then requires an answer on how many times we need to add the git data to know how big the disk on the node needs to be) and will then distribute everything? Or is the restore of a server side backup also triggered server side, so the server which uploaded data directly downloads the data?
- I have also seen no mention of how to restore a Praefect node. I guess the procedure would be exactly the same as restoring a PostgreSQL backup, but I think we need to be very careful about the time at which the Gitaly data backup was taken relative to the restore point of the PostgreSQL database. What happens if I restore the Gitaly Cluster data from, say, 2023-11-15 08:00 UTC and the Praefect database from 2023-11-15 08:30 UTC? Will this just work, or will Praefect be broken?
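To make the timing concern in the last question concrete, here is a minimal sketch that computes the gap between the two restore points from the customer's example. The function and variable names are hypothetical and are not part of any GitLab, Gitaly, or Praefect tooling; whether a non-zero gap is safe at all is exactly what we are asking.

```python
from datetime import datetime, timedelta, timezone

# Timestamps taken from the customer's example (UTC).
gitaly_backup_at = datetime(2023, 11, 15, 8, 0, tzinfo=timezone.utc)
praefect_db_restore_point = datetime(2023, 11, 15, 8, 30, tzinfo=timezone.utc)


def restore_skew(repo_backup_at: datetime, db_restore_point: datetime) -> timedelta:
    """Return the absolute time gap between the two restore points.

    Hypothetical helper for illustration only.
    """
    return abs(db_restore_point - repo_backup_at)


skew = restore_skew(gitaly_backup_at, praefect_db_restore_point)
# Any repository change recorded in the Praefect database during this window
# would not be present in the restored Gitaly data (assumption to confirm).
print(f"Gap between Gitaly data and Praefect database restore points: {skew}")
```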
What specifically do you need from the Gitaly team
If possible, answers to the above questions.
Author Checklist
- Customer information provided
- Severity realistically set
- Clearly articulated what is needed from the Gitaly team to support your request by filling out the What specifically do you need from the Gitaly team section
/cc @mjwood @andrashorvath @jcaigitlab @john.mcdonnell @gerardo