Incremental repository backups using backup.rake
backup.rake is the primary way of creating backups for GitLab. Repository backups are created by shelling out to gitaly-backup. gitaly-backup has had preliminary support for incremental backups for a while now (documented at https://gitlab.com/gitlab-org/gitaly/-/blob/master/doc/gitaly-backup.md), but it has yet to be integrated into backup.rake. backup.rake does not yet support incremental backups in general (#19256).
There are a few problems to solve to make this integration work:
- Incremental repository backups must have persistent storage. You cannot safely create a differential Git backup using time alone, since commit timestamps can be set freely. So we need to know which objects were already backed up before we can create the new backup.

  By default, when backup.rake runs, all the component backups are dumped to a local filesystem staging area and then archived together in one large tar file (then optionally uploaded to object storage). Accessing a tar file to find the previous backup will likely be too slow, so there needs to be some persistent storage where all previous incremental backups are stored. This might be as simple as requiring SKIP=tar. It will have to work with other component backups too. Another option is to point gitaly-backup to its own persistent storage or cloud storage.

- There needs to be a way to differentiate a full backup run from an incremental run.

  Each run of backup.rake generates its own backup ID. These IDs end up encoded in the name of the tar file so that a specific backup can be restored. Repository backups would need to know which backup ID to create an incremental backup from. Ideally, for an incremental backup run, the ID would be the same as the previous run; the previous backup would then be found and we could create the new differential backup.
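
To make the two problems above more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how gitaly-backup or backup.rake are implemented: the function names, the shape of the ref manifest (a mapping from ref name to object ID), and the way the backup ID is passed in are all hypothetical. The first function selects what to back up by comparing object IDs recorded by the previous backup rather than commit times; the second reuses the previous backup ID for an incremental run.

```python
from typing import Optional


def incremental_rev_args(previous_refs: dict[str, str],
                         current_refs: dict[str, str]) -> list[str]:
    """Build revision arguments covering only what the previous backup lacks.

    Both arguments map ref names to the object IDs they pointed at
    (a hypothetical manifest format). Refs whose tip is unchanged are
    skipped; every previously saved tip is negated (^oid) so that a
    revision walk stops at objects that were already backed up.
    Commit timestamps are never consulted.
    """
    include = sorted(
        ref for ref, oid in current_refs.items()
        if previous_refs.get(ref) != oid
    )
    exclude = sorted(f"^{oid}" for oid in set(previous_refs.values()))
    return include + exclude


def choose_backup_id(incremental: bool,
                     previous_backup_id: Optional[str],
                     new_backup_id: str) -> str:
    """Pick the backup ID for this run.

    A full run uses a freshly generated ID. An incremental run reuses the
    previous ID so that the earlier backup can be located in persistent
    storage; it fails if there is no previous backup to build on.
    """
    if not incremental:
        return new_backup_id
    if previous_backup_id is None:
        raise ValueError("incremental run requires a previous backup ID")
    return previous_backup_id
```

The returned arguments have the form accepted by revision-walking Git commands such as git rev-list or git bundle create. Edge cases, such as a previously saved object that no longer exists in the repository, are deliberately left out of this sketch.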