Incremental backups

fredericve requested to merge fredericve/gitlab-ce:incremental-backups into master

What does this MR do?

This MR makes it possible to use the --rsyncable option of gzip for the backup files generated by the gitlab-rake tasks, via the RSYNCABLE=yes flag. From the gzip man page:

       --rsyncable
              While compressing, synchronize the output occasionally based on the input.  This increases size  by
              less than 1 percent most cases, but means that the rsync(1) program can take advantage of similarities
              in the uncompressed input when synchronizing two files compressed with this flag.  gunzip cannot
              tell the difference between a compressed file created with this option, and one created without
              it.
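
The man page's claim that gunzip cannot tell the two kinds of file apart can be checked locally. The following is only a sketch, not part of this MR: the `seq` output stands in for real backup data, the `workdir` and `roundtrip` names are made up for illustration, and because some gzip builds do not accept `--rsyncable`, the script probes for it first.

```shell
# Sample input; seq output is only a stand-in for real backup data.
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/input.txt"

# Not every gzip build accepts --rsyncable, so probe for it first.
if gzip --rsyncable -c "$workdir/input.txt" > "$workdir/rsyncable.gz" 2>/dev/null; then
    gzip -c "$workdir/input.txt" > "$workdir/plain.gz"
    # Both archives must decompress to the same bytes as the input.
    if gzip -dc "$workdir/rsyncable.gz" | cmp -s - "$workdir/input.txt" &&
       gzip -dc "$workdir/plain.gz" | cmp -s - "$workdir/input.txt"; then
        roundtrip=identical
    else
        roundtrip=different
    fi
    # Print both sizes so the overhead of --rsyncable can be compared.
    echo "plain:     $(wc -c < "$workdir/plain.gz") bytes"
    echo "rsyncable: $(wc -c < "$workdir/rsyncable.gz") bytes"
else
    roundtrip=unsupported
    echo "this gzip build does not support --rsyncable"
fi
echo "round trip: $roundtrip"
rm -rf "$workdir"
```

The size overhead depends on the input, so the two byte counts are printed for comparison rather than asserted.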

On top of that, the MR reuses the existing BACKUP environment variable to control the output filename of the backup. Example:

$ sudo gitlab-rake gitlab:backup:create BACKUP=dump STRATEGY=copy SKIP="repositories,artifacts,builds"
Dumping database ... 
Dumping PostgreSQL database gitlabhq_production ... [DONE]
done
Dumping repositories ...
[SKIPPED]
Dumping uploads ... 
done
Dumping builds ... 
[SKIPPED]
Dumping artifacts ... 
[SKIPPED]
Dumping pages ... 
done
Dumping lfs objects ... 
done
Dumping container registry images ... 
[DISABLED]
Creating backup archive: dump_gitlab_backup.tar ... done
Uploading backup archive to remote storage  ... skipped
Deleting tmp directories ... done
done
done
done
done
Deleting old backups ... done. (0 removed)

The use case is to significantly reduce the time and bandwidth needed to transfer backups with rsync. With the old code, transferring a [TIMESTAMP]_gitlab_backup.tar file of about 18 GB over the WAN takes quite some time and uses the full 18 GB of bandwidth. With this change, the backup keeps a fixed filename and its compressed contents stay rsync-friendly, so rsync only needs to transfer the parts that differ from the copy already on the receiving side. Example of the file transfer speedup:

$ rsync -av --progress root@git:/var/opt/gitlab/backups/dump_gitlab_backup.tar ./
receiving incremental file list
dump_gitlab_backup.tar
 18,651,648,000 100%   92.64MB/s    0:03:12 (xfr#1, to-chk=0/1)

sent 1,137,951 bytes  received 220,264,263 bytes  866,544.87 bytes/sec
total size is 18,651,648,000  speedup is 84.24
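
The speedup figure in the rsync summary is the total file size divided by the bytes actually sent and received. A small sketch that reproduces the arithmetic from the numbers above (the variable names are illustrative only):

```shell
# Figures copied from the rsync summary above.
total_size=18651648000
sent=1137951
received=220264263

# Bytes that actually crossed the network.
transferred=$((sent + received))

# rsync reports speedup as total size / (sent + received).
speedup=$(awk -v t="$total_size" -v x="$transferred" 'BEGIN { printf "%.2f", t / x }')

echo "bytes on the wire: $transferred"
echo "speedup: $speedup"
```

About 221 MB crossed the wire for an 18.65 GB file, which matches the reported speedup of 84.24.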

What are the relevant issue numbers?

#28074 (moved)

Does this MR meet the acceptance criteria?

Edited by Stan Hu
