Add ability to skip tar creation to backup-utility
Proposal
Currently, the backup-utility does not provide the user a way to skip the tar process when creating or restoring backups.
This feature is desired because the tar process was identified as a cause of pods being evicted during the backup process, with the following eviction message:
Message: The node was low on resource: memory
This scenario was observed for a user with a very large GitLab instance.
Previous spike issue
gitlab-org/distribution/team-tasks#1182 (comment 1333407160)
Technical difficulties
We have 2 separate tools for the same job
Right now we have two backup/restore tools:
- backup-utility: used by the GitLab chart to trigger scheduled backups via cron jobs, or manual backups via the toolbox pod.
- GitLab rake tasks: used by Omnibus/self-compiled instances to trigger backups.
Each tool has its own code for backup and restore. For example, each handles tar with a different code snippet, so some changes might require duplicated work to implement in both places.
In some cases, the backup-utility triggers the existing rake tasks, for instance when it performs the database and repository backups. Ideally, we'd like the backup-utility to delegate all its features to the GitLab rake tasks. The work to implement this code and feature-parity unification is being tracked at: gitlab-org/charts/gitlab#1127
Untarred files are currently not identified
Simply supporting SKIP=tar does not work, because the untarred version of the backup is not identified with a backup ID. When pushing it to object storage, we would overwrite the previous backup, which is not desired.
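To make the overwrite problem concrete, here is a minimal sketch that uses a plain dictionary as a stand-in for an object storage bucket. The function name, the file names, and the backup ID values are all hypothetical and chosen only for illustration; this is not how the backup-utility or the rake tasks are implemented.

```python
# A dict stands in for an object storage bucket: object key -> content.
# Everything here (names, IDs, file layout) is a hypothetical illustration.

def upload_untarred(bucket, files, backup_id=None):
    """Store each file under its relative path, optionally prefixed by a backup ID."""
    for relative_path, content in files.items():
        key = f"{backup_id}/{relative_path}" if backup_id else relative_path
        bucket[key] = content


first_backup = {"db/database.sql.gz": b"first-db", "uploads/avatar.png": b"first-upload"}
second_backup = {"db/database.sql.gz": b"second-db", "uploads/avatar.png": b"second-upload"}

# Without a backup ID, the second upload reuses the same keys and
# silently replaces the first backup.
flat = {}
upload_untarred(flat, first_backup)
upload_untarred(flat, second_backup)

# With a backup ID as the key prefix, both backups coexist in the bucket.
named = {}
upload_untarred(named, first_backup, backup_id="1700000000_example")
upload_untarred(named, second_backup, backup_id="1700086400_example")
```

In the first case the bucket ends up holding only the second backup's files, which is the behavior we want to avoid.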
Ideas
- Create untarred backups in named directories (#362981) aims to solve the unidentifiable untarred backup name. This is a good starting point, as it would make the top-level untarred folder identifiable and also storable in object storage. Still, further work is needed to support pushing and downloading the folder directly. Also, the issue does not cover skipping tar for all of the GitLab components (artifacts, uploads, builds, etc.), which we should also look into. Finally, this work will have to be done for both the backup-utility and the GitLab rake tasks, unless we unify the code logic and feature parity of the backup-utility and the rake tasks.
- Alternatively, instead of duplicating the work, we could just support using the rake task for users who don't need object storage. Still, the backup-utility will have to support storing the backups somewhere, and that can't be a pod. So this may require a dedicated persistent volume for this purpose.
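As a rough sketch of what pushing a named folder directly could involve, the snippet below walks a local untarred backup directory and maps every file to an object key prefixed with the directory name. It assumes the directory name doubles as the backup ID; `plan_directory_upload` and the sample layout are hypothetical, and the actual upload call is deliberately left out because it depends on the storage client in use.

```python
import os
import tempfile


def plan_directory_upload(backup_dir):
    """Return (local_path, object_key) pairs for every file under backup_dir.

    Assumption: the directory name is the backup ID, so it is used as the
    key prefix. Pairs are sorted by object key for a stable order.
    """
    backup_id = os.path.basename(os.path.normpath(backup_dir))
    plan = []
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, backup_dir)
            # Object keys use "/" regardless of the local path separator.
            key = "/".join([backup_id] + relative.split(os.sep))
            plan.append((local_path, key))
    return sorted(plan, key=lambda pair: pair[1])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical untarred backup layout, for illustration only.
        backup_dir = os.path.join(tmp, "1700000000_example")
        os.makedirs(os.path.join(backup_dir, "db"))
        with open(os.path.join(backup_dir, "db", "database.sql.gz"), "wb") as fh:
            fh.write(b"dump")
        for _local, key in plan_directory_upload(backup_dir):
            print(key)
```

Uploading file by file in this way avoids building a single archive, at the cost of more requests to the object storage. Restoring would need the reverse mapping: list all keys under the backup ID prefix and recreate the directory tree locally.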