gitlab-backup does not restore repositories if BACKUP env variable differs from value used in gitlab-backup create
Summary
8e15b27e introduced changes to allow restoring specific server-side backups and, as a side effect, made gitlab-backup restore unable to restore repositories if BACKUP=<value> is used.
Steps to reproduce
- Run gitlab-backup create.
- Run gitlab-backup restore BACKUP=latest.
What is the current bug behavior?
gitaly-backup skips the repositories because it looks for .toml files whose names match the backup ID (specified in BACKUP). For example, in the backup archive we might expect to see files such as:
# tar tvf 1704402450_2024_01_04_16.7.0-ee_gitlab_backup.tar | grep toml
-rw------- git/git 509 2024-01-04 21:07 repositories/manifests/default/@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.wiki.git/1704402450_2024_01_04_16.7.0-ee.toml
-rw------- git/git 494 2024-01-04 21:07 repositories/manifests/default/@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git/1704402450_2024_01_04_16.7.0-ee.toml
However, instead of the timestamp + version combination in 1704402450_2024_01_04_16.7.0-ee, gitaly-backup expects to see latest, so it skips over the repositories.
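The mismatch can be illustrated with a minimal Ruby sketch. This is not gitaly-backup's actual lookup code: the manifest_path helper is hypothetical, and the path layout is simply copied from the tar listing above.

```ruby
# Hypothetical helper (not part of GitLab or Gitaly): builds a manifest path
# using the layout seen in the tar listing above.
def manifest_path(storage, relative_path, backup_id)
  File.join("repositories", "manifests", storage, relative_path, "#{backup_id}.toml")
end

# Manifest present in the archive, copied from the tar listing.
archived = [
  "repositories/manifests/default/@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git/1704402450_2024_01_04_16.7.0-ee.toml"
]

relative_path = "@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git"

# Path derived from the ID that was used when the backup was created.
created_with = manifest_path("default", relative_path, "1704402450_2024_01_04_16.7.0-ee")
# Path derived from the ID passed at restore time (BACKUP=latest).
requested = manifest_path("default", relative_path, "latest")

puts archived.include?(created_with) # true
puts archived.include?(requested)    # false
```

With BACKUP=latest the derived file name is latest.toml, which is absent from the archive, and that matches the skipped restore behaviour described here.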
What is the expected correct behavior?
When restoring backups, repositories are restored.
Relevant logs and/or screenshots
When restoring repositories of the backup, for every repository a message containing started restore is followed by a corresponding skipped restore warning.
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring","level":"info","msg":"started restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git","storage_name":"default","time":"2023-12-21T13:39:43.313Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring.wiki","level":"info","msg":"started restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.wiki.git","storage_name":"default","time":"2023-12-21T13:39:43.317Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring","level":"warning","msg":"skipped restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git","storage_name":"default","time":"2023-12-21T13:39:43.321Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring.design","level":"info","msg":"started restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.design.git","storage_name":"default","time":"2023-12-21T13:39:43.324Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring.wiki","level":"warning","msg":"skipped restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.wiki.git","storage_name":"default","time":"2023-12-21T13:39:43.328Z"}
This is to be expected, as inspecting the arguments passed to gitaly-backup shows -id and the date-generated identifier, which does not exist in the original backup.
Possible fixes
The issue boils down to this line always containing a valid backup_id, even when gitlab-backup was called without the BACKUP environment variable.
- manager.rb:run_restore_task
- manager.rb:backup_id
- repositories.rb:restore
- gitaly_backup.rb:start
- gitaly_backup.rb:gitaly_backup_args
A potential fix would be to not make up a default backup_id when restoring a backup, but I am not sure whether it would have undesired impacts on other parts of the code.
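A rough sketch of that idea, in Ruby. The method names and signatures below are hypothetical and do not mirror the real manager.rb or gitaly_backup.rb; the sketch only assumes that the ID reaches gitaly-backup through the -id argument mentioned above. It covers the case where BACKUP is not set, and does not address an explicit BACKUP value that differs from the one used at creation time.

```ruby
# Hypothetical sketch; not the actual GitLab implementation.
# task:         :create or :restore
# env:          a Hash standing in for ENV
# generated_id: the timestamp + version ID that would otherwise be made up
def backup_id_for(task, env, generated_id)
  explicit = env["BACKUP"]
  # An explicitly requested ID always wins.
  return explicit unless explicit.nil? || explicit.empty?

  # Only invent an ID when creating a backup. On restore, return nil so the
  # caller can leave the ID out instead of passing one that never existed.
  task == :create ? generated_id : nil
end

# Hypothetical argument builder: omit -id entirely when there is no ID.
def gitaly_backup_args(backup_id)
  backup_id ? ["-id", backup_id] : []
end

p gitaly_backup_args(backup_id_for(:restore, {}, "1704402450_2024_01_04_16.7.0-ee"))
p gitaly_backup_args(backup_id_for(:create, {}, "1704402450_2024_01_04_16.7.0-ee"))
```

Whether gitaly-backup behaves correctly when -id is omitted on restore would still need to be verified against the other callers listed above.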