gitlab-backup does not restore repositories if BACKUP env variable differs from value used in gitlab-backup create

Summary

Commit 8e15b27e introduced changes to allow restoring specific server-side backups. As a side effect, gitlab-backup restore can no longer restore repositories when BACKUP=<value> is used and the value differs from the backup ID used by gitlab-backup create.

Steps to reproduce

  1. Run gitlab-backup create.
  2. Run gitlab-backup restore BACKUP=latest.

What is the current bug behavior?

gitaly-backup skips the repositories because it looks for .toml manifest files whose names match the backup ID (specified in BACKUP). For example, a backup archive created by gitlab-backup create might contain files such as:

# tar tvf 1704402450_2024_01_04_16.7.0-ee_gitlab_backup.tar | grep toml
-rw------- git/git         509 2024-01-04 21:07 repositories/manifests/default/@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.wiki.git/1704402450_2024_01_04_16.7.0-ee.toml
-rw------- git/git         494 2024-01-04 21:07 repositories/manifests/default/@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git/1704402450_2024_01_04_16.7.0-ee.toml

However, instead of the timestamp and version combination in 1704402450_2024_01_04_16.7.0-ee, gitaly-backup expects the manifest to be named after latest, so it skips over the repositories.
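The mismatch can be illustrated with a minimal Ruby sketch. It is not gitaly-backup's actual code: `manifest_path` and `restorable?` are hypothetical helpers that only mimic a lookup by backup ID, assuming the manifest layout shown in the tar listing above.

```ruby
# Hypothetical model of a lookup by backup ID; not gitaly-backup's real code.

# Manifest paths copied from the tar listing above.
REPO = "@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b"
MANIFESTS = [
  "repositories/manifests/default/#{REPO}.wiki.git/1704402450_2024_01_04_16.7.0-ee.toml",
  "repositories/manifests/default/#{REPO}.git/1704402450_2024_01_04_16.7.0-ee.toml"
].freeze

# Builds the manifest path that a given backup ID would have to match.
def manifest_path(storage, relative_path, backup_id)
  File.join("repositories", "manifests", storage, relative_path, "#{backup_id}.toml")
end

# In this model, a repository is restorable only if a manifest named after the ID exists.
def restorable?(manifests, storage, relative_path, backup_id)
  manifests.include?(manifest_path(storage, relative_path, backup_id))
end

puts restorable?(MANIFESTS, "default", "#{REPO}.git", "1704402450_2024_01_04_16.7.0-ee") # true
puts restorable?(MANIFESTS, "default", "#{REPO}.git", "latest")                          # false
```

With the ID from the archive the manifest is found; with latest there is no matching file name, which corresponds to the skipped restore.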

What is the expected correct behavior?

When restoring backups, repositories are restored.

Relevant logs and/or screenshots

During the restore, each repository logs a started restore message that is followed by a corresponding skipped restore warning.

{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring","level":"info","msg":"started restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git","storage_name":"default","time":"2023-12-21T13:39:43.313Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring.wiki","level":"info","msg":"started restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.wiki.git","storage_name":"default","time":"2023-12-21T13:39:43.317Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring","level":"warning","msg":"skipped restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git","storage_name":"default","time":"2023-12-21T13:39:43.321Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring.design","level":"info","msg":"started restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.design.git","storage_name":"default","time":"2023-12-21T13:39:43.324Z"}
{"command":"restore","gl_project_path":"gitlab-instance-e6b337f4/Monitoring.wiki","level":"warning","msg":"skipped restore","pid":1042,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.wiki.git","storage_name":"default","time":"2023-12-21T13:39:43.328Z"}

This is expected: the arguments passed to gitaly-backup include -id followed by the date-generated identifier, which does not exist in the original backup.
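To check a restore log for this symptom, a small script along the following lines could list the affected repositories. It is a sketch that assumes the JSON-per-line format shown above; `parse_entry` and `skipped_repositories` are hypothetical helper names, and the sample entries are two lines from the log above trimmed to the fields the script reads.

```ruby
require "json"
require "set"

# Parses one JSON log line; returns nil for anything that is not valid JSON.
def parse_entry(line)
  JSON.parse(line)
rescue JSON::ParserError
  nil
end

# Returns the relative paths of repositories whose restore was started and then skipped.
def skipped_repositories(log_lines)
  started = Set.new
  skipped = []
  log_lines.each do |line|
    entry = parse_entry(line)
    next unless entry.is_a?(Hash) && entry["command"] == "restore"

    path = entry["relative_path"]
    case entry["msg"]
    when "started restore"
      started << path
    when "skipped restore"
      skipped << path if started.include?(path)
    end
  end
  skipped
end

# Two entries from the log above, trimmed to the fields the script reads.
SAMPLE_LOG = <<~LOG
  {"command":"restore","level":"info","msg":"started restore","relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git"}
  {"command":"restore","level":"warning","msg":"skipped restore","relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git"}
LOG

puts skipped_repositories(SAMPLE_LOG.lines)
```

An empty result would mean no repository was skipped; any path printed here was started but not restored.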

Possible fixes

The issue boils down to this line always containing a valid backup_id, even when gitlab-backup was called without the BACKUP environment variable. The relevant code path:

  1. manager.rb:run_restore_task
  2. manager.rb:backup_id
  3. repositories.rb:restore
  4. gitaly_backup.rb:start
  5. gitaly_backup.rb:gitaly_backup_args

A potential fix would be to stop generating a default backup_id when restoring a backup, but I am not sure whether that would have undesired side effects on other parts of the code.
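The idea could look roughly like the following sketch. It is not the code in manager.rb or gitaly_backup.rb: `resolve_backup_id` and `id_args` are hypothetical names, and the sketch assumes that gitaly-backup falls back to the most recent backup when -id is omitted. It only covers the case described here, where BACKUP is not set; an explicit BACKUP value is still passed through unchanged.

```ruby
# Hypothetical sketch; method names and parameters are illustrative, not GitLab's code.

# Picks the backup ID to hand to the repository create or restore step.
def resolve_backup_id(task:, env_backup:, generated_id:)
  # An explicit BACKUP value always wins.
  return env_backup unless env_backup.nil? || env_backup.empty?
  # Only backup creation needs a freshly generated ID.
  return generated_id if task == :create

  nil # restore without BACKUP: do not invent an ID
end

# Builds the ID-related arguments; -id is omitted when no ID is known.
def id_args(backup_id)
  backup_id ? ["-id", backup_id] : []
end

generated = "1704402450_2024_01_04_16.7.0-ee"
p id_args(resolve_backup_id(task: :create, env_backup: nil, generated_id: generated))
p id_args(resolve_backup_id(task: :restore, env_backup: nil, generated_id: generated))
```

Keeping the decision in one place makes it easier to check which callers still rely on a generated ID during restore, which is the open question raised above.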

Edited by Stan Hu