Chart upgrade to 5.3.6 has caused errors in backups

We were performing an upgrade from chart 5.0.12 to chart 5.3.6 when the backups started failing. Initially we were getting these errors:

{"command":"create","gl_project_path":"<project-namespace>/<project-name>","level":"info","msg":"started create","relative_path":"@hashed/46/a4/<hash>.design.git","storage_name":"default","time":"2022-07-29T13:43:55.774Z"}
{"command":"create","error":"manager: isEmpty: rpc error: code = Unavailable desc = connection closed","gl_project_path":"<project-namespace>/<project-name>","level":"error","msg":"create failed","relative_path":"@hashed/46/a4/<hash>.design.git","storage_name":"default","time":"2022-07-29T13:43:55.774Z"}
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup rake aborted!
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup Backup::Error: gitaly-backup exit status 1
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/lib/backup/gitaly_backup.rb:60:in `finish!'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/lib/backup/repositories.rb:54:in `dump'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/lib/backup/manager.rb:113:in `run_create_task'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/lib/tasks/gitlab/backup.rake:25:in `block (4 levels) in <top (required)>'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/vendor/bundle/ruby/2.7.0/gems/sentry-ruby-core-5.1.1/lib/sentry/rake.rb:26:in `execute'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/vendor/bundle/ruby/2.7.0/gems/rake-13.0.6/exe/rake:27:in `<top (required)>'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/bin/bundle:5:in `load'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup /srv/gitlab/bin/bundle:5:in `<main>'
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup Tasks: TOP => gitlab:backup:repo:create
gitlab-gitlab-toolbox-backup-manual-lng-lc48l toolbox-backup (See full trace by running task with --trace)

There were two of these errors for each repository on our instance: one for .design.git and one for .wiki.

We then ran the gitlab-rake gitlab:backup:repo:create command, which gave us the following warnings for the same design and wiki repositories of each project:

{"command":"create","error":"manager: repository empty: repository skipped","gl_project_path":"<project-namespace>/<project-name>.wiki","level":"warning","msg":"skipped create","relative_path":"@hashed/5b/b8/<hash>.wiki.git","storage_name":"default","time":"2022-07-29T15:22:53.322Z"}
{"command":"create","error":"manager: repository empty: repository skipped","gl_project_path":"<project-namespace>/<project-name>","level":"warning","msg":"skipped create","relative_path":"@hashed/5b/b8/<hash>.design.git","storage_name":"default","time":"2022-07-29T15:22:53.327Z"}
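To check whether only the wiki and design repositories are affected, the JSON lines above can be tallied by log level and repository type. This is a rough sketch, assuming the gitaly-backup output has been saved with one JSON object per line. The function name `summarize_backup_log` and the shortened sample lines are hypothetical and only there for illustration.

```python
import json
from collections import Counter


def summarize_backup_log(lines):
    """Count gitaly-backup log entries by (level, repository kind).

    The repository kind is derived from the suffix of "relative_path".
    Lines that are not valid JSON objects (rake output, stack traces) are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # not a JSON log entry
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed or truncated entry
        path = entry.get("relative_path", "")
        if path.endswith(".wiki.git"):
            kind = "wiki"
        elif path.endswith(".design.git"):
            kind = "design"
        else:
            kind = "project"
        counts[(entry.get("level", "unknown"), kind)] += 1
    return counts


# Shortened sample lines modelled on the output above (placeholders kept as-is).
sample = [
    '{"command":"create","error":"manager: repository empty: repository skipped",'
    '"level":"warning","msg":"skipped create",'
    '"relative_path":"@hashed/5b/b8/<hash>.wiki.git"}',
    '{"command":"create","error":"manager: repository empty: repository skipped",'
    '"level":"warning","msg":"skipped create",'
    '"relative_path":"@hashed/5b/b8/<hash>.design.git"}',
    "rake aborted!",
]

for (level, kind), count in sorted(summarize_backup_log(sample).items()):
    print(level, kind, count)
```

For a real run, the function can be given an open file instead of the sample list, for example `summarize_backup_log(open("backup.log"))`, where `backup.log` is a placeholder for wherever the output was saved.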

The last thing we tried was running Feature.disable(:gitaly_backups), which then caused these errors:

 * <project-namespace>/<project-name>.design (@hashed/5b/b8/<hash>.design) ... 
[Failed] backing up <project-namespace>/<project-name>.design (@hashed/5b/b8/<hash>.design)
Error 14:failed to connect to all addresses. debug_error_string:{"created":"@1659111574.049375641","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3093,"referenced_errors":[{"created":"@1659111574.049374721","description":"failed to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":163,"grpc_status":14}]}

This allowed the backups to finish, but they did not contain any of the project data (all branches and files were missing after a restore).

We then tried to complete the upgrade path to chart 6.1.2, hoping that this was a bug that had since been fixed, but we still get the same errors on every intermediate GitLab version.