Restoring a GitLab backup does not wipe the Git repositories - creating new projects may fail with 'There is already a repository with that name on disk'

This issue is superseded, and there's a more clearly explained workaround. See this thread.


Summary

When a GitLab backup is restored, all known PostgreSQL objects (tables, indexes, keys, and so on) are dropped before the data is loaded.

This doesn't appear to be the case for Git repositories: repositories that are already on disk but are not part of the backup are left in place.

A customer encountered this issue:

  • GitLab backup created
  • One or more projects added to instance
  • GitLab backup restored
  • Creation of new projects failed.
The form contains the following errors:

    There is already a repository with that name on disk
    uncaught throw :abort

GitLab team members can read more in the ticket

Workaround

Keep attempting to create projects.

Each attempt increments the sequence behind the projects table primary key, even though the project is not saved. Once the sequence has passed the highest value it reached before the restore, the new project's ID no longer maps to a repository that Gitaly finds on disk, and project creation succeeds.

Alternatively, advance the sequence directly in the database. Start a SQL console session and run the following once for each project ID that needs to be skipped:

select nextval('projects_id_seq');
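
The reason repeated attempts eventually succeed can be sketched with a small simulation. This is an illustrative model only, not GitLab code: `ProjectIdSequence`, `try_create_project` and the starting value of 153 are hypothetical, and the set of leftover IDs stands in for the repositories that Gitaly still finds on disk.

```ruby
require 'set'

# Hypothetical stand-in for the projects_id_seq sequence: every call to
# nextval consumes an ID, whether or not the project is then saved.
class ProjectIdSequence
  def initialize(last_value)
    @last_value = last_value
  end

  def nextval
    @last_value += 1
  end
end

# Hypothetical model of project creation: it fails when a repository for
# the newly issued ID is already on disk.
def try_create_project(sequence, ids_with_repo_on_disk)
  id = sequence.nextval
  if ids_with_repo_on_disk.include?(id)
    [id, :already_exists]
  else
    ids_with_repo_on_disk << id
    [id, :created]
  end
end

# Assumed scenario based on this report: the restored sequence stands at
# 153, and the repositories for projects 154 and 155 are still on disk.
sequence = ProjectIdSequence.new(153)
leftover = Set[154, 155]

results = 3.times.map { try_create_project(sequence, leftover) }
results.each { |id, outcome| puts "project ID #{id}: #{outcome}" }
```

In this model the first two attempts consume IDs 154 and 155 and fail, and the third attempt receives ID 156 and succeeds, which is the same effect as calling nextval twice by hand.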

Steps to reproduce

  • Back up GitLab.
  • Create some projects.
  • Restore GitLab from the backup.
  • Try to create some projects.

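Hashed storage stores each project's repository on disk in a location based on a hash of the project's ID.

A collision is therefore guaranteed: project IDs are issued sequentially, so after the restore the sequence reissues IDs whose repositories are still on disk.

Any backup and restore scenario where the restore rewinds the project ID sequence (that is, where projects were created after the backup was taken) will trigger this.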
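
As a sketch of why the collision is deterministic: hashed storage uses the SHA256 hex digest of the project ID, with the first two pairs of hex characters as directory levels, which matches the @hashed/45/23/4523... layout seen in this report. The helper name `hashed_disk_path` is made up for this example and is not a GitLab API.

```ruby
require 'digest'

# Illustrative helper (not part of GitLab): build the hashed storage
# path for a project ID, without the trailing ".git".
def hashed_disk_path(project_id)
  digest = Digest::SHA256.hexdigest(project_id.to_s)
  "@hashed/#{digest[0, 2]}/#{digest[2, 2]}/#{digest}"
end

# The same ID always maps to the same path, so an ID reissued after a
# restore points at whatever repository the earlier project left behind.
[154, 155].each do |id|
  puts "#{id} -> #{hashed_disk_path(id)}.git"
end
```

Because the path depends only on the ID, renaming the new project or moving it to a different group does not avoid the collision.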

Example Project

What is the current bug behavior?

The restore process does not ensure that Gitaly ends up in the same state as the backup.

What is the expected correct behavior?

The state of GitLab after the restore is completed should match the contents of the backup.

Relevant logs and/or screenshots

Two projects were created between the backup and the restore (ID, name):

154 project154
155 project155
'this form contains the following errors' - context
logs
==> /var/log/gitlab/gitlab-rails/application_json.log <==
{"severity":"ERROR","time":"2021-06-23T15:31:38.601Z","correlation_id":"01F8WQZ35BW9Z61499B0XKVYKH","message":"Unable to save project. Error: uncaught throw :abort"}
  • gitaly
{
  "correlation_id": "01F8WQZ35BW9Z61499B0XKVYKH",
  "grpc.code": "OK",
  "grpc.meta.auth_version": "v2",
  "grpc.meta.client_name": "gitlab-web",
  "grpc.meta.deadline_type": "regular",
  "grpc.method": "RepositoryExists",
  "grpc.request.deadline": "2021-06-23T15:31:48.005Z",
  "grpc.request.fullMethod": "/gitaly.RepositoryService/RepositoryExists",
  "grpc.request.glProjectPath": "",
  "grpc.request.glRepository": "",
  "grpc.request.repoPath": "@hashed/1d/0e/1d0ebea552eb43d0b1e1561f6de8ae92e3de7f1abec52399244d1caed7dbdfa6.git",
  "grpc.request.repoStorage": "default",
  "grpc.request.topLevelGroup": "@hashed",
  "grpc.service": "gitaly.RepositoryService",
  "grpc.start_time": "2021-06-23T15:31:38.596Z",
  "grpc.time_ms": 0.183,
  "level": "info",
  "msg": "finished unary call with code OK",
  "peer.address": "@",
  "pid": 12458,
  "remote_ip": "192.168.1.68",
  "span.kind": "server",
  "system": "grpc",
  "time": "2021-06-23T15:31:38.596Z",
  "username": "root"
}
{
  "correlation_id": "01F8WQZ35BW9Z61499B0XKVYKH",
  "error": "rpc error: code = NotFound desc = GetRepoPath: not a git repository: \"/var/opt/gitlab/git-data/repositories/test/zd219094-gitalyrestore/should-be-154.git\"",
  "grpc.code": "NotFound",
  "grpc.meta.auth_version": "v2",
  "grpc.meta.client_name": "gitlab-web",
  "grpc.meta.deadline_type": "regular",
  "grpc.method": "FindDefaultBranchName",
  "grpc.request.deadline": "2021-06-23T15:31:48.005Z",
  "grpc.request.fullMethod": "/gitaly.RefService/FindDefaultBranchName",
  "grpc.request.glProjectPath": "test/zd219094-gitalyrestore/should-be-154",
  "grpc.request.glRepository": "project-",
  "grpc.request.repoPath": "test/zd219094-gitalyrestore/should-be-154.git",
  "grpc.request.repoStorage": "default",
  "grpc.request.topLevelGroup": "test",
  "grpc.service": "gitaly.RefService",
  "grpc.start_time": "2021-06-23T15:31:38.865Z",
  "grpc.time_ms": 94.369,
  "level": "info",
  "msg": "finished unary call with code NotFound",
  "peer.address": "@",
  "pid": 12458,
  "remote_ip": "192.168.1.68",
  "span.kind": "server",
  "system": "grpc",
  "time": "2021-06-23T15:31:38.960Z",
  "username": "root"
}
{
  "correlation_id": "01F8WQZ35BW9Z61499B0XKVYKH",
  "grpc.code": "OK",
  "grpc.meta.auth_version": "v2",
  "grpc.meta.client_name": "gitlab-web",
  "grpc.meta.deadline_type": "regular",
  "grpc.method": "RepositoryExists",
  "grpc.request.deadline": "2021-06-23T15:31:49.001Z",
  "grpc.request.fullMethod": "/gitaly.RepositoryService/RepositoryExists",
  "grpc.request.glProjectPath": "test/zd219094-gitalyrestore/should-be-154",
  "grpc.request.glRepository": "project-",
  "grpc.request.repoPath": "test/zd219094-gitalyrestore/should-be-154.git",
  "grpc.request.repoStorage": "default",
  "grpc.request.topLevelGroup": "test",
  "grpc.service": "gitaly.RepositoryService",
  "grpc.start_time": "2021-06-23T15:31:39.032Z",
  "grpc.time_ms": 0.083,
  "level": "info",
  "msg": "finished unary call with code OK",
  "peer.address": "@",
  "pid": 12458,
  "remote_ip": "192.168.1.68",
  "span.kind": "server",
  "system": "grpc",
  "time": "2021-06-23T15:31:39.032Z",
  "username": "root"
}
  • the repos exist on disk
# cd /var/opt/gitlab/git-data/repositories/@hashed/
# ls 1d/0e/1d0ebea552eb43d0b1e1561f6de8ae92e3de7f1abec52399244d1caed7dbdfa6.git/
branches  config  description  HEAD  hooks  info  language-stats.cache  objects  refs
# ls 21/0e/210e3b160c355818509425b9d9e9fd3ea2e287f2c43a13e5be8817140db0b9e6.git/
branches  config  description  HEAD  hooks  info  language-stats.cache  objects  refs

But in the Rails console, the documented procedure for identifying a project from its hashed path finds no project for these two repositories:

  • one that works
irb(main):010:0> ProjectRepository.find_by(disk_path: '@hashed/45/23/4523540f1504cd17100c4835e85b7eefd49911580f8efff0599a8f283be6b9e3').project
=> #<Project id:17 publicgroup/test>>
  • and the two found on disk
irb(main):011:0> ProjectRepository.find_by(disk_path: '@hashed/1d/0e/1d0ebea552eb43d0b1e1561f6de8ae92e3de7f1abec52399244d1caed7dbdfa6').project
Traceback (most recent call last):
        1: from (irb):11
NoMethodError (undefined method `project' for nil:NilClass)
irb(main):012:0> ProjectRepository.find_by(disk_path: '@hashed/21/0e/210e3b160c355818509425b9d9e9fd3ea2e287f2c43a13e5be8817140db0b9e6').project
Traceback (most recent call last):
        2: from (irb):11
        1: from (irb):12:in `rescue in irb_binding'
NoMethodError (undefined method `project' for nil:NilClass)
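
Because these repositories have no ProjectRepository row, the database cannot say which project ID they belonged to. One way to recover it, sketched here on the assumption that the final path component is the SHA256 hex digest of the ID, is to hash candidate IDs and compare. `project_id_for_hashed_path` and its `max_id` search limit are hypothetical and not part of GitLab.

```ruby
require 'digest'

# Illustrative helper (not a GitLab API): find the project ID whose
# SHA256 digest matches the final component of a hashed storage path.
# Returns nil when no ID from 1 to max_id matches.
def project_id_for_hashed_path(disk_path, max_id: 10_000)
  digest = File.basename(disk_path, '.git')
  (1..max_id).find { |id| Digest::SHA256.hexdigest(id.to_s) == digest }
end

# The two repository paths listed on disk in this report.
orphaned_paths = [
  '@hashed/1d/0e/1d0ebea552eb43d0b1e1561f6de8ae92e3de7f1abec52399244d1caed7dbdfa6.git',
  '@hashed/21/0e/210e3b160c355818509425b9d9e9fd3ea2e287f2c43a13e5be8817140db0b9e6.git'
]

orphaned_paths.each do |path|
  id = project_id_for_hashed_path(path)
  label = id ? "project ID #{id}" : 'no match'
  puts "#{path} -> #{label}"
end
```

Any ID this reports that is higher than the largest project ID in the restored database would be one the sequence still has to pass before project creation stops colliding.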

Output of checks

Results of GitLab environment info

13.12.2-ee, self managed.

Possible fixes

Edited by Ben Prescott (ex-GitLab)