Some projects cannot be removed after upgrading to GitLab 14.6

Summary

After upgrading to GitLab 14.6, some projects cannot be removed. Project deletion sometimes succeeds and sometimes fails. When deletion fails, the project remains in a "No repository" state and cannot be removed.

Steps to reproduce

This issue is not easy to reproduce. It is not reproducible on GitLab.com.

I can reproduce it on a fresh Omnibus installation by performing the following steps programmatically (via the API, as root):

  1. Install GitLab 14.6 from scratch in a Docker container. Use the following non-default settings (not sure whether they matter):
    • sidekiq['concurrency'] = 2
    • puma['worker_processes'] = 2
    • prometheus_monitoring['enable'] = false
    • grafana['enable'] = false
    • postgresql['shared_buffers'] = '1MB'
  2. Create an empty project with a random name.
  3. Make a single commit to master (master is the default branch name in this installation):
    • file_path='test/file/path1'
    • content='test_content1'
    • commit_message='[ci skip] Initial commit.'
  4. Create a branch from master.
  5. Wait 10 seconds.
  6. Initiate project deletion via API.
  7. Wait 10 seconds.
  8. Enumerate all available projects via the API. For some reason, one project is still returned in the response.
  9. This remaining project is in the "No repository" state. Any attempt to access its commits via the API ends with an Internal Server Error.
{
    'id': 2, 'description': None, 'name': 'dafd2f908dfc476997ba62e7933eb47f', 'name_with_namespace': 'Administrator / dafd2f908dfc476997ba62e7933eb47f',
    'path': 'dafd2f908dfc476997ba62e7933eb47f', 'path_with_namespace': 'root/dafd2f908dfc476997ba62e7933eb47f',
    'created_at': '2021-12-30T10:31:29.247Z',
    'default_branch': 'master', 'tag_list': [], 'topics': [], 'ssh_url_to_repo': 'git@localhost:root/dafd2f908dfc476997ba62e7933eb47f.git',
    'http_url_to_repo': 'http://localhost:8888/root/dafd2f908dfc476997ba62e7933eb47f.git',
    'web_url': 'http://localhost:8888/root/dafd2f908dfc476997ba62e7933eb47f', 'readme_url': None, 'avatar_url': None, 'forks_count': 0, 'star_count': 0,
    'last_activity_at': '2021-12-30T10:31:29.247Z',
    'namespace': {'id': 1, 'name': 'Administrator', 'path': 'root', 'kind': 'user', 'full_path': 'root', 'parent_id': None, 'avatar_url': None,
                  'web_url': 'http://localhost:8888/root'},
    '_links': {'self': 'http://localhost:8888/api/v4/projects/2', 'issues': 'http://localhost:8888/api/v4/projects/2/issues',
               'merge_requests': 'http://localhost:8888/api/v4/projects/2/merge_requests',
               'repo_branches': 'http://localhost:8888/api/v4/projects/2/repository/branches',
               'labels': 'http://localhost:8888/api/v4/projects/2/labels',
               'events': 'http://localhost:8888/api/v4/projects/2/events', 'members': 'http://localhost:8888/api/v4/projects/2/members'},
    'packages_enabled': True, 'empty_repo': True, 'archived': False, 'visibility': 'private',
    'owner': {'id': 1, 'username': 'root', 'name': 'Administrator', 'state': 'active', 'avatar_url': None, 'web_url': 'http://localhost:8888/root'},
    'resolve_outdated_diff_discussions': False,
    'container_expiration_policy': {'cadence': '1d', 'enabled': False, 'keep_n': 10, 'older_than': '90d', 'name_regex': '.*', 'name_regex_keep': None,
                                    'next_run_at': '2021-12-31T10:31:29.308Z'}, 'issues_enabled': True, 'merge_requests_enabled': True,
    'wiki_enabled': True, 'jobs_enabled': True, 'snippets_enabled': True, 'container_registry_enabled': True, 'service_desk_enabled': False,
    'service_desk_address': None, 'can_create_merge_request_in': True, 'issues_access_level': 'enabled', 'repository_access_level': 'enabled',
    'merge_requests_access_level': 'enabled', 'forking_access_level': 'enabled', 'wiki_access_level': 'enabled', 'builds_access_level': 'enabled',
    'snippets_access_level': 'enabled', 'pages_access_level': 'private', 'operations_access_level': 'enabled', 'analytics_access_level': 'enabled',
    'container_registry_access_level': 'enabled', 'emails_disabled': None, 'shared_runners_enabled': True, 'lfs_enabled': True, 'creator_id': 1,
    'import_status': 'none', 'open_issues_count': 0, 'ci_default_git_depth': 0, 'ci_forward_deployment_enabled': True,
    'ci_job_token_scope_enabled': False,
    'public_jobs': True, 'build_timeout': 3600, 'auto_cancel_pending_pipelines': 'disabled', 'build_coverage_regex': None, 'ci_config_path': None,
    'shared_with_groups': [], 'only_allow_merge_if_pipeline_succeeds': False, 'allow_merge_on_skipped_pipeline': None,
    'restrict_user_defined_variables': False, 'request_access_enabled': True, 'only_allow_merge_if_all_discussions_are_resolved': False,
    'remove_source_branch_after_merge': True, 'printing_merge_request_link_enabled': True, 'merge_method': 'merge', 'squash_option': 'default_off',
    'suggestion_commit_message': None, 'merge_commit_template': None, 'squash_commit_template': None, 'auto_devops_enabled': False,
    'auto_devops_deploy_strategy': 'continuous', 'autoclose_referenced_issues': True, 'repository_storage': 'default', 'keep_latest_artifact': True,
    'permissions': {'project_access': {'access_level': 40, 'notification_level': 3}, 'group_access': None}
}
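
The reproduction steps above can be sketched roughly as the following Python script. This is not the original repro case, only a minimal illustration using the standard library. The helper names (`file_commit_request`, `call`, `reproduce`), the branch name `test-branch`, the use of `uuid4` for the random project name, and sending parameters as a JSON body are all assumptions. The `token` argument is assumed to be a personal access token for root.

```python
import base64
import json
import time
import urllib.parse
import urllib.request
import uuid

API = "http://localhost:8888/api/v4"  # instance URL taken from this report


def file_commit_request(project_id, file_path, content, branch="master",
                        message="[ci skip] Initial commit."):
    """Build (method, url, payload) for creating one file via the repository files API."""
    # The file path is part of the URL, so slashes must be percent-encoded.
    encoded_path = urllib.parse.quote(file_path, safe="")
    url = f"{API}/projects/{project_id}/repository/files/{encoded_path}"
    payload = {
        "branch": branch,
        "encoding": "base64",
        "content": base64.b64encode(content.encode()).decode(),
        "commit_message": message,
    }
    return "POST", url, payload


def call(token, method, url, payload=None):
    """Send one request with a PRIVATE-TOKEN header and return the decoded JSON body."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("PRIVATE-TOKEN", token)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else None


def reproduce(token, wait=10):
    """Run steps 2-8 once and return the projects still listed afterwards."""
    project = call(token, "POST", f"{API}/projects", {"name": uuid.uuid4().hex})
    pid = project["id"]
    call(token, *file_commit_request(pid, "test/file/path1", "test_content1"))
    # "test-branch" is a hypothetical name; the report does not state the branch name.
    call(token, "POST", f"{API}/projects/{pid}/repository/branches",
         {"branch": "test-branch", "ref": "master"})
    time.sleep(wait)
    call(token, "DELETE", f"{API}/projects/{pid}")
    time.sleep(wait)
    return call(token, "GET", f"{API}/projects?per_page=100")
```

On an affected instance, `reproduce(token)` would be expected to sometimes return a non-empty list. Since the failure is intermittent, it may need to be called several times.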

In /var/log/gitlab/gitlab-rails/production.log, the sequence of requests looks as follows:

Started POST "/api/v4/projects" for 127.0.0.1 at 2021-12-30 15:59:39 +0700
Started PUT "/api/v4/projects/2" for 127.0.0.1 at 2021-12-30 15:59:41 +0700
Started POST "/api/v4/projects/2/repository/files/test%2Ffile%2Fpath1?branch=master&encoding=base64&commit_message=%5Bci+skip%5D+Initial+commit." for 127.0.0.1 at 2021-12-30 15:59:41 +0700
Started POST "/api/v4/internal/allowed" for 127.0.0.1 at 2021-12-30 15:59:42 +0700
Started POST "/api/v4/internal/pre_receive" for 127.0.0.1 at 2021-12-30 15:59:42 +0700
Started POST "/api/v4/internal/post_receive" for 127.0.0.1 at 2021-12-30 15:59:42 +0700
Started GET "/api/v4/projects/2/repository/commits?ref_name=master&per_page=100&page=1" for 127.0.0.1 at 2021-12-30 15:59:42 +0700
Started GET "/api/v4/projects/2/repository/commits?ref_name=master&per_page=100&page=1" for 127.0.0.1 at 2021-12-30 15:59:43 +0700
Started POST "/api/v4/projects/2/repository/branches" for 127.0.0.1 at 2021-12-30 15:59:43 +0700
Started POST "/api/v4/internal/allowed" for 127.0.0.1 at 2021-12-30 15:59:43 +0700
Started POST "/api/v4/internal/pre_receive" for 127.0.0.1 at 2021-12-30 15:59:43 +0700
Started POST "/api/v4/internal/post_receive" for 127.0.0.1 at 2021-12-30 15:59:43 +0700
Started GET "/api/v4/projects/2/repository/branches?per_page=100&page=1" for 127.0.0.1 at 2021-12-30 15:59:44 +0700
Started GET "/api/v4/projects/2/repository/branches?per_page=100&page=1" for 127.0.0.1 at 2021-12-30 15:59:44 +0700
Started GET "/api/v4/projects/2/repository/commits?all=1&per_page=100&page=1" for 127.0.0.1 at 2021-12-30 15:59:45 +0700
Started GET "/api/v4/projects/2/repository/branches?per_page=100&page=1" for 127.0.0.1 at 2021-12-30 15:59:45 +0700
Started GET "/api/v4/geo/proxy" for 127.0.0.1 at 2021-12-30 15:59:46 +0700
Started GET "/api/v4/user" for 127.0.0.1 at 2021-12-30 15:59:55 +0700
Started DELETE "/api/v4/projects/2" for 127.0.0.1 at 2021-12-30 15:59:55 +0700
Started GET "/api/v4/geo/proxy" for 127.0.0.1 at 2021-12-30 15:59:56 +0700
Started GET "/api/v4/projects?archived=0&simple=0&per_page=100&page=1" for 127.0.0.1 at 2021-12-30 16:00:05 +0700
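
To check the timing between the deletion request and the later project listing, the "Started" lines can be parsed with a small script. The sketch below assumes the line format is exactly as in the excerpt above. The function names are hypothetical, and any other lines in production.log are skipped.

```python
import re
from datetime import datetime

# Matches lines such as:
# Started DELETE "/api/v4/projects/2" for 127.0.0.1 at 2021-12-30 15:59:55 +0700
STARTED = re.compile(
    r'^Started (?P<method>[A-Z]+) "(?P<path>[^"]+)" for (?P<ip>\S+) '
    r'at (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})$'
)


def parse_started(line):
    """Return (method, path, timestamp) for a "Started" line, or None otherwise."""
    m = STARTED.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S %z")
    return m["method"], m["path"], ts


def delete_to_listing_delay(lines):
    """Seconds from the first DELETE request to the next project listing request."""
    deleted_at = None
    for line in lines:
        parsed = parse_started(line)
        if parsed is None:
            continue
        method, path, ts = parsed
        if method == "DELETE" and deleted_at is None:
            deleted_at = ts
        elif (method == "GET" and path.startswith("/api/v4/projects?")
              and deleted_at is not None):
            return (ts - deleted_at).total_seconds()
    return None


sample = [
    'Started DELETE "/api/v4/projects/2" for 127.0.0.1 at 2021-12-30 15:59:55 +0700',
    'Started GET "/api/v4/geo/proxy" for 127.0.0.1 at 2021-12-30 15:59:56 +0700',
    'Started GET "/api/v4/projects?archived=0&simple=0&per_page=100&page=1" '
    'for 127.0.0.1 at 2021-12-30 16:00:05 +0700',
]
print(delete_to_listing_delay(sample))  # 10.0
```

For the excerpt above this confirms that the project was still listed 10 seconds after the DELETE request was received.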

Example Project

There is a fully functioning repro case that automates everything described above:

What is the current bug behavior?

Some projects cannot be removed from a GitLab instance.

What is the expected correct behavior?

All projects can be removed (if permissions are sufficient).

Relevant logs and/or screenshots

Suspicious records in /var/log/gitlab/gitlab-rails/application.log:

2021-12-30T08:59:46.543Z: Unable to create keep-around reference for repository @hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35: 13:error when running update-ref command: state update to "commit" failed: EOF, stderr: "fatal: commit: cannot lock ref 'refs/keep-around/9420ba569a4b3e84eb41c21be29a0eac3d48e916': Unable to create '/var/opt/gitlab/git-data/repositories/@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35.git/refs/keep-around/9420ba569a4b3e84eb41c21be29a0eac3d48e916.lock': File exists.\n\nAnother git process seems to be running in this repository, e.g.\nan editor opened by 'git commit'. Please make sure all processes\nare terminated then try again. If it still fails, a git process\nmay have crashed in this repository earlier:\nremove the file manually to continue.\n".
2021-12-30T08:59:55.919Z: User 1 scheduled destruction of project root/f9eeb4f74c264f2396f6deee6f9941cb with job ID 57da76e65cddd75f874de970
2021-12-30T08:59:55.966Z: Attempting to destroy root/f9eeb4f74c264f2396f6deee6f9941cb (2)
2021-12-30T08:59:56.004Z: Repository "@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35" moved to "@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35+2+deleted" for repository "root/f9eeb4f74c264f2396f6deee6f9941cb"
2021-12-30T08:59:56.004Z: Repository "root/f9eeb4f74c264f2396f6deee6f9941cb" was removed
2021-12-30T08:59:56.047Z: Repository "@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35.wiki" moved to "@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35.wiki+2+deleted" for repository "root/f9eeb4f74c264f2396f6deee6f9941cb.wiki"
2021-12-30T08:59:56.047Z: Repository "root/f9eeb4f74c264f2396f6deee6f9941cb.wiki" was removed
2021-12-30T08:59:56.092Z: Deletion failed on root/f9eeb4f74c264f2396f6deee6f9941cb with the following message: no repository for such path
2021-12-30T09:00:06.980Z: User 1 scheduled destruction of project root/f9eeb4f74c264f2396f6deee6f9941cb with job ID 7aa420e810e7574b93e645d3
2021-12-30T09:00:07.040Z: Attempting to destroy root/f9eeb4f74c264f2396f6deee6f9941cb (2)
2021-12-30T09:00:07.142Z: Deletion failed on root/f9eeb4f74c264f2396f6deee6f9941cb with the following message: no repository for such path
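
On an instance with many projects, the failed deletions can be collected from application.log with a short filter. This is a sketch that assumes the "Deletion failed on ... with the following message: ..." wording shown above. The function name `failed_deletions` is hypothetical.

```python
import re

# Matches lines such as:
# 2021-12-30T08:59:56.092Z: Deletion failed on root/abc with the following message: ...
FAILED = re.compile(
    r'^(?P<ts>\S+): Deletion failed on (?P<project>\S+) '
    r'with the following message: (?P<message>.*)$'
)


def failed_deletions(lines):
    """Return {project_path: [(timestamp, message), ...]} for failed deletions."""
    failures = {}
    for line in lines:
        m = FAILED.match(line.strip())
        if m:
            failures.setdefault(m["project"], []).append((m["ts"], m["message"]))
    return failures


sample_log = [
    "2021-12-30T08:59:56.092Z: Deletion failed on root/f9eeb4f74c264f2396f6deee6f9941cb "
    "with the following message: no repository for such path",
    "2021-12-30T09:00:07.040Z: Attempting to destroy root/f9eeb4f74c264f2396f6deee6f9941cb (2)",
    "2021-12-30T09:00:07.142Z: Deletion failed on root/f9eeb4f74c264f2396f6deee6f9941cb "
    "with the following message: no repository for such path",
]
for project, events in failed_deletions(sample_log).items():
    print(project, len(events), events[-1][1])
```

A project that appears more than once in the result, as in this sample, has failed on a repeated deletion attempt as well, which matches the behavior described in this report.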

Output of checks

Results of GitLab environment info

System information
System:		
Current User:	git
Using RVM:	no
Ruby Version:	2.7.5p203
Gem Version:	3.1.4
Bundler Version:2.1.4
Rake Version:	13.0.6
Redis Version:	6.0.16
Git Version:	2.33.1.
Sidekiq Version:6.3.1
Go Version:	unknown
GitLab information
Version:	14.6.0
Revision:	3bc07a0be9c
Directory:	/opt/gitlab/embedded/service/gitlab-rails
DB Adapter:	PostgreSQL
DB Version:	12.7
URL:		http://localhost:8888
HTTP Clone URL:	http://localhost:8888/some-group/some-project.git
SSH Clone URL:	git@localhost:some-group/some-project.git
Using LDAP:	no
Using Omniauth:	yes
Omniauth Providers: 
GitLab Shell
Version:	13.22.1
Repository storage paths:
- default: 	/var/opt/gitlab/git-data/repositories
GitLab Shell path:		/opt/gitlab/embedded/service/gitlab-shell
Git:		/opt/gitlab/embedded/bin/git

Results of GitLab application Check

Checking GitLab subtasks ...
Checking GitLab Shell ...
GitLab Shell: ... GitLab Shell version >= 13.22.1 ? ... OK (13.22.1)
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Internal API available: OK
Redis available via internal API: OK
gitlab-shell self-check successful
Checking GitLab Shell ... Finished
Checking Gitaly ...
Gitaly: ... default ... OK
Checking Gitaly ... Finished
Checking Sidekiq ...
Sidekiq: ... Running? ... yes
Number of Sidekiq processes (cluster/worker) ... 1/1
Checking Sidekiq ... Finished
Checking Incoming Email ...
Incoming Email: ... Reply by email is disabled in config/gitlab.yml
Checking Incoming Email ... Finished
Checking LDAP ...
LDAP: ... LDAP is disabled in config/gitlab.yml
Checking LDAP ... Finished
Checking GitLab App ...
Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... skipped (no tmp uploads folder yet)
Systemd unit files or init script exist? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Systemd unit files or init script up-to-date? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Projects have namespace: ... can't check, you have no projects
Redis version >= 5.0.0? ... yes
Ruby version >= 2.7.2 ? ... yes (2.7.5)
Git version >= 2.33.0 ? ... yes (2.33.1)
Git user has default SSH configuration? ... yes
Active users: ... 1
Is authorized keys file accessible? ... yes
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... yes
Checking GitLab App ... Finished
Checking GitLab subtasks ... Finished

Possible fixes

  1. Disable Rugged via gitlab-rake gitlab:features:disable_rugged
  2. Patch the system with !77941 (merged)
Edited by Stan Hu