Importing GitLab Project Export is very slow
Summary
While working on gitlab-org/quality/performance#40 (closed) we've been exploring ways to import a large dataset as efficiently as possible. Currently we import the gitlabhq project from GitHub, but this can take an hour or more depending on the environment. We wanted to try importing the same project as a GitLab export tarball instead, on the assumption that this would be quicker.
Surprisingly, this has turned out to be significantly slower, taking around 4 hours. We're still investigating why this is the case, but are raising this issue to track it.
Steps to reproduce
- In a clean GitLab install (e.g. Docker), import the gitlabhq project export tarball (can be downloaded here).
- Notice that the import takes a long time (multiple hours).
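To put a number on "a long time", the import can be timed by polling its status until it completes. The following is only a minimal sketch, not what we actually ran: `time_import` and its `get_status` callable are hypothetical names, and `get_status` is assumed to return the project's `import_status` string (e.g. `scheduled`, `started`, `finished`, `failed`), however that is fetched in your environment.

```python
import time
from typing import Callable


def time_import(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 6 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll an import until it finishes and return the elapsed seconds.

    get_status is a hypothetical callable supplied by the caller; it is
    assumed to return the current import status as a string.
    """
    start = time.monotonic()
    while True:
        status = get_status()
        elapsed = time.monotonic() - start
        if status == "finished":
            return elapsed
        if status == "failed":
            raise RuntimeError(f"import failed after {elapsed:.0f}s")
        if elapsed >= timeout_seconds:
            raise TimeoutError(f"import still '{status}' after {elapsed:.0f}s")
        # Any other status (e.g. scheduled, started) means "keep waiting".
        sleep(poll_seconds)
```

Injecting `sleep` keeps the helper easy to exercise without real waiting; in practice the default `time.sleep` and a polling interval of tens of seconds should be enough for an import that runs for hours.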
What is the current bug behavior?
The import does progress, but it takes a very long time: around 4 hours.
What is the expected correct behavior?
The import should finish as quickly as possible, preferably in minutes rather than hours.
Relevant logs and/or screenshots
- Grafana dashboard snapshot showing relevant statistics of the import (the import started at 10:38 and had been running for around 4 hours at this point) - https://snapshot.raintank.io/dashboard/snapshot/OSu5MX1z2vnQt9eSULR0j5qVLrf39yzl
- Zipped logs - gitlab-import-logs.7z
Results of GitLab environment info
System information
System:
Current User: git
Using RVM: no
Ruby Version: 2.6.3p62
Gem Version: 2.7.9
Bundler Version: 1.17.3
Rake Version: 12.3.2
Redis Version: 3.2.12
Git Version: 2.21.0
Sidekiq Version: 5.2.7
Go Version: unknown

GitLab information
Version: 12.0.3
Revision: 08a51a9db93
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 10.7
URL: http://26cb4a90f1e9
HTTP Clone URL: http://26cb4a90f1e9/some-group/some-project.git
SSH Clone URL: git@26cb4a90f1e9:some-group/some-project.git
Using LDAP: no
Using Omniauth: yes
Omniauth Providers:

GitLab Shell
Version: 9.3.0
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
Git: /opt/gitlab/embedded/bin/git
Results of GitLab application Check
Checking GitLab subtasks ...
Checking GitLab Shell ...
GitLab Shell: ... GitLab Shell version >= 9.3.0 ? ... OK (9.3.0)
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Check GitLab API access: OK
Redis available via internal API: OK
Access to /var/opt/gitlab/.ssh/authorized_keys: OK
gitlab-shell self-check successful
Checking GitLab Shell ... Finished
Checking Gitaly ...
Gitaly: ... default ... OK
Checking Gitaly ... Finished
Checking Sidekiq ...
Sidekiq: ... Running? ... yes
Number of Sidekiq processes ... 1
Checking Sidekiq ... Finished
Checking Incoming Email ...
Incoming Email: ... Reply by email is disabled in config/gitlab.yml
Checking Incoming Email ... Finished
Checking LDAP ...
LDAP: ... LDAP is disabled in config/gitlab.yml
Checking LDAP ... Finished
Checking GitLab App ...
Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... skipped (no tmp uploads folder yet)
Init script exists? ... skipped (omnibus-gitlab has no init script)
Init script up-to-date? ... skipped (omnibus-gitlab has no init script)
Projects have namespace: ...
2/1 ... yes
3/2 ... yes
Redis version >= 2.8.0? ... yes
Ruby version >= 2.5.3 ? ... yes (2.6.3)
Git version >= 2.21.0 ? ... yes (2.21.0)
Git user has default SSH configuration? ... yes
Active users: ... 1
Checking GitLab App ... Finished
Checking GitLab subtasks ... Finished