Gitea project import fails or is incomplete due to `ghost` users in source project
Summary
The Gitea importer does not import items (issues, MRs, etc.) created by `ghost` users in the source project. Furthermore, if the item is a comment created by a `ghost` user, the entire import is marked as failed.
Steps to reproduce
- Import a Gitea project that contains issues, MRs, comments, etc. created by a `ghost` user.
- Observe that the items created by `ghost` users in the source project have not been imported to GitLab.
- Observe that the project import is marked as `Failed` when there are comments created by `ghost` users in the source project.
What is the current bug behavior?
- Items created by `ghost` users are not imported.
- When an error occurs importing comments, for example when the comment was created by a `ghost` user, the true error is not reported and the import is marked as `Failed`.
What is the expected correct behavior?
- All items should be imported from the source project.
- For failing comments, the full and correct error should be logged.
Relevant logs and/or screenshots
- The following exception is seen when attempting to import an `issue` created by a `ghost` user (similar for MRs etc.):
{"message":"The remote data could not be fully imported.","errors":
[{"type":"issue","url":"https://gitea.com/api/v1/repos/habile/rails-test/issues/2","errors":"GET https://gitea.com/api/v1/users/Ghost: 404 - user redirect does not exist [name: ghost]"}
However, if we are importing comments from a `ghost` user, we do not handle the error properly; instead, we fail the import completely.
Error reported for failing comments:
"exception.class": "NoMethodError",
"exception.message": "undefined method 'gsub' for nil:NilClass",
Full stack trace:
"severity": "ERROR",
"time": "2023-01-24T09:03:51.677Z",
"correlation_id": "6da9a450-7ed1-4225-8805-e11f0e0678a6",
"exception.class": "NoMethodError",
"exception.message": "undefined method `gsub' for nil:NilClass",
"exception.backtrace": [
"lib/gitlab/url_sanitizer.rb:11:in `sanitize'",
"lib/gitlab/legacy_github_import/importer.rb:252:in `rescue in block (2 levels) in create_comments'",
"lib/gitlab/legacy_github_import/importer.rb:233:in `block (2 levels) in create_comments'",
"lib/gitlab/legacy_github_import/importer.rb:232:in `each'",
"lib/gitlab/legacy_github_import/importer.rb:232:in `block in create_comments'",
"lib/gitlab/legacy_github_import/importer.rb:231:in `create_comments'",
"lib/gitlab/legacy_github_import/importer.rb:224:in `block in import_comments'",
"lib/gitlab/legacy_github_import/importer.rb:313:in `block in fetch_resources'",
"lib/gitlab/legacy_github_import/client.rb:154:in `request'",
"lib/gitlab/legacy_github_import/client.rb:65:in `method_missing'",
"lib/gitlab/legacy_github_import/importer.rb:312:in `public_send'",
"lib/gitlab/legacy_github_import/importer.rb:312:in `fetch_resources'",
"lib/gitlab/legacy_github_import/importer.rb:218:in `import_comments'",
"lib/gitlab/legacy_github_import/importer.rb:55:in `execute'",
"app/services/projects/import_service.rb:134:in `import_data'",
"app/services/projects/import_service.rb:25:in `execute'",
"app/workers/repository_import_worker.rb:27:in `perform'",
"ee/app/workers/ee/repository_import_worker.rb:9:in `perform'",
"lib/gitlab/sidekiq_middleware/duplicate_jobs/strategies/until_executing.rb:16:in `perform'",
"lib/gitlab/sidekiq_middleware/duplicate_jobs/duplicate_job.rb:44:in `perform'",
"lib/gitlab/sidekiq_middleware/duplicate_jobs/server.rb:8:in `call'",
"lib/gitlab/sidekiq_middleware/worker_context.rb:9:in `wrap_in_optional_context'",
"lib/gitlab/sidekiq_middleware/worker_context/server.rb:19:in `block in call'",
"lib/gitlab/application_context.rb:115:in `block in use'",
"lib/gitlab/application_context.rb:115:in `use'",
"lib/gitlab/application_context.rb:55:in `with_context'",
"lib/gitlab/sidekiq_middleware/worker_context/server.rb:17:in `call'",
"lib/gitlab/sidekiq_status/server_middleware.rb:7:in `call'",
"lib/gitlab/sidekiq_versioning/middleware.rb:9:in `call'",
"lib/gitlab/sidekiq_middleware/query_analyzer.rb:7:in `block in call'",
"lib/gitlab/database/query_analyzer.rb:37:in `within'",
"lib/gitlab/sidekiq_middleware/query_analyzer.rb:7:in `call'",
"lib/gitlab/sidekiq_middleware/admin_mode/server.rb:14:in `call'",
"lib/gitlab/sidekiq_middleware/instrumentation_logger.rb:9:in `call'",
"lib/gitlab/sidekiq_middleware/batch_loader.rb:7:in `call'",
"lib/gitlab/sidekiq_middleware/extra_done_log_metadata.rb:7:in `call'",
"lib/gitlab/sidekiq_middleware/request_store_middleware.rb:10:in `block in call'",
"lib/gitlab/with_request_store.rb:17:in `enabling_request_store'",
"lib/gitlab/with_request_store.rb:10:in `with_request_store'",
"lib/gitlab/sidekiq_middleware/request_store_middleware.rb:9:in `call'",
"lib/gitlab/sidekiq_middleware/server_metrics.rb:76:in `block in call'",
"lib/gitlab/sidekiq_middleware/server_metrics.rb:103:in `block in instrument'",
"lib/gitlab/metrics/background_transaction.rb:33:in `run'",
"lib/gitlab/sidekiq_middleware/server_metrics.rb:103:in `instrument'",
"lib/gitlab/sidekiq_middleware/server_metrics.rb:75:in `call'",
"lib/gitlab/sidekiq_middleware/monitor.rb:10:in `block in call'",
"lib/gitlab/sidekiq_daemon/monitor.rb:46:in `within_job'",
"lib/gitlab/sidekiq_middleware/monitor.rb:9:in `call'",
"lib/gitlab/sidekiq_middleware/size_limiter/server.rb:13:in `call'",
"lib/gitlab/sidekiq_logging/structured_logger.rb:21:in `call'"
],
Possible fixes
Changing `raw[:url]` to `comments.url` in importer.rb#L250 does address the premature import failure due to `ghost` comments, although the value will then be empty (`''`). Perhaps `:html_url` should be used instead?
It is not clear how to address the failure to import `ghost` user items, however.
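The backtrace above shows the `NoMethodError` being raised from inside the `rescue` branch, which is why the original error never gets reported. A minimal sketch of that failure mode and one possible guard follows. Everything in it is hypothetical and only illustrative: `sanitize` is a stand-in for a sanitizer that assumes a String, `record_comment_error` and the payload keys are invented for the example, and none of it is the actual importer code.

```ruby
# Hypothetical stand-in for a URL sanitizer that assumes a String argument.
# Calling it with nil raises NoMethodError, mirroring the reported failure.
def sanitize(url)
  url.gsub(%r{//[^@/]+@}, '//') # strip embedded credentials
end

# Hypothetical error recorder: it must never raise itself, otherwise the
# original exception is masked and the whole import fails.
# `raw` is an assumed Hash holding the comment payload.
def record_comment_error(errors, raw, exception)
  url = raw[:url] || raw[:html_url] # fall back when :url is absent
  errors << {
    type: :comment,
    url: url ? sanitize(url) : nil, # only sanitize when a URL is present
    errors: exception.message
  }
end

errors = []
raw = { html_url: 'https://user:secret@gitea.example.com/org/repo/issues/2' }

begin
  raise StandardError, 'author lookup failed (404)'
rescue StandardError => e
  record_comment_error(errors, raw, e)
end

puts errors.inspect
```

With a guard of this kind, the comment is skipped and the underlying lookup error is what ends up in the import errors, rather than a secondary `NoMethodError`.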
Proposal
In other importers, when we aren't able to match the source user's email address with a confirmed email address in GitLab, we assign the user conducting the import as the author. Let's do the same here.
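The fallback could be sketched roughly as follows. This is an illustration under stated assumptions, not the importer's real code: `SourceUserNotFound`, `KNOWN_USERS`, `find_local_user_id`, and `author_id_for` are all hypothetical names, and the lookup table stands in for whatever user matching the importer actually performs.

```ruby
# Hypothetical error standing in for a failed source-user lookup (e.g. a 404).
class SourceUserNotFound < StandardError; end

# Hypothetical mapping from source usernames to local user ids.
KNOWN_USERS = { 'alice' => 42 }.freeze

# Looks up the local user id, raising when the source user cannot be matched.
def find_local_user_id(username)
  KNOWN_USERS.fetch(username) { raise SourceUserNotFound, "no user #{username}" }
end

# Returns the matched author id, or the importing user's id when the
# source user (for example a ghost user) cannot be resolved.
def author_id_for(username, importer_user_id)
  find_local_user_id(username)
rescue SourceUserNotFound
  importer_user_id
end

puts author_id_for('alice', 1) # => 42
puts author_id_for('Ghost', 1) # => 1
```

Resolving the author this way would let items from `ghost` users be imported and attributed to the importing user instead of being dropped.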