GitHub PR mentions in comments map by username when imported to GitLab
Problem
When GitHub pull requests are imported to GitLab, authorship maps correctly, by email, but user mentions seem to map by username. On GitLab.com this can end up being a completely different user than on the source, and the same might apply on a self-managed GitLab instance. This is confusing and might raise concerns for end users.
Solution
The solution would be to step through every pull request and issue, including their descriptions and notes, and update the user references so that they point to project-specific users.
We need a map between usernames and emails on GitHub. We can update CollaboratorImporter to write each direct collaborator to the cache with the cache structure project/#{project.id}/username/#{username} => #{email}. This also means that CollaboratorImporter should be executed before PullRequestsImporter, IssuesImporter, DiffNotesImporter and NotesImporter.
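A minimal sketch of the cache write described above, not the actual importer code. It assumes a plain Hash (CACHE) in place of the real import cache, and the helper names username_cache_key and cache_collaborator are hypothetical. Skipping collaborators without an email is also an assumption of this sketch.

```ruby
# Hypothetical stand-in for the import cache; a plain Hash keeps the sketch runnable.
CACHE = {}

# Builds the cache key using the structure proposed in this issue.
def username_cache_key(project_id, username)
  "project/#{project_id}/username/#{username}"
end

# Writes one collaborator's email under their source username.
# Collaborators without an email are skipped (an assumption of this sketch).
def cache_collaborator(project_id, username, email)
  return if email.nil? || email.empty?

  CACHE[username_cache_key(project_id, username)] = email
end

cache_collaborator(42, 'octocat', 'octocat@example.com')
cache_collaborator(42, 'ghost', nil)
```

Because the later importer stages only read from this cache, the write has to complete before any of them start, which is why the ordering of CollaboratorImporter matters.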
Once we get to the importer steps mentioned above, we can check for @ symbols and find the corresponding email for each mentioned username. We should then find a user record matching the email and, if found, replace the username with the found user's username. If not found, we don't want to keep the reference since it may tag the wrong user. In this case we should wrap it in backticks.
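The conversion step could look roughly like the following sketch. The two lookup Hashes are hypothetical stand-ins for the username cache and the user lookup by email, and MENTION_REGEX is a simplified pattern chosen for illustration rather than the pattern GitLab uses to detect mentions.

```ruby
# Hypothetical lookup tables standing in for the import cache and the user lookup.
EMAIL_BY_SOURCE_USERNAME = { 'octocat' => 'octocat@example.com' }.freeze
USERNAME_BY_EMAIL = { 'octocat@example.com' => 'octo.cat' }.freeze

# Simplified mention pattern (an assumption): "@" followed by letters, digits,
# or single hyphens. The lookbehind skips email addresses and mentions that
# are already wrapped in backticks.
MENTION_REGEX = /(?<![\w`])@([A-Za-z0-9](?:-?[A-Za-z0-9])*)/

# Replaces each mention with the matching destination username, or wraps it
# in backticks when no matching user is found.
def convert_mentions(text)
  text.gsub(MENTION_REGEX) do
    source_username = Regexp.last_match(1)
    email = EMAIL_BY_SOURCE_USERNAME[source_username]
    target_username = email && USERNAME_BY_EMAIL[email]

    target_username ? "@#{target_username}" : "`@#{source_username}`"
  end
end

convert_mentions('Thanks @octocat and @stranger, see a@b.com')
# => "Thanks @octo.cat and `@stranger`, see a@b.com"
```

Wrapping unmatched mentions in backticks keeps the original text visible while preventing it from notifying an unrelated user who happens to have the same username.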
Bitbucket Server implementation
- Cache users from Bitbucket Server (!139097 - merged)
- Use PageCounter to keep track of imported pages (!139775 - merged)
- Convert mentions on pull requests from Bitbucke... (!139221 - merged)
The MentionsConverter introduced in !139221 (merged) could be moved out of the BitbucketServerImport namespace so that it can be reused by other importers.
See #433008 (comment 1699072845) for more info.