User mapping - Add CSV support

For improved user mapping, group owners should also be allowed to re-assign contributions from CSV files uploaded via an API endpoint.

Because the group owner needs to know the format of the CSV and which information to add to the file, we will allow them to download a prepopulated CSV.

The prepopulated CSV will contain information about the source users saved in the import_source_tables which were created during the migration process.

The exported CSV will look like:

| Source Host | Import type | Source user identifier | Source user name | Source username | GitLab username | GitLab public email |
|---|---|---|---|---|---|---|
| github.com | github | 4444 | Rodrigo Tomonari | rodrigo.tomonari | | |
| github.com | github | 5555 | John Doe | john.doe | | |
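As a rough illustration of the export described above, the following minimal sketch builds the prepopulated CSV with Ruby's standard `csv` library. The `SourceUser` struct, its attribute names, and the `prepopulated_csv` method are hypothetical stand-ins; the issue does not specify the real model or service, so treat this as a sketch under those assumptions rather than the actual implementation.

```ruby
require 'csv'

# Column headers as shown in the exported CSV.
HEADERS = [
  'Source Host', 'Import type', 'Source user identifier',
  'Source user name', 'Source username',
  'GitLab username', 'GitLab public email'
].freeze

# Hypothetical stand-in for a source user record (attribute names are assumed).
SourceUser = Struct.new(:source_hostname, :import_type, :source_user_identifier,
                        :source_name, :source_username, keyword_init: true)

# Builds the CSV text, leaving the two GitLab columns empty for the group owner.
def prepopulated_csv(source_users)
  CSV.generate do |csv|
    csv << HEADERS
    source_users.each do |user|
      csv << [user.source_hostname, user.import_type, user.source_user_identifier,
              user.source_name, user.source_username, nil, nil]
    end
  end
end

source_users = [
  SourceUser.new(source_hostname: 'github.com', import_type: 'github',
                 source_user_identifier: '4444', source_name: 'Rodrigo Tomonari',
                 source_username: 'rodrigo.tomonari'),
  SourceUser.new(source_hostname: 'github.com', import_type: 'github',
                 source_user_identifier: '5555', source_name: 'John Doe',
                 source_username: 'john.doe')
]

csv_text = prepopulated_csv(source_users)
puts csv_text
```

The two trailing `nil` values become empty fields, which is where the group owner fills in the GitLab username or GitLab public email.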

The columns Source Host, Import type, and Source user identifier should not be updated by the group owner, because this information is used to find the corresponding record in the database after the group owner uploads the file.

The columns Source user name and Source username won't be used after the upload; they are added to the prepopulated CSV to help the group owner identify the source user.

For the reassignment, the group owner should populate one of the columns, GitLab username or GitLab public email. This information will be used to find the GitLab user in the database.
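The column roles described above could be read from an uploaded file roughly as follows. This is a minimal sketch using Ruby's standard `csv` library; the `parse_reassignments` method and the shape of the returned hashes are assumptions made for illustration, not the proposed backend code.

```ruby
require 'csv'

# Columns that identify the source user record and must not be edited.
KEY_COLUMNS = ['Source Host', 'Import type', 'Source user identifier'].freeze

# Returns one hash per row with the lookup key and the requested assignee.
# Rows where neither GitLab column is populated are skipped.
# The Source user name and Source username columns are deliberately ignored.
def parse_reassignments(csv_text)
  CSV.parse(csv_text, headers: true).filter_map do |row|
    username = row['GitLab username'].to_s.strip
    email = row['GitLab public email'].to_s.strip
    next if username.empty? && email.empty?

    {
      key: KEY_COLUMNS.map { |column| row[column].to_s.strip },
      username: username.empty? ? nil : username,
      email: email.empty? ? nil : email
    }
  end
end

uploaded = <<~TEXT
  Source Host,Import type,Source user identifier,Source user name,Source username,GitLab username,GitLab public email
  github.com,github,4444,Rodrigo Tomonari,rodrigo.tomonari,,rtomonari@gitlab.com
  github.com,github,5555,John Doe,john.doe,john.doe_gitlab,
  github.com,github,6666,Jane Roe,jane.roe,,
TEXT

reassignments = parse_reassignments(uploaded)
reassignments.each { |reassignment| p reassignment }
```

How a row with both columns populated should be handled is not stated in this issue, so the sketch simply passes both values through.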

Handling of emails: Match public email only, unless current user is admin

We should only reassign to the user matching the provided email if that email is a public email on GitLab #455901 (comment 2016707106).

However, if the current user is in admin mode (use current_user#can_admin_all_resources? to check this), we will allow reassigning by any confirmed email (use User.find_by_any_email(email, confirmed: true)) #455901 (comment 2193933165).
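The branching between the two lookups can be sketched as below. To keep the example runnable outside Rails, `StubUser` and `StubUserFinder` are in-memory stand-ins invented for this sketch, and `find_by_public_email` and `find_assignee_by_email` are hypothetical names. Only `can_admin_all_resources?` and `find_by_any_email(email, confirmed: true)` come from the text above, and the stub only imitates their call shape.

```ruby
# In-memory stand-in for a user; emails maps address => confirmed?
StubUser = Struct.new(:username, :public_email, :emails, :admin_mode, keyword_init: true) do
  def can_admin_all_resources?
    admin_mode == true
  end
end

# In-memory stand-in for the User lookups (not GitLab's model).
class StubUserFinder
  def initialize(users)
    @users = users
  end

  # Hypothetical helper: matches only the public email.
  def find_by_public_email(email)
    @users.find { |user| user.public_email == email }
  end

  # Imitates the call shape of User.find_by_any_email(email, confirmed: true).
  def find_by_any_email(email, confirmed: false)
    @users.find do |user|
      user.emails.any? { |address, is_confirmed| address == email && (is_confirmed || !confirmed) }
    end
  end
end

# Admins may match any confirmed email; everyone else only the public email.
def find_assignee_by_email(finder, current_user, email)
  if current_user.can_admin_all_resources?
    finder.find_by_any_email(email, confirmed: true)
  else
    finder.find_by_public_email(email)
  end
end

jane = StubUser.new(username: 'jane', public_email: nil, admin_mode: false,
                    emails: { 'jane@private.example' => true, 'jane@pending.example' => false })
owner = StubUser.new(username: 'owner', public_email: 'owner@public.example', admin_mode: false,
                     emails: { 'owner@public.example' => true })
admin = StubUser.new(username: 'admin', public_email: nil, admin_mode: true, emails: {})

finder = StubUserFinder.new([jane, owner, admin])

by_owner = find_assignee_by_email(finder, owner, 'jane@private.example')
by_admin = find_assignee_by_email(finder, admin, 'jane@private.example')
by_admin_unconfirmed = find_assignee_by_email(finder, admin, 'jane@pending.example')
by_owner_public = find_assignee_by_email(finder, owner, 'owner@public.example')
```

In this sketch a non-admin cannot resolve a private email to a user at all, which is the property the Application Security review would be checking in the real implementation.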

Because of the security implications, the MR that allows admins to match by any confirmed email must have an Application Security review.

Example populated CSV

Below is an example of the populated CSV the group owner should upload:

| Source Host | Import type | Source user identifier | Source user name | Source username | GitLab username | GitLab public email |
|---|---|---|---|---|---|---|
| github.com | github | 4444 | Rodrigo Tomonari | rodrigo.tomonari | | rtomonari@gitlab.com |
| github.com | github | 5555 | John Doe | john.doe | john.doe_gitlab | |

New API endpoints to add

  • GET bulk_reassignment_file: an endpoint to get a prepopulated CSV file for reassigning import source users in bulk.
  • POST upload_bulk_reassign: an endpoint to upload a populated CSV file to actually reassign import source users in bulk.

Backend proposal

See #455901 (comment 1886488600) for thoughts on the implementation. Two issues are follow-ups, so these things will be addressed in later iterations:

Special merge request reviews

Have group::scalability review the MR, particularly for file-size considerations: https://gitlab.com/gitlab-org/gitlab/-/issues/478305#note_2147212991

Addition

Consider membership inheritance, see comment.

Test cases

  • Only group owners can request that users accept reassignment of contributions via a CSV file.
  • If the CSV file uploaded by the group owner is misformatted, a descriptive error should be returned in the UI.
  • If the group owner submits a well-formatted, correct CSV file, they should see a banner informing them that the file is being processed and that they will receive an email when processing finishes.
  • Contributions of only one placeholder user can be reassigned to an active human user on the destination:
  • If a group owner enters the same username more than once in the GitLab username column, or the same email address more than once in the GitLab public email column, the CSV upload returns an error with a message that this is not accepted.
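The duplicate check in the last test case could be sketched as follows, using Ruby's standard `csv` library. The method names and the error wording are hypothetical, and comparing values case-insensitively is an assumption not stated in this issue.

```ruby
require 'csv'

# Returns the non-blank values that appear more than once in the given column.
# Assumption: values are compared case-insensitively after trimming whitespace.
def duplicate_values(rows, column)
  values = rows.map { |row| row[column].to_s.strip.downcase }.reject(&:empty?)
  values.tally.select { |_value, count| count > 1 }.keys
end

# Collects one error message per duplicated username or email.
def duplicate_assignee_errors(csv_text)
  rows = CSV.parse(csv_text, headers: true)
  ['GitLab username', 'GitLab public email'].flat_map do |column|
    duplicate_values(rows, column).map do |value|
      "#{column} '#{value}' appears more than once"
    end
  end
end

uploaded_with_duplicates = <<~TEXT
  Source Host,Import type,Source user identifier,Source user name,Source username,GitLab username,GitLab public email
  github.com,github,4444,Rodrigo Tomonari,rodrigo.tomonari,john.doe_gitlab,
  github.com,github,5555,John Doe,john.doe,john.doe_gitlab,
TEXT

errors = duplicate_assignee_errors(uploaded_with_duplicates)
puts errors
```

Blank cells are excluded before counting, so rows the group owner left unassigned do not trigger the error.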
Edited by James Nutt