Integrate improved user mapping with Direct Transfer
What does this MR do and why?
Related to: #443557 (closed)
Update Direct Transfer to map contributions to placeholder users for every existing member in the source instance. Contributions created by non-members are assigned to the Import User.
Note
Follow-up MRs will be created to implement some missing features:
- Handle mentions in notes and description
- Check if all references were saved before marking the migration as finished
- Retry the pipeline in case of a Gitlab::Import::SourceUserMapper::FailedToObtainLockError error
- Populate source user names for non-members
- Cache source user details in Redis to reduce database queries
How it works
In summary, the MR changes the MembersPipeline and the NdJsonPipelines. The MembersPipeline creates an Import::SourceUser and a placeholder user for every member in the source instance. The NdJsonPipelines create an Import::SourceUser for every user ID in the relation hash that the MembersPipeline did not cover, that is, for every user ID that does not belong to a member in the source instance. The NdJsonPipelines then replace the user IDs in the relation hash with the ID of the user mapped to the corresponding Import::SourceUser.
See a more detailed explanation below:
MembersPipeline
When the MembersPipeline is executed, an Import::SourceUser and a placeholder user are created for each member in the source instance. Each Import::SourceUser record is created with its source_user_identifier set to the user's ID in the source instance. The source_name, source_username, and source_hostname fields are populated with the corresponding information from the source instance.
Placeholder users are added as members of the group/project, and their access level is set to match that of the source instance. If an Import::SourceUser record already exists for the source_user_identifier, a placeholder user is not created. Instead, the mapped user (which can be a placeholder or a real user) is added as a member. If the user is already a member, a new member is not added and the access level of the existing member is preserved. A member is also not added if the mapped user is of the type ImportUser.
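The membership rules above can be sketched in plain Ruby. This is only an illustration under stated assumptions: SourceUser, User, and MemberMappingSketch are hypothetical stand-ins written for this example, not the classes changed by this MR, and access levels are plain integers.

```ruby
# Illustrative stand-ins only; these are not the GitLab models.
SourceUser = Struct.new(:source_user_identifier, :mapped_user, keyword_init: true)
User       = Struct.new(:id, :type, keyword_init: true)

class MemberMappingSketch
  attr_reader :source_users, :members

  def initialize
    @source_users = {} # source_user_identifier => SourceUser
    @members = {}      # destination user id => access level
    @next_id = 0
  end

  # Applies the rules described above to one member of the source instance.
  def import_member(source_user_id, access_level)
    # Reuse an existing source user; otherwise create one with a new placeholder.
    source_user = @source_users[source_user_id] ||= SourceUser.new(
      source_user_identifier: source_user_id,
      mapped_user: User.new(id: next_id, type: :placeholder)
    )
    user = source_user.mapped_user

    return if user.type == :import_user # the Import User is never added as a member
    return if @members.key?(user.id)    # existing membership and access level are kept

    @members[user.id] = access_level
  end

  private

  def next_id
    @next_id += 1
  end
end

sketch = MemberMappingSketch.new
sketch.import_member(101, 30)
sketch.import_member(101, 50) # already a member, so the access level stays 30
```

In this sketch a second import of the same source user neither creates another placeholder nor changes the stored access level.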
NdJsonPipelines
When the NdJsonPipeline#transform method is executed, it searches for an Import::SourceUser with a matching source_user_identifier for every user ID in the relation hash. If none is found, a new Import::SourceUser record is created with its source_user_identifier set to that user ID. All Import::SourceUser records created this way are mapped to the same ImportUser, which means all contributions from non-members are assigned to the ImportUser. Currently, the source_name and source_username of these records are set to nil. A subsequent MR will introduce a new pipeline to fetch the information from the source instance and populate the missing details.
For instance, consider a hash-like relation:
{ iid: 1, author_id: 101, merged_by: 102, title: 'MR', notes: [{ note: 'Note 1', author_id: 102 }, { note: 'Note 2', author_id: 103 }] }
If none of the user IDs is found, NdJsonPipeline#transform creates one Import::SourceUser for each of the source_user_identifiers 101, 102, and 103, and each of these source users is mapped to the ImportUser.
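The lookup-or-create step can be sketched as follows. The list of keys that hold user IDs (USER_REFERENCE_KEYS) and the helper collect_source_user_ids are assumptions made for this example, and a plain Hash stands in for the Import::SourceUser table.

```ruby
require 'set'

# Hypothetical list of keys that hold user IDs, for illustration only.
USER_REFERENCE_KEYS = %w[author_id merged_by].freeze

# Recursively collects every user ID stored under a user-reference key.
def collect_source_user_ids(node, found = Set.new)
  case node
  when Hash
    node.each do |key, value|
      if USER_REFERENCE_KEYS.include?(key.to_s) && !value.nil?
        found << value
      else
        collect_source_user_ids(value, found)
      end
    end
  when Array
    node.each { |item| collect_source_user_ids(item, found) }
  end
  found
end

relation = {
  iid: 1, author_id: 101, merged_by: 102, title: 'MR',
  notes: [{ note: 'Note 1', author_id: 102 }, { note: 'Note 2', author_id: 103 }]
}

# Existing records keyed by source_user_identifier (empty here: no members matched).
source_users = {}
collect_source_user_ids(relation).each do |id|
  # Non-members all map to the single Import User; names are not known yet.
  source_users[id] ||= {
    source_user_identifier: id, mapped_user: :import_user,
    source_name: nil, source_username: nil
  }
end
```

Using a Set means user 102, which appears twice in the relation, yields a single record.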
After all missing Import::SourceUser records are created, RelationFactory is executed with the new SourceUsersMapper. This mapper returns the user in the destination instance that is mapped to a given source_user_identifier.
For example, SourceUsersMapper#map[101] returns the user ID of the user mapped to the Import::SourceUser with the source_user_identifier 101. If the Import::SourceUser was reassigned to a real user, SourceUsersMapper#map[101] returns the user ID of the real user. Otherwise, it returns the user ID of a placeholder user.
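A minimal sketch of that lookup behaviour, assuming a lazily filled Hash behind map. SourceUserRow, its placeholder_user_id and reassigned_user_id fields, and SourceUsersMapperSketch are hypothetical names for this example; the real SourceUsersMapper may be structured differently.

```ruby
# Stand-in for an Import::SourceUser row; fields other than
# source_user_identifier are assumptions made for this sketch.
SourceUserRow = Struct.new(
  :source_user_identifier, :placeholder_user_id, :reassigned_user_id,
  keyword_init: true
)

class SourceUsersMapperSketch
  def initialize(rows)
    @rows = rows.to_h { |row| [row.source_user_identifier, row] }
  end

  # Returns a lazily filled Hash so that `map[101]` reads like the call above.
  def map
    @map ||= Hash.new do |cache, source_id|
      row = @rows.fetch(source_id)
      # Prefer the real user when the source user was reassigned.
      cache[source_id] = row.reassigned_user_id || row.placeholder_user_id
    end
  end
end

mapper = SourceUsersMapperSketch.new([
  SourceUserRow.new(source_user_identifier: 101, placeholder_user_id: 1),
  SourceUserRow.new(source_user_identifier: 102, placeholder_user_id: 2, reassigned_user_id: 42)
])
mapper.map[101] # ID of the placeholder user
mapper.map[102] # ID of the real user, because the source user was reassigned
```

Memoising results in the Hash keeps repeated lookups for the same source user cheap within one relation.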
Therefore, RelationFactory builds the relation object with all user IDs set to the appropriate user in the destination. In addition, for every built object, the source_user_references attribute holds a map of the source user IDs, which is later used to push the placeholder user references.
For instance, consider a hash-like relation:
{ iid: 1, author_id: 101, merged_by: 102, title: 'MR', notes: [{ note: 'Note 1', author_id: 102 }, { note: 'Note 2', author_id: 103 }] }
RelationFactory will return an object like:
MergeRequest.new(
  title: 'MR',
  author_id: 1,
  merged_by: 2,
  source_user_references: { 'author_id' => 101, 'merged_by' => 102 },
  notes: [
    Note.new(
      note: 'Note 1',
      author_id: 2,
      source_user_references: { 'author_id' => 102 }
    ),
    Note.new(
      note: 'Note 2',
      author_id: 3,
      source_user_references: { 'author_id' => 103 }
    )
  ]
)
The object is then persisted by ObjectBuilder in the NdJsonPipeline#load method, as before.
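The remapping with source_user_references can be sketched on plain hashes instead of model objects. MAPPED_KEYS and remap_users are hypothetical names for this example, and the user_map literal plays the role of SourceUsersMapper#map.

```ruby
# Hypothetical list of user-reference keys, for illustration only.
MAPPED_KEYS = %w[author_id merged_by].freeze

# Returns a copy of the relation hash with user IDs remapped and the
# original source IDs recorded under :source_user_references.
def remap_users(relation, user_map)
  references = {}
  remapped = relation.to_h do |key, value|
    if MAPPED_KEYS.include?(key.to_s)
      references[key.to_s] = value
      [key, user_map.fetch(value)]
    elsif value.is_a?(Array)
      # Nested relations (such as notes) get their own references map.
      [key, value.map { |child| child.is_a?(Hash) ? remap_users(child, user_map) : child }]
    else
      [key, value]
    end
  end
  remapped.merge(source_user_references: references)
end

user_map = { 101 => 1, 102 => 2, 103 => 3 }
relation = {
  iid: 1, author_id: 101, merged_by: 102, title: 'MR',
  notes: [{ note: 'Note 1', author_id: 102 }, { note: 'Note 2', author_id: 103 }]
}
result = remap_users(relation, user_map)
```

Each level keeps only its own source IDs, which mirrors how the merge request and each note above carry separate source_user_references.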
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
How to set up and validate locally
- Enable the feature flags importer_user_mapping and bulk_import_importer_user_mapping:
  Feature.enable(:importer_user_mapping)
  Feature.enable(:bulk_import_importer_user_mapping)
- Go to New group and select Import group
- Provide a host user and access token
- Select the group to be imported
- Check if placeholder users were created and contributions assigned to them