Integrate improved user mapping with Direct Transfer
What does this MR do and why?
Related to: #443557 (closed)
Update Direct Transfer to map contributions to placeholder users for every existing member in the source instance. Contributions created by non-members are assigned to the Import User.
Note
Follow-up MRs will be created to implement some missing features:
- Handle mentions in notes and description
- Check if all references were saved before marking the migration as finished
- Retry the pipeline in case of a Gitlab::Import::SourceUserMapper::FailedToObtainLockError error
- Populate source user names for non-members
- Cache source user details in Redis to reduce database queries
How it works
In summary, the MR changes the MembersPipeline and the NdJsonPipelines. The MembersPipeline pipeline creates an Import::SourceUser and a placeholder user for every member in the source instance. The NdJsonPipelines pipeline creates an Import::SourceUser for every user ID contained in the relation hash that was not created in the MembersPipeline, that is, for all user IDs that aren't members in the source instance. Then, the NdJsonPipelines updates the user IDs found in the relation hash with the corresponding user ID mapped to the Import::SourceUser.
See a more detailed explanation below:
MembersPipeline
When the MembersPipeline is executed, an Import::SourceUser and a placeholder user are created for each member in the source instance. The Import::SourceUser records are created with the source_user_identifier set to the user ID of the source instance. Furthermore, the source_name, source_username, and source_hostname fields are populated with the corresponding information from the source instance.
Placeholder users are added as members of the group/project, and their access level is set to match that of the source instance. If an existing Import::SourceUser record already exists for the source_user_identifier, a placeholder user is not created, and instead, the mapped user (which can be a placeholder or real user) is added as a member. It's important to note that if the user is already a member, a new member is not added, and the access level of the existing member is preserved. Also, a member is not added if the placeholder user is of the type ImportUser.
NdJsonPipelines
When the NdJsonPipeline#transform
method is executed, it searches for an Import::SourceUser with a matching source_user_identifier for every user ID in the relation hash. If none is found, a new Import::SourceUser record is created and the source_user_identifier is set to the same user ID. All created Import::SourceUser are mapped to the same ImportUser, which means all contributions for non-members will be assigned to the ImportUser. Note that currently, the source_name and source_username for the created Import::SourceUser records are set to nil. A subsequent MR will introduce a new pipeline to fetch the information from the source instance and populate the missing details.
For instance, consider a hash-like relation:
{ iid: 1, author_id: 101, merged_by: 102, title: 'MR', notes: [{ note: 'Note 1', author_id: 102 }, { note: 'Note 2', author_id 103 } }] }
If none of the user IDs is found, NdJsonPipeline#transform could create an Import::SourceUser with the source_user_identifies 101, 102, and 103 and the source user would be mapped to the ImportUser.
After creating all missing Import::SourceUser, RelationFactory is executed and the new SourceUsersMapper is provided. This mapper returns the corresponding user in the destination mapped to the source_user_identifier.
For example, SourceUsersMapper#map[101]
returns the user ID of the user mapped to an Import::SourceUser with the source_user_identifier 101. If the Import::SourceUser was reassigned to a real user, SourceUsersMapper#map[101]
will return the user ID of the real user. Otherwise, it returns the user ID of a placeholder user.
Therefore, RelationFactory will build the relation object will all user IDs set to the appropriate user in the destination. Besides, for every built object, the attribute source_user_references
will hold a map of the source user IDs, which later is used to push the placeholder user references.
For instance, consider a hash-like relation:
{ iid: 1, author_id: 101, merged_by: 102, title: 'MR', notes: [{ note: 'Note 1', author_id: 102 }, { note: 'Note 2', author_id 103 } }] }
RelationFactory will return an object like:
MergeRequest.new(
title: 'MR'
author_id: 1
merged_by: 2,
source_user_references: { 'author_id' => 101, 'merged_by' => 102 },
notes: [
Note.new(
note: 'Note 1',
author_id: 2,
source_user_references: { 'author_id' => 102 }
),
Note.new(
note: 'Note 2',
author_id: 3,
source_user_references: { 'author_id' => 103 }
)
]
)
Later, it is persisted by ObjectBuilder as before in the NdJsonPipeline#load method.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Before | After |
---|---|
How to set up and validate locally
- Enable the feature flags:
importer_user_mapping
andbulk_import_importer_user_mapping
Feature.enable(:member_areas_of_focus) Feature.enable(:bulk_import_importer_user_mapping)
- New group, import group
- Provide a host user and access token
- Select the group to be imported
- Check if placeholder users were created and contributions assigned to them
Merge request reports
Activity
assigned to @rodrigo.tomonari
added pipelinetier-3 label
7 Warnings This merge request is quite big (1658 lines changed), please consider splitting it into multiple merge requests. 00c5465a: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines. 565d3f6e: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines. a90c5e73: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines. f06a85e8: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines. 0a74cbf1: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines. 282aa0f6: Commits that change 30 or more lines across at least 3 files should describe these changes in the commit body. For more information, take a look at our Commit message guidelines. 1 Message CHANGELOG missing: If this merge request needs a changelog entry, add the
Changelog
trailer to the commit message you want to add to the changelog.If this merge request doesn't need a CHANGELOG entry, feel free to ignore this message.
Reviewer roulette
Category Reviewer Maintainer backend @marcogreg
(UTC+8, 11 hours ahead of author)
@alejandro
(UTC-4, 1 hour behind author)
~"Verify" Reviewer review is optional for ~"Verify" @stanhu
(UTC-7, 4 hours behind author)
Please refer to documentation page for guidance on how you can benefit from the Reviewer Roulette, or use the GitLab Review Workload Dashboard to find other available reviewers.
If needed, you can retry the
danger-review
job that generated this comment.Generated by
DangerEdited by Ghost User85 85 ) 86 86 bulk_import.create_configuration!(credentials.slice(:url, :access_token)) 87 87 88 if Feature.enabled?(:importer_user_mapping, current_user) 89 ::Import::BulkImports::EphemeralData.new(bulk_import.id).enable_importer_user_mapping 90 end - Comment on lines +88 to +90
Storing the feature flag status in Redis to ensure that enabling or disabling the feature flag does not impact in-flight migrations.
On the other importers, we can save the information in the database as we have a table to store such information.
changed this line in version 2 of the diff
Generated bygitlab_quality-test_tooling
.
Slow tests detected in this merge request. These slow tests might be related to this merge request's changes.Click to expand
Job File Name Duration Expected duration #7422352764 ee/spec/models/epic_spec.rb#L1030
Epic relative positioning there is a parent behaves like a class that supports relative positioning .move_nulls_to_end can move many nulls 52.77 s < 45.4 s #7432864282 ee/spec/models/epic_spec.rb#L1030
Epic relative positioning there is a parent behaves like a class that supports relative positioning .move_nulls_to_end can move many nulls 45.47 s < 45.4 s #7441716224 ee/spec/models/epic_spec.rb#L1030
Epic relative positioning there is a parent behaves like a class that supports relative positioning .move_nulls_to_end can move many nulls 46.56 s < 45.4 s Edited by Ghost User- A deleted user
added rspec:slow test detected label
changed milestone to %17.3
mentioned in issue #443557 (closed)
12 12 end 13 13 14 def find_or_create_internal_user(source_name:, source_username:, source_user_identifier:) 15 @source_name = source_name 16 @source_username = source_username 17 @source_user_identifier = source_user_identifier 14 def find_source_user(source_user_identifier) 15 ::Import::SourceUser.find_source_user( 16 source_user_identifier: source_user_identifier, 17 namespace: namespace, 18 source_hostname: source_hostname, 19 import_type: import_type 20 ) 21 end 22 23 def find_or_create_source_user(source_name:, source_username:, source_user_identifier:, use_import_user: false) I changed the class to return the Import::SourceUser instead of the internal user, as we usually need access to the source user.
I need to check if it's possible to keep returning the internal_user
Edited by Rodrigo Tomonarichanged this line in version 13 of the diff
added 901 commits
-
92106cda...9d669d32 - 899 commits from branch
master
- 4335eba3 - Integrate improved user mapping with Direct Transfer
- 282aa0f6 - Refactor code and add some specs
-
92106cda...9d669d32 - 899 commits from branch
- A deleted user
added feature flag label
E2E Test Result Summary
allure-report-publisher
generated test report!e2e-test-on-gdk:
test report for 00c5465aexpand test summary
+------------------------------------------------------------------+ | suites summary | +-------------+--------+--------+---------+-------+-------+--------+ | | passed | failed | skipped | flaky | total | result | +-------------+--------+--------+---------+-------+-------+--------+ | Release | 5 | 0 | 0 | 0 | 5 | ✅ | | Verify | 44 | 0 | 2 | 0 | 46 | ✅ | | Create | 127 | 0 | 12 | 0 | 139 | ✅ | | Govern | 71 | 0 | 0 | 0 | 71 | ✅ | | Plan | 70 | 0 | 0 | 0 | 70 | ✅ | | Data Stores | 31 | 0 | 1 | 0 | 32 | ✅ | | Manage | 0 | 0 | 2 | 0 | 2 | ➖ | | Package | 16 | 0 | 15 | 0 | 31 | ✅ | | Analytics | 2 | 0 | 0 | 0 | 2 | ✅ | | Monitor | 8 | 0 | 0 | 0 | 8 | ✅ | | Fulfillment | 2 | 0 | 0 | 0 | 2 | ✅ | | Secure | 2 | 0 | 0 | 0 | 2 | ✅ | +-------------+--------+--------+---------+-------+-------+--------+ | Total | 378 | 0 | 32 | 0 | 410 | ✅ | +-------------+--------+--------+---------+-------+-------+--------+
e2e-package-and-test:
test report for 00c5465aexpand test summary
+-------------------------------------------------------------+ | suites summary | +--------+--------+--------+---------+-------+-------+--------+ | | passed | failed | skipped | flaky | total | result | +--------+--------+--------+---------+-------+-------+--------+ | Manage | 6 | 0 | 14 | 2 | 20 | ✅ | +--------+--------+--------+---------+-------+-------+--------+ | Total | 6 | 0 | 14 | 2 | 20 | ✅ | +--------+--------+--------+---------+-------+-------+--------+
Edited by Ghost User164 # Creates an Import::SourceUser objects for each source_user_identifier 165 # found in the relation_hash and associate it with the ImportUser. 166 def create_import_source_users(_relation_key, relation_hash) 167 relation_factory::USER_REFERENCES.each do |reference| 168 next unless relation_hash[reference] 169 170 source_user_mapper.find_or_create_source_user( 171 source_name: nil, 172 source_username: nil, 173 source_user_identifier: relation_hash[reference], 174 use_import_user: true 175 ) 176 end 177 end 178 179 # rubocop:disable GitlabSecurity/PublicSend -- only methods in the relation_definition are called A disabled RuboCop security rule was detected. Please try to avoid this if at all possible, otherwise an AppSec review is recommended.
This AppSec automation is currently under testing. Use appsec-sasthelpful or appsec-sastunhelpful for quick feedback. For any detailed feedback, add a comment here.
- Resolved by Rodrigo Tomonari