Draft: Add user mapping to github take 2
What does this MR do and why?
This adds the user mapping feature to the GitHub importer. It is still a draft because user mapping in the GitHub importer has several parts, and this MR does not implement all of them:
- Updating the user finder to find or create an `import_source_user` on import.
  This MR has a version of this implemented, but it doesn't use the existing GH user finder/mapper cache. @.luke has commented below with a possible solution and will be working on that. (Update: we don't need the old caching as it's for old user mapping.)
- Pushing placeholder references for every record imported. These references will indicate which `user_id` or `author_id` belongs to which `import_source_user` so that when placeholder users are re-assigned, we know which `user_id`s and `author_id`s to update.
- Loading the placeholder references upon each stage completion. This saves references to the `import_source_user_placeholder_references` table on the DB.
- Ensure the reference store is finalized before finishing the import.
- Adding a Feature flag for all work to be behind.
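The push, load, and finalize steps listed above can be pictured with a small in-memory sketch. Everything here is hypothetical: `PlaceholderReference`, `InMemoryReferenceStore`, and their methods are illustrative stand-ins and are not the classes used in this MR or in GitLab.

```ruby
# Hypothetical stand-ins for illustration only; none of these names come from the MR.
PlaceholderReference = Struct.new(:model, :record_id, :user_column, :source_user_id, keyword_init: true)

class InMemoryReferenceStore
  attr_reader :pending, :loaded

  def initialize
    @pending = [] # references pushed while records are being imported
    @loaded = []  # stand-in for rows saved to the references table
  end

  # Called by an importer for every record it creates.
  def push(reference)
    @pending << reference
  end

  # Called when a stage completes: persist everything pushed so far.
  def load!
    @loaded.concat(@pending)
    @pending.clear
  end

  # The import should only finish once nothing is left to load.
  def finalized?
    @pending.empty?
  end

  # On reassignment, these are the rows whose user columns need updating.
  def references_for(source_user_id)
    @loaded.select { |ref| ref.source_user_id == source_user_id }
  end
end

store = InMemoryReferenceStore.new
store.push(PlaceholderReference.new(model: "Issue", record_id: 1, user_column: :author_id, source_user_id: 10))
store.push(PlaceholderReference.new(model: "Note", record_id: 7, user_column: :author_id, source_user_id: 10))
store.load!
```

After `load!`, `store.references_for(10)` returns both references, which is the information a reassignment step would need to know which `author_id` columns to rewrite.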
This MR has changes for:
- Adding a feature flag.
- Ensuring the reference store is finalized
- Loading references at the end of each stage
- Pushing references in the following importers:
  - issue_importer
  - note_importer
  - pull_request_importer
  - pull_requests:
    - merged_by_importer
    - review_importer
  - diff_note_importer
  - events:
    - base_importer
    - changed_assignee
    - changed_label
    - changed_milestone
    - changed_reviewer
    - closed
    - cross_referenced
    - renamed
    - reopened
Normally the push happens using the record, reference, and source author or user_id. This uses a `push_placeholder_references` method.
When records are created using `legacy_bulk_insert`, an array of ids for each row created is returned. This is done when creating notes, so a `push_placeholder_note_refs_by_ids` method is used there.
When records require a composite key instead of a numeric key, `push_placeholder_ref_with_composite_key` is used.
All these methods are in the newly created `Gitlab::GithubImport::PushPlaceholderReferences` module.
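As a rough illustration of how the three push styles differ, here is a self-contained sketch. Only the three method names come from the description above; the argument lists, the `PushReferencesSketch` module, the `STORE` array, the `Issue` struct, and the `"IssueAssignee"` example are assumptions made for this sketch and may not match the MR.

```ruby
# Illustrative only: argument lists and the STORE array are assumptions,
# not the signatures used in Gitlab::GithubImport::PushPlaceholderReferences.
module PushReferencesSketch
  STORE = []

  # Common case: one record with a numeric id and one user column.
  def push_placeholder_references(record, user_column:, source_user_id:)
    STORE << { model: record.class.name, numeric_key: record.id,
               user_column: user_column, source_user_id: source_user_id }
  end

  # Bulk-inserted notes: only the ids of the created rows are available.
  def push_placeholder_note_refs_by_ids(ids, source_user_ids)
    ids.zip(source_user_ids).each do |id, source_user_id|
      STORE << { model: "Note", numeric_key: id,
                 user_column: :author_id, source_user_id: source_user_id }
    end
  end

  # Records without a numeric primary key are addressed by a hash of columns.
  def push_placeholder_ref_with_composite_key(model_name, composite_key, user_column:, source_user_id:)
    STORE << { model: model_name, composite_key: composite_key,
               user_column: user_column, source_user_id: source_user_id }
  end
end

Issue = Struct.new(:id) # hypothetical record type

class SketchImporter
  include PushReferencesSketch
end

importer = SketchImporter.new
importer.push_placeholder_references(Issue.new(1), user_column: :author_id, source_user_id: 10)
importer.push_placeholder_note_refs_by_ids([5, 6], [10, 11])
importer.push_placeholder_ref_with_composite_key(
  "IssueAssignee", { issue_id: 1, user_id: 42 }, user_column: :user_id, source_user_id: 10
)
```

The point of the sketch is the shape of each reference: a numeric key when the record has an id, a list of ids when rows come back from a bulk insert, and a hash of columns when no single numeric key exists.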
Please note: not all of the push additions are working as expected when the placeholder user is reassigned, so it's likely the implementation is incorrect.
The requested_reviewer_importer has not been updated as it uses a `bulk_insert` method which creates the records, bundles them, then uses `legacy_bulk_insert` to save them in batches to the DB. Nesting `legacy_bulk_insert` in this way means it's tricky to push references by ids. I think this importer will need to write its own, basically identical, implementation of `bulk_insert`, but one which pushes the references by ids once each batch is saved.
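One possible shape for such a batch-aware insert is sketched below. It is not the importer's code: `FakeTable`, `insert_in_batches_with_references`, and the plain `references` array are hypothetical, and `FakeTable#bulk_insert` only imitates the behaviour described above, where the bulk insert returns the ids of the rows it created.

```ruby
# FakeTable is a hypothetical stand-in for the database. Its bulk_insert
# returns the ids of the new rows, as the description says legacy_bulk_insert does.
class FakeTable
  attr_reader :rows

  def initialize
    @next_id = 1
    @rows = {}
  end

  def bulk_insert(rows)
    rows.map do |row|
      id = @next_id
      @next_id += 1
      @rows[id] = row
      id
    end
  end
end

# Saves rows in batches and pushes one reference per created row
# as soon as the batch that contains it has been saved.
def insert_in_batches_with_references(table, rows, source_user_ids, batch_size:, references:)
  rows.zip(source_user_ids).each_slice(batch_size) do |slice|
    batch_rows = slice.map(&:first)
    batch_source_users = slice.map(&:last)

    ids = table.bulk_insert(batch_rows)

    ids.zip(batch_source_users).each do |id, source_user_id|
      references << { numeric_key: id, source_user_id: source_user_id }
    end
  end
end

table = FakeTable.new
references = []
rows = [{ user_id: 1 }, { user_id: 1 }, { user_id: 2 }]
insert_in_batches_with_references(table, rows, [10, 10, 11], batch_size: 2, references: references)
```

Keeping the source user ids paired with their rows before slicing is what lets each batch's returned ids be matched back to the right source user.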
- An update to the user finder which creates placeholder and source users, and against which QA testing can be done, but which needs improving.
Also Note: Not all specs are updated. I've been focused on getting the wires to connect and didn't want to spend time on specs before I knew that the implementation was actually working.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
Before | After |
---|---|
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
Related to #466355