GitHub importer: Keep track of internal_ids
What does this MR do?
This MR adds callbacks to the GitHub importer to keep track of the greatest value given out for any of Issue, MergeRequest and Milestone.
Are there points in the code the reviewer needs to double check?
Getting this consistent for Milestone bulk inserts is a bit tricky. See https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/20926/diffs?commit_id=e788018946576edd50eea2fe6d4fafeafa902615.
We cannot record the greatest iid after a batch was inserted because we risk inconsistency (we're not in a transaction scope here, hence if we fail to track the greatest iid, the state is inconsistent).
I've added a callback that gets called right before each slice is inserted. The alternative solution would be to iterate over the milestones twice, but I wanted to avoid this. Is this ok or too complicated?
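To make the ordering concrete, here is a minimal in-memory sketch of the idea, not the code from this MR. The names InternalIdTracker, bulk_insert and pre_hook are hypothetical, and an array stands in for the database table. The point it illustrates: the greatest iid of a slice is recorded before the slice is written, so a failure between the two steps can at worst leave the counter ahead of the data (a gap), never behind it.

```ruby
# Hypothetical stand-in for the internal id record; not GitLab's actual class.
class InternalIdTracker
  attr_reader :last_value

  def initialize
    @last_value = 0
  end

  # Only ever moves the counter forward.
  def track_greatest(value)
    @last_value = [@last_value, value].max
  end
end

# Hypothetical bulk insert: writes rows in slices and calls pre_hook
# with each slice right before that slice is written.
def bulk_insert(rows, table, batch_size:, pre_hook: nil)
  rows.each_slice(batch_size) do |slice|
    pre_hook&.call(slice) # track first ...
    table.concat(slice)   # ... then "insert" (stands in for the real INSERT)
  end
end

tracker = InternalIdTracker.new
table = []
milestones = [{ iid: 1 }, { iid: 5 }, { iid: 3 }]

bulk_insert(
  milestones, table, batch_size: 2,
  pre_hook: ->(slice) { tracker.track_greatest(slice.map { |row| row[:iid] }.max) }
)
```

With this shape the milestones are walked only once; the hook sees each slice as it goes by, which is what avoids the second iteration.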
Why was this MR needed?
The internal id scheme works consistently only if callbacks are enabled whenever a relevant model is inserted. For the GitHub importer, callbacks are disabled and we insert directly into the database for performance reasons. Hence, the internal id scheme can become inconsistent. In the worst case, this leads to an unrecoverable state that prevents the creation of new model instances.
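The failure mode can be illustrated with a small sketch. This is an assumption-laden model, not GitLab's implementation: IidCounter is a hypothetical counter that hands out the last value plus one, and a plain array stands in for the iids already present in the table.

```ruby
# Hypothetical counter that hands out last_value + 1; not GitLab's actual class.
class IidCounter
  def initialize
    @last_value = 0
  end

  def next_value
    @last_value += 1
  end

  # Only ever moves the counter forward.
  def track_greatest(value)
    @last_value = [@last_value, value].max
  end
end

counter = IidCounter.new
used_iids = []

# The importer writes rows directly, so the counter never hears about them.
used_iids.concat([1, 2, 3])

# The next regular creation asks the counter and gets an iid that is taken.
stale_candidate = counter.next_value
collision = used_iids.include?(stale_candidate)

# Tracking the greatest imported iid brings the counter back in line.
counter.track_greatest(used_iids.max)
fresh_candidate = counter.next_value
```

In this model the stale candidate collides with an imported row, while the candidate handed out after tracking does not.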
Does this MR meet the acceptance criteria?
- Changelog entry added, if necessary
- Tests added for this feature/bug
- Conforms to the code review guidelines
  - Has been reviewed by a Backend maintainer
  - Has been reviewed by a Database specialist
- Conforms to the merge request performance guidelines
- Conforms to the style guides
- Conforms to the database guides
- If you have multiple commits, please combine them into a few logically organized commits by squashing them
- End-to-end tests pass (package-and-qa manual pipeline job)
What are the relevant issue numbers?
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/49754.