
GitHub importer: Keep track of internal_ids

Andreas Brandl requested to merge ab-49754-gh-importer-internal-ids into master

What does this MR do?

This MR adds callbacks to the GitHub importer that keep track of the greatest internal id (iid) given out for each of Issue, MergeRequest and Milestone.

Are there points in the code the reviewer needs to double check?

Getting this consistent for Milestone bulk inserts is a bit tricky. See https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/20926/diffs?commit_id=e788018946576edd50eea2fe6d4fafeafa902615.

We cannot record the greatest iid after a batch has been inserted, because that risks inconsistency: we're not in a transaction scope here, so if tracking the greatest iid fails after the insert, the rows already exist while the tracked value lags behind.

I've added a callback that gets called right before a slice is inserted. The alternative would be to iterate over the milestones twice, but I wanted to avoid that. Is this ok or too complicated?
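To make the idea concrete, here is a minimal, self-contained sketch of a pre-insert callback on a sliced bulk insert. It is not the code from this MR: `IidTracker`, `bulk_insert`, the `batch_size` parameter and the in-memory `milestones_table` are hypothetical stand-ins chosen for illustration.

```ruby
# Hypothetical in-memory stand-in for the record that stores the greatest
# iid handed out for a model (not GitLab's actual implementation).
class IidTracker
  attr_reader :last_value

  def initialize
    @last_value = 0
  end

  # Only ever moves the tracked value forward.
  def track_greatest(iid)
    @last_value = [@last_value, iid].max
  end
end

# Hypothetical bulk insert helper: rows are written slice by slice, and the
# optional block runs right before each slice is written.
def bulk_insert(table, rows, batch_size: 2)
  rows.each_slice(batch_size) do |slice|
    yield slice if block_given?
    table.concat(slice) # stands in for the real INSERT statement
  end
end

tracker = IidTracker.new
milestones_table = []
rows = [{ iid: 1 }, { iid: 2 }, { iid: 5 }]

# Track the greatest iid of each slice before that slice is inserted.
bulk_insert(milestones_table, rows) do |slice|
  tracker.track_greatest(slice.map { |row| row[:iid] }.max)
end
```

With this ordering, a failure between the two steps leaves the tracked value ahead of the inserted rows rather than behind them. In this sketch that only means a gap in the iid sequence, and the rows are iterated once.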

Why was this MR needed?

The internal id scheme only works consistently if callbacks are enabled whenever a relevant model is inserted. In the GitHub importer, callbacks are disabled and we insert directly into the database for performance reasons. Hence, there's room for the internal id scheme to become inconsistent. In the worst case, this leads to an unrecoverable state that prevents the creation of new model instances.
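The failure mode can be illustrated with a simplified model. This is a sketch under assumptions, not GitLab's actual code: `IidGenerator` and `create_issue` are hypothetical names, and the uniqueness check stands in for whatever constraint rejects a duplicate iid.

```ruby
# Hypothetical generator that hands out iids based on a tracked value.
class IidGenerator
  def initialize
    @last_value = 0
  end

  def next_iid
    @last_value += 1
  end

  # Moves the tracked value forward to at least the given iid.
  def track_greatest(iid)
    @last_value = [@last_value, iid].max
  end
end

# Hypothetical creation path: assumes iids must be unique.
def create_issue(existing_iids, generator)
  iid = generator.next_iid
  raise ArgumentError, "iid #{iid} already taken" if existing_iids.include?(iid)

  existing_iids << iid
  iid
end

# Rows imported directly, bypassing callbacks, so nothing was tracked.
issues = [1, 2, 3]

untracked = IidGenerator.new
conflict = begin
  create_issue(issues, untracked)
  false
rescue ArgumentError
  true
end

# Same import, but the greatest iid was recorded for the generator.
tracked = IidGenerator.new
tracked.track_greatest(issues.max)
new_iid = create_issue(issues, tracked)
```

In this model the untracked generator proposes an iid that is already taken, while the tracked one continues after the greatest imported iid.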

Does this MR meet the acceptance criteria?

What are the relevant issue numbers?

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/49754.

Edited by Yorick Peterse
