PoC: Use `WHERE id IN (0, 1, ... )` instead of `WHERE id BETWEEN X AND Y` in background migrations

What does this MR do?

This addresses a potential problem with the queue_background_migration_jobs_by_range_at_intervals method. For each batch, this method 1) selects MIN(id) and MAX(id), and 2) passes the min id and max id to a background worker. Later, each background worker runs another SELECT with WHERE id BETWEEN X AND Y.

But if the target rows are widely spread (e.g. min id is 100,000 and max id is 100,000,000), the query in a background worker becomes WHERE id BETWEEN 100,000 AND 100,000,000, which will likely hit the statement timeout.
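To make the problem concrete, here is a minimal plain-Ruby sketch of how a batch collapses into a (min, max) pair. It does not touch the database or the real migration helper; `range_batches` and the sample ids are hypothetical and only illustrate the range width.

```ruby
# Hypothetical helper (not part of GitLab): reduces each batch of target ids
# to the (min, max) pair that would be handed to a background worker.
def range_batches(ids, batch_size)
  ids.sort.each_slice(batch_size).map { |slice| [slice.first, slice.last] }
end

# Illustrative data: four target rows spread across a very large id space.
sparse_ids = [100_000, 2_000_000, 55_000_000, 100_000_000]

range_batches(sparse_ids, 4).each do |min_id, max_id|
  width = max_id - min_id + 1
  puts "WHERE id BETWEEN #{min_id} AND #{max_id} (range spans #{width} ids, 4 target rows)"
end
```

The batch contains only four rows, yet the worker's range spans the whole gap between the smallest and largest id.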

My proposal is to collect the list of ids for each background migration job and have the job execute a query with WHERE id IN (X, ..., Y) instead.
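A rough sketch of the idea, again in plain Ruby with no database access. `id_list_batches` and `where_in_clause` are hypothetical names used for illustration, not the helpers added in this MR.

```ruby
# Hypothetical helper: splits the target ids into explicit per-job id lists
# instead of reducing each batch to a (min, max) pair.
def id_list_batches(ids, batch_size)
  ids.sort.each_slice(batch_size).to_a
end

# Hypothetical helper: builds the WHERE clause a worker would run.
# Integer() raises on non-integer input, so only integers are interpolated.
def where_in_clause(ids)
  "WHERE id IN (#{ids.map { |id| Integer(id) }.join(', ')})"
end

# Illustrative data: the same kind of sparse id distribution as in the description.
sparse_ids = [100_000, 2_000_000, 55_000_000, 100_000_000]

id_list_batches(sparse_ids, 2).each do |batch|
  puts where_in_clause(batch)
end
```

Each query now names exactly the rows in its batch, regardless of how far apart the ids are. One trade-off to keep in mind is that the job arguments grow with the batch size, since every id is passed to the worker.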

Does this MR meet the acceptance criteria?

  • Tests added for this feature/bug
  • Conforms to the code review guidelines
    • Has been reviewed by a Backend maintainer
    • Has been reviewed by a Database specialist

What are the relevant issue numbers?

This MR is on top of https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18615

Edited by Shinya Maeda