
Load `ORDER BY` rows only by default in optimized IN operator queries

Adam Hegyi requested to merge in-operator-keyset-order-strategy into master

What does this MR do and why?

This MR makes the finder_query parameter of the IN operator optimization (https://docs.gitlab.com/ee/development/database/efficient_in_operator_queries.html#using-the-in-query-optimization) module optional. When it is not specified, the optimization does not load BATCH_SIZE rows from disk; it returns only the keyset order columns (the columns specified in the ORDER BY clause). Loading these columns does not cost additional buffer reads.

After this MR we'll have two strategies:

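  • RecordLoaderStrategy: the original behavior, extracted from the QueryBuilder class. Used when finder_query is given; loads the full rows.
  • OrderValuesLoaderStrategy: new. Used when finder_query is omitted; returns only the ORDER BY columns and performs better because it skips loading the rows.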
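
The choice between the two strategies can be illustrated with a small in-memory sketch. This is not the code in this MR: the `Sketch*` classes, `pick_strategy`, `load_batch`, and the `ROWS` data are hypothetical stand-ins, and a plain array plays the role of the table. It only shows the idea that an optional finder_query decides whether full rows or just the ORDER BY values come back.

```ruby
# Illustrative sketch only: an in-memory stand-in, not GitLab's implementation.
# ROWS plays the role of the table (hypothetical data).
ROWS = [
  { id: 1, created_at: 10, title: "a" },
  { id: 2, created_at: 20, title: "b" },
  { id: 3, created_at: 30, title: "c" }
].freeze

# Columns assumed to appear in the ORDER BY clause.
ORDER_COLUMNS = %i[created_at id].freeze

# Mirrors the idea of RecordLoaderStrategy: use finder_query to fetch the full row.
class SketchRecordLoader
  def initialize(finder_query)
    @finder_query = finder_query
  end

  def load(order_values)
    @finder_query.call(order_values)
  end
end

# Mirrors the idea of OrderValuesLoaderStrategy: return the ORDER BY values as-is,
# without any extra lookup.
class SketchOrderValuesLoader
  def load(order_values)
    order_values
  end
end

# Hypothetical selector: finder_query is optional.
def pick_strategy(finder_query = nil)
  finder_query ? SketchRecordLoader.new(finder_query) : SketchOrderValuesLoader.new
end

# Each strategy receives the ORDER BY values the keyset iteration already produced.
def load_batch(strategy)
  ROWS.map { |row| strategy.load(row.slice(*ORDER_COLUMNS)) }
end

finder = ->(values) { ROWS.find { |row| row[:id] == values[:id] } }

full_rows  = load_batch(pick_strategy(finder)) # full hashes, including :title
order_only = load_batch(pick_strategy)         # only :created_at and :id
```

In the sketch, the second call never touches the `title` column, which is the in-memory analogue of skipping the row load when only the ORDER BY columns are needed.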

spec/lib/gitlab/pagination/keyset/in_operator_optimization/query_builder_spec.rb covers both cases.

Overall, this improves query performance when iterating over records in batches and only the ORDER BY columns are needed, for example the id column.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Adam Hegyi
