Load `ORDER BY` columns only by default in optimized IN operator queries
What does this MR do and why?
This MR makes the `finder_query` parameter in the IN operator optimization (https://docs.gitlab.com/ee/development/database/efficient_in_operator_queries.html#using-the-in-query-optimization) module optional. When it is not specified, the optimization does not load `BATCH_SIZE` full rows from disk. Instead, it returns only the keyset order columns (the columns specified in the `ORDER BY` clause). Loading these columns does not cost additional buffer reads.
After this MR we'll have two strategies:
- `RecordLoaderStrategy`, original, extracted from the `QueryBuilder` class.
- `OrderValuesLoaderStrategy`, new, performs better.
`spec/lib/gitlab/pagination/keyset/in_operator_optimization/query_builder_spec.rb` covers both cases.
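The difference between the two strategies can be illustrated with a toy, in-memory sketch. This is not the GitLab implementation: `FakeTable`, `load_batch`, and the `row_reads` counter are hypothetical names introduced only for illustration, and the "index" is simulated with plain Ruby hashes. The sketch assumes that the order column values are available without touching the full rows, and that an optional finder callable triggers one extra row fetch per result.

```ruby
# Toy in-memory model of the two loading strategies. This is NOT the GitLab
# implementation; FakeTable and load_batch are hypothetical names.
class FakeTable
  attr_reader :row_reads

  def initialize(rows)
    @rows = rows.to_h { |row| [row[:id], row] } # full rows keyed by id
    @row_reads = 0                              # counts simulated row fetches
  end

  # Simulates reading only the ORDER BY columns, sorted in keyset order.
  def index_entries(order_columns)
    @rows.values
         .map { |row| row.slice(*order_columns) }
         .sort_by { |entry| entry.values_at(*order_columns) }
  end

  # Simulates loading a full row, which costs an extra read.
  def fetch_row(id)
    @row_reads += 1
    @rows.fetch(id)
  end
end

# When no finder is given, only the order values are returned (the idea behind
# OrderValuesLoaderStrategy). With a finder, each entry is resolved to a full
# row (the idea behind RecordLoaderStrategy).
def load_batch(table, order_columns:, batch_size:, finder: nil)
  entries = table.index_entries(order_columns).first(batch_size)
  return entries if finder.nil?

  entries.map { |entry| finder.call(entry) }
end

table = FakeTable.new([
  { id: 3, created_at: 10, title: "c" },
  { id: 1, created_at: 20, title: "a" },
  { id: 2, created_at: 10, title: "b" }
])

# Order values only: no full rows are fetched.
order_values = load_batch(table, order_columns: %i[created_at id], batch_size: 2)

# Full records: one simulated row fetch per returned record.
finder = ->(entry) { table.fetch_row(entry[:id]) }
records = load_batch(table, order_columns: %i[created_at id], batch_size: 2, finder: finder)
```

In this sketch, `order_values` contains only `created_at` and `id`, while `records` contains the full hashes and increments `row_reads` once per record. The real strategies operate on SQL queries, so the toy counter only mirrors the idea of avoiding extra row loads.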
Overall, this improves query performance when iterating over records in batches and only a subset of columns is needed, for example the `id` column.
- Query when full rows are loaded: https://explain.depesz.com/s/gzRs
- Query when loading ORDER BY columns: https://explain.depesz.com/s/cUc5
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.