[VSA] Use the in-operator optimization when loading records

The aggregated group-level queries for loading the records endpoint are not very efficient due to the IN query on group_id. Example: https://explain.depesz.com/s/WJEa

This would be a good candidate for the in operator optimization: https://docs.gitlab.com/ee/development/database/efficient_in_operator_queries.html

How:

In the data_collector method we build a scope with the BaseQueryBuilder class. This scope is used to calculate various metrics: median, average, count, records

When the records method is called we can build a slightly different version of the BaseQueryBuilder scope to use the IN query optimization.

Example:

def records_fetcher
  base_query = if group_stage? && has_any_filter_params? # author_id or milestone_id or labal_ids 
    query # use the "original" base query
  else
    BaseQueryBuilder.new(stage: stage, params: params).build_in_operator_optimized_query
  end

  RecordsFetcher.new(stage: stage, query: base_query, params: params)
end

The build_in_operator_optimized_query method would do something like this: https://docs.gitlab.com/ee/development/database/efficient_in_operator_queries.html#basic-usage-of-querybuilder

Edited Nov 12, 2021 by Adam Hegyi