[VSA] Use the in-operator optimization when loading records
The aggregated group-level queries for loading the records
endpoint are not very efficient due to the IN
query on group_id
. Example: https://explain.depesz.com/s/WJEa
This would be a good candidate for the in operator optimization: https://docs.gitlab.com/ee/development/database/efficient_in_operator_queries.html
How:
In the data_collector method we build a scope with the BaseQueryBuilder
class. This scope is used to calculate various metrics: median, average, count, records
When the records
method is called we can build a slightly different version of the BaseQueryBuilder
scope to use the IN query optimization.
Example:
def records_fetcher
base_query = if group_stage? && has_any_filter_params? # author_id or milestone_id or labal_ids
query # use the "original" base query
else
BaseQueryBuilder.new(stage: stage, params: params).build_in_operator_optimized_query
end
RecordsFetcher.new(stage: stage, query: base_query, params: params)
end
The build_in_operator_optimized_query
method would do something like this: https://docs.gitlab.com/ee/development/database/efficient_in_operator_queries.html#basic-usage-of-querybuilder