Optimize compliance violations query
## Problem to solve

The default query used to fetch compliance violations in the compliance report has the following issues:

- The current query plan, using a recursive CTE, shows it takes 679.987 ms and processes ~5.70 GiB to return 20 rows.
- The Elasticsearch logs show the current query using traversal IDs to be quite slow (~2.4 s to return a result).
  - Note: the `use_traversal_ids` feature flag is enabled on GitLab.com.

As a result, a database maintainer recommended we first improve the current query before adding additional filters (&7879 (closed)) that would reduce performance even further.
## Additional information

Default query parameters used in the compliance report:

- Date range: 30 days
- Sort: `SEVERITY_LEVEL_DESC`
- Limit: 20
## Recommendations

As noted by @krasio in this comment, the compliance violations query suffers from performance problems with `IN` queries:

If we create an index on `merge_requests_compliance_violations (merge_request_id, severity_level DESC, id DESC)`, we can apply the solution described in the above guide. This will only work for sorting on `severity_level DESC, id DESC`; other orders will need their own index. Not sure how easy (or even possible) it will be to combine this solution with GraphQL.
The gist of it will look something like:

```ruby
Gitlab::Pagination::Keyset::InOperatorOptimization::QueryBuilder.new(
  scope: MergeRequests::ComplianceViolation.order(order),
  array_scope: MergeRequest::Metrics
    .where(
      target_project_id: Group.find(22).all_projects.select(:id),
      merged_at: (Time.now - 1.month).beginning_of_day..Time.now.end_of_day
    )
    .select(:merge_request_id),
  array_mapping_scope: -> (id_expression) {
    MergeRequests::ComplianceViolation
      .where(MergeRequests::ComplianceViolation.arel_table[:merge_request_id].eq(id_expression))
  },
  finder_query: -> (severity_level_expression, id_expression) {
    MergeRequests::ComplianceViolation
      .where(MergeRequests::ComplianceViolation.arel_table[:id].eq(id_expression))
  }
).execute.limit(20)
```
This will produce a query like the one in https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10199/commands/36162.
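For reference, the prerequisite index could be added with a post-deployment migration along these lines. This is only a sketch: the migration class name, index name, and exact helper usage are assumptions, not part of the recommendation above.

```ruby
# Sketch only: adds the index suggested in the comment above.
# Class name and INDEX_NAME are placeholders, not final names.
class AddSeverityIndexToComplianceViolations < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  INDEX_NAME = 'index_compliance_violations_on_mr_id_severity_id'

  def up
    add_concurrent_index :merge_requests_compliance_violations,
      [:merge_request_id, :severity_level, :id],
      order: { severity_level: :desc, id: :desc },
      name: INDEX_NAME
  end

  def down
    remove_concurrent_index_by_name :merge_requests_compliance_violations, INDEX_NAME
  end
end
```

Note that each additional sort order would need its own index variant, which is one of the considerations below.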
🙀 All of this is just to improve the current query. Then we can think about applying the filter by protected branches (the goal of this MR), which may or may not work.
With the optimization, the results improve from 679.987 ms to 200.187 ms, and the data processed drops from ~5.70 GiB to ~246.10 MiB.
## Considerations

Before we consider using `InOperatorOptimization`, we should answer the following:

- Is it compatible with GraphQL resolvers?
- We'd need to create 8 indexes, one for each sort type. Is that feasible?
- Is this approach compatible with optional filters?
- Would it support additional filters, such as branch names or status (&7879 (closed)), that add additional `IN` queries?
## Implementation plan
...