Merge Request search is slow when sorting by some fields
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
Overview
We two tables relevant to this issue that store information about Merge Requests:
-
merge_requests- The bulk of the MR information, including description, project_id, created_at etc -
merge_request_metrics- Primarily timestamps of when events occurred, including merged_at and latest_closed_at
When viewing the Merge Request UI in GitLab, you have Open (default), Merged, Closed and All. The filters are pretty self explanatory and are designed to quickly filter between states while also combining a text search.
Problem
The MR pages is slow. A large part of that is the counts are slow. 4.5s the render the counts vs 300ms for the actual MR rows. There is effort to mvoe the pages to Vue which should hide some of the slowness in async calls but this does not have a due date &10827 (closed).
But the other problem is that as soon as we sort by a column that is on merge_request_metrics it becomes very slow. An easy way to see this is click Merged, which defaults to sorting by merged_at which lives on merge_request_metrics.
When this happens:
- We no longer use the
merge_requestsCTE, as it's not considered a simple sort https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/finders/issuable_finder.rb#L204 - We fall back to having to do an INNER JOIN between merge_requests and merge_request_metrics
- This also means since no CTE is used, we use an inefficient index to filter the data (for example
idx_merge_requests_on_merged_statewhich only indexes the state==3, leaving too many records to filter)
Compare the following queries:
- filter by open, with created_at (200ms) https://console.postgres.ai/gitlab/gitlab-production-main/sessions/27539/commands/85760#visualize-pev2
- filter by merged, with merged_at (2800ms) https://console.postgres.ai/gitlab/gitlab-production-main/sessions/27503/commands/85663#visualize-pev2
Suggestions
- Investigate if there is a CTE we can use for querying on these columns
- I no CTE can be used, investigate if a more efficient index can be used
- Potentially looking at moving or duplicating the affected columns back onto
merge_requests(merged_at, closed_at the priorities since they are defaults from the tabs) so we don't have to do another join.