Related issues follow-up - listing query improvements
As discussed on https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/1719, preferably, we should use IssuesFinder whenever we filter issues an user can see. But, as we can see on https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/1719 description, we couldn't reach a reasonably great result till now as IssuesFinder tend to generate massive queries to work with (when not filtering by project_id or group_id for example).
We've tried the following:
- Passing the equivalent of
related_issues.select(:project_id)toIssuesFinderas a newproject_idsparam. - Passing the IDs in a new
idsparam toIssuesFinder, and it was a lot slower (several seconds). - Plucking the IDs and passing them to the finder - we weren't going to do this 'for real', but just to try - and that was also slow.
- Removing
UNIONonProjectsFinder#by_idsand changing the query to something likeproject_ids_relation.where(id: items.select(:id))(inverting query) - Slightly worse results
Probably there's room for improvement when passing :project_ids_relation param to ProjectsFinder.
Here's the query it's currently generating on IssueLinks::ListService:
SELECT "issues".* FROM (SELECT issues.*, issue_links.id AS issue_links_id
FROM issue_links, issues
WHERE (issue_links.source_id = issues.id AND issue_links.target_id = 2500240)
OR (issue_links.target_id = issues.id AND issue_links.source_id = 2500240)) issues WHERE "issues"."deleted_at" IS NULL AND "issues"."id" IN (SELECT "issues"."id" FROM "issues" INNER JOIN "projects" ON "projects"."id" = "issues"."project_id" AND "projects"."pending_delete" = 'f' WHERE "issues"."deleted_at" IS NULL AND (
issues.confidential IS NOT TRUE
OR (issues.confidential = TRUE
AND (issues.author_id = 64
OR EXISTS (SELECT TRUE FROM issue_assignees WHERE user_id = 64 AND issue_id = issues.id)
OR issues.project_id IN(SELECT "projects"."id" FROM "projects" INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id" WHERE "projects"."pending_delete" = 'f' AND "project_authorizations"."user_id" = 64 AND (project_authorizations.access_level >= 20) ORDER BY "projects"."id" DESC)))) AND "projects"."pending_delete" = 'f' AND (projects.id IN (SELECT "projects"."id" FROM "projects" INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id" WHERE "projects"."pending_delete" = 'f' AND "project_authorizations"."user_id" = 64 AND "projects"."id" IN (SELECT "issues"."project_id" FROM "issues" WHERE "issues"."deleted_at" IS NULL AND (
issues.confidential IS NOT TRUE
OR (issues.confidential = TRUE
AND (issues.author_id = 64
OR EXISTS (SELECT TRUE FROM issue_assignees WHERE user_id = 64 AND issue_id = issues.id)
OR issues.project_id IN(SELECT "projects"."id" FROM "projects" INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id" WHERE "projects"."pending_delete" = 'f' AND "project_authorizations"."user_id" = 64 AND (project_authorizations.access_level >= 20) ORDER BY "projects"."id" DESC)))))
UNION
SELECT "projects"."id" FROM "projects" WHERE "projects"."visibility_level" IN (20, 10) AND "projects"."id" IN (SELECT "issues"."project_id" FROM "issues" WHERE "issues"."deleted_at" IS NULL AND (
issues.confidential IS NOT TRUE
OR (issues.confidential = TRUE
AND (issues.author_id = 64
OR EXISTS (SELECT TRUE FROM issue_assignees WHERE user_id = 64 AND issue_id = issues.id)
OR issues.project_id IN(SELECT "projects"."id" FROM "projects" INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id" WHERE "projects"."pending_delete" = 'f' AND "project_authorizations"."user_id" = 64 AND (project_authorizations.access_level >= 20) ORDER BY "projects"."id" DESC)))))))) ORDER BY "issues"."issue_links_id" ASC;
And local explain analyze with a good amount of data (See https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/1719 description):
Nested Loop Semi Join (cost=4461474.41..11345373.65 rows=1 width=379) (actual time=990.279..990.279 rows=0 loops=1)
Join Filter: (issues.id = issues_1.id)
Rows Removed by Join Filter: 720
-> Nested Loop (cost=9.30..23209.53 rows=12 width=379) (actual time=0.036..234.170 rows=6 loops=1)
-> Index Scan using issue_links_pkey on issue_links (cost=0.42..23007.01 rows=12 width=12) (actual time=0.017..234.086 rows=6 loops=1)
Filter: ((target_id = 2500240) OR (source_id = 2500240))
Rows Removed by Filter: 599973
-> Bitmap Heap Scan on issues (cost=8.88..16.87 rows=1 width=375) (actual time=0.009..0.010 rows=1 loops=6)
Recheck Cond: ((issue_links.source_id = id) OR (issue_links.target_id = id))
Filter: ((deleted_at IS NULL) AND (((issue_links.source_id = id) AND (issue_links.target_id = 2500240)) OR ((issue_links.target_id = id) AND (issue_links.source_id = 2500240))))
Rows Removed by Filter: 1
Heap Blocks: exact=6
-> BitmapOr (cost=8.88..8.88 rows=2 width=0) (actual time=0.007..0.007 rows=0 loops=6)
-> Bitmap Index Scan on issues_pkey (cost=0.00..4.44 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops=6)
Index Cond: (issue_links.source_id = id)
-> Bitmap Index Scan on issues_pkey (cost=0.00..4.44 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=6)
Index Cond: (issue_links.target_id = id)
-> Materialize (cost=4461465.11..11322163.57 rows=3 width=4) (actual time=125.965..126.000 rows=120 loops=6)
-> Nested Loop (cost=4461465.11..11322163.56 rows=3 width=4) (actual time=755.788..755.930 rows=120 loops=1)
-> Nested Loop (cost=4461450.03..4461466.54 rows=2 width=8) (actual time=755.778..755.799 rows=4 loops=1)
-> Unique (cost=4461449.60..4461449.61 rows=2 width=4) (actual time=755.768..755.769 rows=4 loops=1)
-> Sort (cost=4461449.60..4461449.61 rows=2 width=4) (actual time=755.766..755.766 rows=4 loops=1)
Sort Key: projects_1.id
Sort Method: quicksort Memory: 25kB
-> Append (cost=15.94..4461449.59 rows=2 width=4) (actual time=74.454..755.754 rows=4 loops=1)
-> Nested Loop Semi Join (cost=15.94..46.60 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=1)
-> Nested Loop (cost=0.85..14.65 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=1)
-> Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on project_authorizations (cost=0.43..6.20 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=1)
Index Cond: (user_id = 64)
Heap Fetches: 0
-> Index Scan using projects_pkey on projects projects_1 (cost=0.43..8.45 rows=1 width=4) (never executed)
Index Cond: (id = project_authorizations.project_id)
Filter: (NOT pending_delete)
-> Index Scan using index_issues_on_project_id_and_iid on issues issues_2 (cost=15.08..3424258.26 rows=609025 width=4) (never executed)
Index Cond: (project_id = projects_1.id)
Filter: ((deleted_at IS NULL) AND ((confidential IS NOT TRUE) OR (confidential AND ((author_id = 64) OR (alternatives: SubPlan 4 or hashed SubPlan 5) OR (hashed SubPlan 6)))))
SubPlan 4
-> Seq Scan on issue_assignees issue_assignees_2 (cost=0.00..5.60 rows=1 width=0) (never executed)
Filter: ((user_id = 64) AND (issue_id = issues_2.id))
SubPlan 5
-> Seq Scan on issue_assignees issue_assignees_3 (cost=0.00..5.00 rows=1 width=4) (never executed)
Filter: (user_id = 64)
SubPlan 6
-> Nested Loop (cost=0.85..14.65 rows=1 width=4) (never executed)
-> Index Only Scan Backward using index_project_authorizations_on_user_id_project_id_access_level on project_authorizations project_authorizations_2 (cost=0.43..6.20 rows=1 width=4) (never executed)
Index Cond: ((user_id = 64) AND (access_level >= 20))
Heap Fetches: 0
-> Index Scan using projects_pkey on projects projects_4 (cost=0.43..8.45 rows=1 width=4) (never executed)
Index Cond: (id = project_authorizations_2.project_id)
Filter: (NOT pending_delete)
-> Nested Loop Semi Join (cost=15.51..4461402.97 rows=1 width=4) (actual time=74.447..755.746 rows=4 loops=1)
-> Index Scan using index_projects_on_visibility_level on projects projects_2 (cost=0.43..13446.75 rows=214799 width=4) (actual time=0.015..131.133 rows=220004 loops=1)
Index Cond: (visibility_level = ANY ('{20,10}'::integer[]))
-> Index Scan using index_issues_on_project_id_and_iid on issues issues_3 (cost=15.08..3424258.94 rows=609025 width=4) (actual time=0.002..0.002 rows=0 loops=220004)
Index Cond: (project_id = projects_2.id)
Filter: ((deleted_at IS NULL) AND ((confidential IS NOT TRUE) OR (confidential AND ((author_id = 64) OR (alternatives: SubPlan 7 or hashed SubPlan 8) OR (hashed SubPlan 9)))))
SubPlan 7
-> Seq Scan on issue_assignees issue_assignees_4 (cost=0.00..5.60 rows=1 width=0) (never executed)
Filter: ((user_id = 64) AND (issue_id = issues_3.id))
SubPlan 8
-> Seq Scan on issue_assignees issue_assignees_5 (cost=0.00..5.00 rows=1 width=4) (never executed)
Filter: (user_id = 64)
SubPlan 9
-> Nested Loop (cost=0.85..14.65 rows=1 width=4) (never executed)
-> Index Only Scan Backward using index_project_authorizations_on_user_id_project_id_access_level on project_authorizations project_authorizations_3 (cost=0.43..6.20 rows=1 width=4) (never executed)
Index Cond: ((user_id = 64) AND (access_level >= 20))
Heap Fetches: 0
-> Index Scan using projects_pkey on projects projects_5 (cost=0.43..8.45 rows=1 width=4) (never executed)
Index Cond: (id = project_authorizations_3.project_id)
Filter: (NOT pending_delete)
-> Index Scan using projects_pkey on projects (cost=0.43..8.45 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=4)
Index Cond: (id = projects_1.id)
Filter: ((NOT pending_delete) AND (NOT pending_delete))
-> Index Scan using index_issues_on_project_id_and_iid on issues issues_1 (cost=15.08..3424258.26 rows=609025 width=8) (actual time=0.006..0.025 rows=30 loops=4)
Index Cond: (project_id = projects.id)
Filter: ((deleted_at IS NULL) AND ((confidential IS NOT TRUE) OR (confidential AND ((author_id = 64) OR (alternatives: SubPlan 1 or hashed SubPlan 2) OR (hashed SubPlan 3)))))
SubPlan 1
-> Seq Scan on issue_assignees (cost=0.00..5.60 rows=1 width=0) (never executed)
Filter: ((user_id = 64) AND (issue_id = issues_1.id))
SubPlan 2
-> Seq Scan on issue_assignees issue_assignees_1 (cost=0.00..5.00 rows=1 width=4) (never executed)
Filter: (user_id = 64)
SubPlan 3
-> Nested Loop (cost=0.85..14.65 rows=1 width=4) (never executed)
-> Index Only Scan Backward using index_project_authorizations_on_user_id_project_id_access_level on project_authorizations project_authorizations_1 (cost=0.43..6.20 rows=1 width=4) (never executed)
Index Cond: ((user_id = 64) AND (access_level >= 20))
Heap Fetches: 0
-> Index Scan using projects_pkey on projects projects_3 (cost=0.43..8.45 rows=1 width=4) (never executed)
Index Cond: (id = project_authorizations_1.project_id)
Filter: (NOT pending_delete)
Planning time: 5.510 ms
Execution time: 990.752 ms
cc @smcgivern