Skip to content

Improve query there is specified visibility_level

Thong Kuah requested to merge projects_finder_visibility_optimization into master

What does this MR do?

Related issue: #37007 (closed)

See Analysis for why the old query was in-efficient.

For /api/v4/projects where params[:visibility] is private...

New query:

SELECT "projects".*
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE "project_authorizations"."user_id" = 3067627
  AND "projects"."visibility_level" = 0
  AND "projects"."pending_delete" = FALSE
ORDER BY "projects"."created_at" DESC,
         "projects"."id" DESC
LIMIT 20
OFFSET 0

Cost: 586.59

Time: 5.319 ms
  - planning: 4.577 ms
  - execution: 0.742 ms

Shared buffers:
  - hits: 195 (~1.50 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
Plan - https://explain.depesz.com/s/UOKD
 Limit  (cost=586.33..586.38 rows=20 width=583) (actual time=157.910..157.919 rows=20 loops=1)
   Buffers: shared dirtied=5 hit=91 read=104
   ->  Sort  (cost=586.33..586.59 rows=106 width=583) (actual time=157.909..157.912 rows=20 loops=1)
         Sort Key: projects.created_at DESC, projects.id DESC
         Sort Method: top-N heapsort  Memory: 32kB
         Buffers: shared dirtied=5 hit=91 read=104
         ->  Nested Loop  (cost=1.00..583.51 rows=106 width=583) (actual time=17.020..157.681 rows=42 loops=1)
               Buffers: shared dirtied=5 hit=85 read=104
               ->  Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on public.project_authorizations  (cost=0.56..34.31 rows=123 width=4) (actual time=12.056..25.605 rows=42 loops=1)
                     Index Cond: (project_authorizations.user_id = 3067627)
                     Heap Fetches: 8
                     Buffers: shared dirtied=1 hit=10 read=11
               ->  Index Scan using projects_pkey on public.projects  (cost=0.43..4.46 rows=1 width=583) (actual time=3.140..3.141 rows=1 loops=42)
                     Index Cond: (projects.id = project_authorizations.project_id)
                     Filter: ((NOT projects.pending_delete) AND (projects.visibility_level = 0))
                     Rows Removed by Filter: 0
                     Buffers: shared dirtied=4 hit=75 read=93

Old query:

SELECT "projects".*
FROM "projects"
WHERE (EXISTS
         (SELECT 1
          FROM "project_authorizations"
          WHERE "project_authorizations"."user_id" = 3067627
            AND (project_authorizations.project_id = projects.id))
       OR projects.visibility_level IN (10,
                                        20))
  AND "projects"."visibility_level" = 0
  AND "projects"."pending_delete" = FALSE
ORDER BY "projects"."created_at" DESC,
         "projects"."id" DESC
LIMIT 20
OFFSET 0


Cost: 45314378.30

Time: 20.621 s
  - planning: 4.258 ms
  - execution: 20.616 s
    - I/O read: 12.787 s

Shared buffers:
  - hits: 2809240 (~21.40 GiB) from the buffer pool
  - reads: 900140 (~6.90 GiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0
Plan - https://explain.depesz.com/s/6TKM
 Limit  (cost=0.56..166.53 rows=20 width=583) (actual time=484.514..20616.279 rows=20 loops=1)
   Buffers: shared hit=2809240 read=900140
   ->  Index Scan using index_projects_on_visibility_level_and_created_at_and_id on public.projects  (cost=0.56..45314378.30 rows=5460632 width=583) (actual time=484.513..20616.246 rows=20 loops=1)
         Index Cond: (projects.visibility_level = 0)
         Filter: ((NOT projects.pending_delete) AND ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (projects.visibility_level = ANY ('{10,20}'::integer[]))))
         Rows Removed by Filter: 3677338
         Buffers: shared hit=2809240 read=900140
         SubPlan 1
           ->  Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on public.project_authorizations  (cost=0.56..4.58 rows=1 width=0) (actual time=0.000..0.000 rows=0 loops=0)
                 Index Cond: ((project_authorizations.user_id = 3067627) AND (project_authorizations.project_id = projects.id))
                 Heap Fetches: 0
         SubPlan 2
           ->  Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on public.project_authorizations project_authorizations_1  (cost=0.56..34.31 rows=123 width=4) (actual time=0.044..0.100 rows=42 loops=1)
                 Index Cond: (project_authorizations_1.user_id = 3067627)
                 Heap Fetches: 8
                 Buffers: shared hit=21

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by 🤖 GitLab Bot 🤖

Merge request reports