Skip to content

Improve performance of public snippets API endpoint

Alex Pooley requested to merge ap-35389-api-snippets-public-personal into master

What does this MR do?

Solves issue #35389

Improve performance of /api/:version/snippets/public API and only return public personal snippets.

The majority of the performance gains were found by ignoring project snippets in the query. With project snippets we need to run a three query UNION with some inner joins. Without project snippets we can simply scan the snippets table.

There is also an index added to the snippets created_at column to improve the performance of sort ordering.

Query plans

The core query change without the created_at index:

Before
SELECT "snippets".* FROM ((SELECT "snippets".* FROM "snippets" WHERE (snippets.visibility_level IN (10,20) OR snippets.author_id = 4092171) AND "snippets"."project_id" IS NULL)
UNION
(SELECT "snippets".* FROM "snippets" INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id" INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id" WHERE (snippets.visibility_level IN (10,20) OR snippets.author_id = 4092171) AND (projects.visibility_level IN (10,20)) AND "project_features"."snippets_access_level" IN (20, 30))
UNION
(SELECT "snippets".* FROM "snippets" INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id" INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id" WHERE "project_features"."snippets_access_level" IN (20, 30, 10) AND (EXISTS (SELECT 1 FROM "project_authorizations" WHERE (project_id = snippets.project_id) AND "project_authorizations"."user_id" = 4092171)))) snippets WHERE "snippets"."visibility_level" = 20 ORDER BY created_at DESC

--- QUERY PLAN ---
Sort  (cost=93639.06..93671.54 rows=12989 width=1744) (actual time=13982.518..14011.068 rows=49992 loops=1)
  Sort Key: snippets.created_at DESC
  Sort Method: quicksort  Memory: 76120kB
  Buffers: shared hit=306865 read=80163
  I/O Timings: read=8209.162
  ->  HashAggregate  (cost=92491.81..92621.70 rows=12989 width=1744) (actual time=13851.068..13900.894 rows=49992 loops=1)
        Group Key: snippets.id, snippets.title, snippets.content, snippets.author_id, snippets.project_id, snippets.created_at, snippets.updated_at, snippets.file_name, snippets.type, snippets.visibility_level, snippets.title_html, snippets.content_html, snippets.cached_markdown_version, snippets.description, snippets.description_html
        Buffers: shared hit=306862 read=80163
        I/O Timings: read=8209.162
        ->  Append  (cost=215.80..92004.72 rows=12989 width=1744) (actual time=13.185..503.752 rows=50316 loops=1)
              Buffers: shared hit=137176 read=458
              I/O Timings: read=25.433
              ->  Bitmap Heap Scan on snippets  (cost=215.80..27777.04 rows=11193 width=1080) (actual time=13.184..178.426 rows=45568 loops=1)
                    Recheck Cond: (((project_id IS NULL) AND (visibility_level = ANY ('{10,20}'::integer[])) AND (visibility_level = 20)) OR (author_id = 4092171))
                    Filter: ((project_id IS NULL) AND (visibility_level = 20))
                    Heap Blocks: exact=21778
                    Buffers: shared hit=22002 read=1
                    I/O Timings: read=0.035
                    ->  BitmapOr  (cost=215.80..215.80 rows=11196 width=0) (actual time=9.055..9.055 rows=0 loops=1)
                          Buffers: shared hit=224 read=1
                          I/O Timings: read=0.035
                          ->  Bitmap Index Scan on index_snippets_on_project_id_and_visibility_level  (cost=0.00..208.27 rows=11193 width=0) (actual time=8.911..8.911 rows=45669 loops=1)
                                Index Cond: ((project_id IS NULL) AND (visibility_level = ANY ('{10,20}'::integer[])) AND (visibility_level = 20))
                                Buffers: shared hit=222
                          ->  Bitmap Index Scan on index_snippets_on_author_id  (cost=0.00..1.94 rows=3 width=0) (actual time=0.141..0.142 rows=0 loops=1)
                                Index Cond: (author_id = 4092171)
                                Buffers: shared hit=2 read=1
                                I/O Timings: read=0.035
              ->  Nested Loop  (cost=1.28..63710.34 rows=1795 width=1080) (actual time=0.178..256.755 rows=4386 loops=1)
                    Buffers: shared hit=99907 read=277
                    I/O Timings: read=5.823
                    ->  Nested Loop  (cost=0.85..62317.06 rows=2236 width=1084) (actual time=0.126..183.700 rows=4559 loops=1)
                          Buffers: shared hit=81643 read=277
                          I/O Timings: read=5.823
                          ->  Index Scan using index_snippets_on_visibility_level on snippets snippets_1  (cost=0.42..37304.30 rows=16283 width=1080) (actual time=0.037..91.027 rows=53072 loops=1)
                                Index Cond: (visibility_level = 20)
                                Buffers: shared hit=52242
                          ->  Index Only Scan using index_projects_on_id_partial_for_visibility on projects  (cost=0.43..1.53 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=53072)
                                Index Cond: (id = snippets_1.project_id)
                                Heap Fetches: 2902
                                Buffers: shared hit=29401 read=277
                                I/O Timings: read=5.823
                    ->  Index Scan using index_project_features_on_project_id on project_features  (cost=0.43..0.61 rows=1 width=4) (actual time=0.015..0.016 rows=1 loops=4559)
                          Index Cond: (project_id = projects.id)
                          Filter: (snippets_access_level = ANY ('{20,30}'::integer[]))
                          Rows Removed by Filter: 0
                          Buffers: shared hit=18264
              ->  Nested Loop  (cost=44.07..387.45 rows=1 width=1080) (actual time=8.335..61.760 rows=362 loops=1)
                    Buffers: shared hit=15267 read=180
                    I/O Timings: read=19.575
                    ->  Nested Loop  (cost=43.63..386.97 rows=1 width=1088) (actual time=8.313..59.885 rows=362 loops=1)
                          Buffers: shared hit=13819 read=180
                          I/O Timings: read=19.575
                          ->  Nested Loop  (cost=43.21..329.28 rows=127 width=8) (actual time=3.738..34.837 rows=1846 loops=1)
                                Buffers: shared hit=8079 read=12
                                I/O Timings: read=2.295
                                ->  HashAggregate  (cost=42.78..44.05 rows=127 width=4) (actual time=3.695..4.565 rows=1846 loops=1)
                                      Group Key: project_authorizations.project_id
                                      Buffers: shared hit=68 read=12
                                      I/O Timings: read=2.295
                                      ->  Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on project_authorizations  (cost=0.56..42.46 rows=127 width=4) (actual time=0.085..3.270 rows=1846 loops=1)
                                            Index Cond: (user_id = 4092171)
                                            Heap Fetches: 145
                                            Buffers: shared hit=68 read=12
                                            I/O Timings: read=2.295
                                ->  Index Only Scan using projects_pkey on projects projects_1  (cost=0.43..2.24 rows=1 width=4) (actual time=0.015..0.016 rows=1 loops=1846)
                                      Index Cond: (id = project_authorizations.project_id)
                                      Heap Fetches: 702
                                      Buffers: shared hit=8011
                          ->  Index Scan using index_snippets_on_project_id_and_visibility_level on snippets snippets_2  (cost=0.42..0.44 rows=1 width=1080) (actual time=0.013..0.013 rows=0 loops=1846)
                                Index Cond: ((project_id = projects_1.id) AND (visibility_level = 20))
                                Buffers: shared hit=5740 read=168
                                I/O Timings: read=17.280
                    ->  Index Scan using index_project_features_on_project_id on project_features project_features_1  (cost=0.43..0.48 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=362)
                          Index Cond: (project_id = projects_1.id)
                          Filter: (snippets_access_level = ANY ('{20,30,10}'::integer[]))
                          Buffers: shared hit=1448
Planning time: 20.592 ms
Execution time: 14021.044 ms
After
SELECT "snippets".* FROM "snippets" WHERE "snippets"."project_id" IS NULL AND "snippets"."visibility_level" = 20 order by created_at desc

--- QUERY PLAN ---
Sort  (cost=35256.72..35347.86 rows=36458 width=1080) (actual time=174.320..197.876 rows=45568 loops=1)
  Sort Key: created_at DESC
  Sort Method: quicksort  Memory: 69900kB
  Buffers: shared hit=27722
  ->  Index Scan using index_snippets_on_project_id_and_visibility_level on snippets  (cost=0.42..32494.30 rows=36458 width=1080) (actual time=0.091..110.618 rows=45568 loops=1)
        Index Cond: ((project_id IS NULL) AND (visibility_level = 20))
        Buffers: shared hit=27719
Planning time: 5.619 ms
Execution time: 205.457 ms

I will run the SQL explain on staging again once the index has been deployed.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by 🤖 GitLab Bot 🤖

Merge request reports