Improve the speed of SELECT .. FROM ci_trigger_requests WHERE trigger_id = .. ORDER BY id DESC LIMIT 1
Discussed in #33654 (closed) and https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8141
Solution proposed + analysis on production and with Joe bot: https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8141#note_230953342
The query
SELECT "ci_trigger_requests".*
FROM "ci_trigger_requests"
WHERE "ci_trigger_requests"."trigger_id" = XXX
ORDER BY "ci_trigger_requests"."id" DESC
LIMIT 1
Has bad execution plan -- we need to hit ~2M pages in the buffer pool to fetch just a single record (checked on GitLab.com production):
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..165.28 rows=1 width=78) (actual time=1486.453..1486.454 rows=1 loops=1)
Buffers: shared hit=2115978
-> Index Scan Backward using ci_trigger_requests_pkey on ci_trigger_requests (cost=0.43..168642.14 rows=1023 width=78) (actual time=1486.452..1486.452 rows=1 loops=1)
Filter: (trigger_id = MASKED)
Rows Removed by Filter: 2322519
Buffers: shared hit=2115978
Planning time: 0.105 ms
Execution time: 1486.475 ms
(8 rows)
Stats:
Shared buffers:
- hits: 1963853 (~15.00 GiB) from the buffer pool
- reads: 5974 (~46.70 MiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
How to solve it: replace index "index_ci_trigger_requests_on_trigger_id" btree (trigger_id)
with extended one, "index_ci_trigger_requests_on_trigger_id_id_desc" btree (trigger_id, id desc)
. After this:
Limit (cost=0.43..1.76 rows=1 width=77) (actual time=0.059..0.060 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using index_ci_trigger_requests_on_trigger_id_id_desc on public.ci_trigger_requests (cost=0.43..1809.79 rows=1357 width=77) (actual time=0.058..0.058 rows=1 loops=1)
Index Cond: (ci_trigger_requests.trigger_id = MASKED)
Buffers: shared hit=4
Shared buffers:
- hits: 4 (~32.00 KiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
/cc @gl-database
Edited by Nikolay Samokhvalov