Skip to content

Optimize StuckCiJobsWorker Pending build Queries because they timeout

Allison Browne requested to merge 291010-stuckcijobsworker-always-times-out into master

What does this MR do?

Use created_at in order to utilize an index scan to improve query performance.

Related to #291010 (closed)

Screenshots (strongly suggested)

Database

Original query

explain analyze 
SELECT * FROM "ci_builds" 
WHERE "ci_builds"."type" = 'Ci::Build' 
  AND "ci_builds"."status" = 'pending' 
  AND updated_at < '2021-05-26 18:14:29'
LIMIT 100;
 Limit  (cost=0.58..145.72 rows=100 width=1278) (actual time=0.249..66.954 rows=100 loops=1)
   ->  Index Scan using ci_builds_gitlab_monitor_metrics on ci_builds  (cost=0.58..659531.50 rows=454412 width=1278) (actual time=0.247..66.928 rows=100 loo
ps=1)
         Index Cond: ((status)::text = 'pending'::text)
         Filter: (updated_at < '2021-05-26 18:14:29'::timestamp without time zone)
 Planning Time: 0.553 ms
 Execution Time: 67.016 ms

New Query

explain 
SELECT * 
FROM "ci_builds" 
WHERE "ci_builds"."type" = 'Ci::Build' 
  AND "ci_builds"."status" = 'pending' 
  ci_builds.created_at BETWEEN '2021-05-29 14:16:10 UTC' AND '2021-06-02 14:15:16 UTC'
  AND BETWEEN '2021-05-29 14:16:10 UTC' AND '2021-06-02 14:15:16 UTC'
limit 100;
 Limit  (cost=0.58..4608.22 rows=21 width=1283) (actual time=334.841..602.886 rows=100 loops=1)
   Buffers: shared hit=1414 read=13363 dirtied=2666
   I/O Timings: read=561.583 write=0.000
   ->  Index Scan using ci_builds_gitlab_monitor_metrics on public.ci_builds  (cost=0.58..4608.22 rows=21 width=1283) (actual time=334.838..602.847 rows=100 loops=1)
         Index Cond: (((ci_builds.status)::text = 'pending'::text) AND (ci_builds.created_at >= '2021-05-29 14:16:10'::timestamp without time zone) AND (ci_builds.created_at <= '2021-06-02 14:15:16'::timestamp without time zone))
         Filter: ((ci_builds.updated_at >= '2021-05-29 14:16:10'::timestamp without time zone) AND (ci_builds.updated_at <= '2021-06-02 14:15:16'::timestamp without time zone))
         Rows Removed by Filter: 1421
         Buffers: shared hit=1414 read=13363 dirtied=2666
         I/O Timings: read=561.583 write=0.000

cold cache

Time: 624.275 ms
  - planning: 21.314 ms
  - execution: 602.961 ms
    - I/O read: 561.583 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 1414 (~11.00 MiB) from the buffer pool
  - reads: 13363 (~104.40 MiB) from the OS file cache, including disk I/O
  - dirtied: 2666 (~20.80 MiB)
  - writes: 0

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/4440/commands/15505

warm cache

Time: 17.800 ms  
  - planning: 0.589 ms  
  - execution: 17.211 ms  
    - I/O read: 0.000 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 9901 (~77.40 MiB) from the buffer pool  
  - reads: 0 from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/4440/commands/15538

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by Allison Browne

Merge request reports