Add project_id, id indexes to merge_requests & issues tables

What does this MR do?

This MR adds:

  • (target_project_id, id) index on merge_requests
  • (project_id, id) index on issues

Why?

While working on !67150 (merged), it was pointed out (!67150 (comment 643170505)) that the following queries perform poorly:

  • project.merge_requests.where.not(id: [1, 2, 3]).find_each { ... }
SELECT "merge_requests".* FROM "merge_requests" WHERE "merge_requests"."target_project_id" = 1 AND "merge_requests"."id" NOT IN (1, 2, 3) ORDER BY "merge_requests"."id" ASC LIMIT 1000
  • project.issues.where.not(id: [1, 2, 3]).find_each { ... }
SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 1 AND "issues"."id" NOT IN (1, 2, 3) ORDER BY "issues"."id" ASC LIMIT 1000

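The slowness comes mainly from the ORDER BY "issues"."id" ASC LIMIT 1000 clause (and its merge_requests equivalent) that find_each adds, not from the where.not clause. A composite index on the project column followed by id lets PostgreSQL read a project's rows already ordered by id and stop after the first 1000. To resolve the performance issue and allow iterating over batches of MRs and issues, this MR adds two new indexes.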
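As a rough illustration of why a (project_id, id) index suits find_each, the sketch below models the index as a sorted array of [project_id, id] pairs and walks it in batches. It is a toy model, not PostgreSQL's or Rails' implementation; INDEX, next_batch and each_batch are hypothetical names introduced only for this example.

```ruby
# Hypothetical in-memory stand-in for a composite (project_id, id) index:
# [project_id, id] pairs kept in sorted order, as a B-tree would keep them.
INDEX = [[1, 4], [1, 9], [2, 5], [1, 2], [2, 7], [1, 12], [3, 1]].sort.freeze

# Returns up to `limit` ids of `project_id` that are greater than `after_id`
# and not listed in `excluded`, in ascending id order.
def next_batch(index, project_id, after_id:, excluded:, limit:)
  # Binary search for the first entry strictly past (project_id, after_id).
  start = index.bsearch_index { |entry| (entry <=> [project_id, after_id]) > 0 } || index.size
  batch = []
  index[start..].each do |pid, id|
    # Entries are already in id order, so we can stop as soon as the batch
    # is full or the project's range ends. No separate sort step is needed.
    break if pid != project_id || batch.size >= limit
    batch << id unless excluded.include?(id) # the where.not filter
  end
  batch
end

# Mimics the find_each pattern: fetch a batch, remember the last id,
# then ask for the next batch of rows with a greater id.
def each_batch(index, project_id, excluded: [], batch_size: 1000)
  last_id = 0
  loop do
    batch = next_batch(index, project_id, after_id: last_id, excluded: excluded, limit: batch_size)
    break if batch.empty?
    yield batch
    last_id = batch.last
  end
end

batches = []
each_batch(INDEX, 1, excluded: [4], batch_size: 2) { |ids| batches << ids }
```

In this model the exclusion list is applied as a cheap per-row filter during the ordered walk, which corresponds to the Filter line under the Index Scan in the execution plans.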

Merge requests index creation ran for 31 minutes in database-lab (https://gitlab.slack.com/archives/CLJMDRD8C/p1628161279130000)

Issues index creation ran for 25 minutes in database-lab (https://gitlab.slack.com/archives/CLJMDRD8C/p1628159490116400)

Mentions #332630 (closed)

Migration output & execution plans

db/migrate/20210805103231_add_index_merge_requests_on_target_project_id_and_id.rb

Up
== 20210805103231 AddIndexMergeRequestsOnTargetProjectIdAndId: migrating ======
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:merge_requests, [:target_project_id, :id], {:name=>"index_merge_requests_on_target_project_id_and_id", :algorithm=>:concurrently})
   -> 0.0172s
-- execute("SET statement_timeout TO 0")
   -> 0.0008s
-- add_index(:merge_requests, [:target_project_id, :id], {:name=>"index_merge_requests_on_target_project_id_and_id", :algorithm=>:concurrently})
   -> 0.0437s
-- execute("RESET ALL")
   -> 0.0006s
== 20210805103231 AddIndexMergeRequestsOnTargetProjectIdAndId: migrated (0.0647s)
Down
== 20210805103231 AddIndexMergeRequestsOnTargetProjectIdAndId: reverting ======
-- transaction_open?()
   -> 0.0000s
-- indexes(:merge_requests)
   -> 0.0187s
-- execute("SET statement_timeout TO 0")
   -> 0.0011s
-- remove_index(:merge_requests, {:algorithm=>:concurrently, :name=>"index_merge_requests_on_target_project_id_and_id"})
   -> 0.0067s
-- execute("RESET ALL")
   -> 0.0012s
== 20210805103231 AddIndexMergeRequestsOnTargetProjectIdAndId: reverted (0.0379s) 
Execution Plan
Limit  (cost=0.57..1234.47 rows=1000 width=764) (actual time=0.541..2883.509 rows=1000 loops=1)
   Buffers: shared hit=12 read=995 dirtied=5
   I/O Timings: read=2863.823 write=0.000
   ->  Index Scan using index_merge_requests_on_target_project_id_and_id on public.merge_requests  (cost=0.57..90189.21 rows=73092 width=764) (actual time=0.538..2882.724 rows=1000 loops=1)
         Index Cond: (merge_requests.target_project_id = 278964)
         Filter: (merge_requests.id <> ALL ('{1,2,3}'::integer[]))
         Rows Removed by Filter: 0
         Buffers: shared hit=12 read=995 dirtied=5
         I/O Timings: read=2863.823 write=0.000
Limit  (cost=0.57..1230.72 rows=1000 width=764) (actual time=0.041..2.953 rows=1000 loops=1)
   Buffers: shared hit=1007
   I/O Timings: read=0.000 write=0.000
   ->  Index Scan using index_merge_requests_on_target_project_id_and_id on public.merge_requests  (cost=0.57..89915.11 rows=73092 width=764) (actual time=0.039..2.819 rows=1000 loops=1)
         Index Cond: (merge_requests.target_project_id = 278964)
         Buffers: shared hit=1007
         I/O Timings: read=0.000 write=0.000
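
The two plans above use the same index scan and differ mainly in caching: the first reads 995 buffers from disk (about 2.86 s of I/O), while the second is served entirely from shared buffers. A small sketch for comparing the Buffers lines, assuming they follow the format shown above; buffer_stats is a hypothetical helper, not part of any tool used in this MR.

```ruby
# Extracts shared-buffer hits and disk reads from a "Buffers:" line of
# EXPLAIN (ANALYZE, BUFFERS) output and computes the cache hit ratio.
def buffer_stats(line)
  hit  = line[/shared hit=(\d+)/, 1].to_i # nil.to_i is 0 when the field is absent
  read = line[/\bread=(\d+)/, 1].to_i
  { hit: hit, read: read, hit_ratio: hit.fdiv(hit + read) }
end

cold = buffer_stats("Buffers: shared hit=12 read=995 dirtied=5")
warm = buffer_stats("Buffers: shared hit=1007")

puts format("cold cache hit ratio: %.3f", cold[:hit_ratio])
puts format("warm cache hit ratio: %.3f", warm[:hit_ratio])
```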

db/migrate/20210805102538_add_index_issues_on_project_id_and_id.rb

Up
== 20210805102538 AddIndexIssuesOnProjectIdAndId: migrating ===================
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:issues, [:project_id, :id], {:name=>"index_issues_on_project_id_and_id", :algorithm=>:concurrently})
   -> 0.0201s
-- execute("SET statement_timeout TO 0")
   -> 0.0007s
-- add_index(:issues, [:project_id, :id], {:name=>"index_issues_on_project_id_and_id", :algorithm=>:concurrently})
   -> 0.0719s
-- execute("RESET ALL")
   -> 0.0017s
== 20210805102538 AddIndexIssuesOnProjectIdAndId: migrated (0.0977s) ==========

Down
== 20210805102538 AddIndexIssuesOnProjectIdAndId: reverting ===================
-- transaction_open?()
   -> 0.0000s
-- indexes(:issues)
   -> 0.0150s
-- execute("SET statement_timeout TO 0")
   -> 0.0019s
-- remove_index(:issues, {:algorithm=>:concurrently, :name=>"index_issues_on_project_id_and_id"})
   -> 0.0048s
-- execute("RESET ALL")
   -> 0.0006s
== 20210805102538 AddIndexIssuesOnProjectIdAndId: reverted (0.0259s) ==========
Execution Plan
 Limit  (cost=0.57..1310.60 rows=1000 width=1326) (actual time=22.856..2832.857 rows=1000 loops=1)
   Buffers: shared hit=10 read=981 dirtied=2
   I/O Timings: read=2816.829 write=0.000
   ->  Index Scan using index_issues_on_project_id_and_id on public.issues  (cost=0.57..124509.92 rows=95043 width=1326) (actual time=22.853..2831.966 rows=1000 loops=1)
         Index Cond: (issues.project_id = 278964)
         Filter: (issues.id <> ALL ('{1,2,3}'::integer[]))
         Rows Removed by Filter: 0
         Buffers: shared hit=10 read=981 dirtied=2
         I/O Timings: read=2816.829 write=0.000
 Limit  (cost=0.57..1310.60 rows=1000 width=1326) (actual time=0.042..2.450 rows=1000 loops=1)
   Buffers: shared hit=991
   I/O Timings: read=0.000 write=0.000
   ->  Index Scan using index_issues_on_project_id_and_id on public.issues  (cost=0.57..124509.92 rows=95043 width=1326) (actual time=0.039..2.309 rows=1000 loops=1)
         Index Cond: (issues.project_id = 278964)
         Filter: (issues.id <> ALL ('{1,2,3}'::integer[]))
         Rows Removed by Filter: 0
         Buffers: shared hit=991
         I/O Timings: read=0.000 write=0.000

Edited by George Koltsov
