Skip to content

Add index to ci_pipeline_metadata

Reuben Pereira requested to merge rp/add-better-pipeline-name-index into master

What does this MR do and why?

Add index on [name, pipeline_id] to ci_pipeline_metadata table.

This index shows much better query performance than the existing index on [pipeline_id, name]. The existing index can be removed in a follow-up (!140608 (merged)).

The SQL query is behind a feature flag (#385864 (closed)) currently. Once its performance has been improved with the new index being added by this MR, the FF will be removed.

  • Query performance with the old index: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/24974/commands/79318

    Click to expand
    Limit  (cost=1001.03..106953.03 rows=6 width=361) (actual time=148.073..165.281 rows=20 loops=1)
     Buffers: shared hit=23887
     I/O Timings: read=0.000 write=0.000
     ->  Gather Merge  (cost=1001.03..106953.03 rows=6 width=361) (actual time=148.070..165.270 rows=20 loops=1)
           Workers Planned: 2
           Workers Launched: 2
           Buffers: shared hit=23887
           I/O Timings: read=0.000 write=0.000
           ->  Nested Loop  (cost=1.01..105952.32 rows=2 width=361) (actual time=140.777..146.944 rows=9 loops=3)
                 Buffers: shared hit=23887
                 I/O Timings: read=0.000 write=0.000
                 ->  Parallel Index Only Scan Backward using index_pipeline_metadata_on_pipeline_id_name_text_pattern on public.ci_pipeline_metadata  (cost=0.43..99988.91 rows=1658 width=8) (actual time=140.635..146.600 rows=9 loops=3)
                       Index Cond: (ci_pipeline_metadata.name = 'Ruby 3.0 merged_result MR pipeline (community contribution)'::text)
                       Heap Fetches: 0
                       Buffers: shared hit=23746
                       I/O Timings: read=0.000 write=0.000
                 ->  Index Scan using index_ci_pipelines_on_project_id_and_id_desc on public.ci_pipelines  (cost=0.57..3.60 rows=1 width=361) (actual time=0.035..0.035 rows=1 loops=26)
                       Index Cond: ((ci_pipelines.project_id = 278964) AND (ci_pipelines.id = ci_pipeline_metadata.pipeline_id))
                       Filter: (ci_pipelines.source <> 12)
                       Rows Removed by Filter: 0
                       Buffers: shared hit=141
                       I/O Timings: read=0.000 write=0.000
    Time: 177.596 ms
      - planning: 12.109 ms
      - execution: 165.487 ms
      - I/O read: 0.000 ms
      - I/O write: 0.000 ms
    
    Shared buffers:
      - hits: 23887 (~186.60 MiB) from the buffer pool
      - reads: 0 from the OS file cache, including disk I/O
      - dirtied: 0
      - writes: 0
  • Query performance with the new index: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/24980/commands/79384

    Click to expand
     Limit  (cost=1.13..15724.35 rows=7 width=361) (actual time=0.287..0.588 rows=20 loops=1)
     Buffers: shared hit=104 read=4
     I/O Timings: read=0.110 write=0.000
     ->  Nested Loop  (cost=1.13..15724.35 rows=7 width=361) (actual time=0.286..0.584 rows=20 loops=1)
           Buffers: shared hit=104 read=4
           I/O Timings: read=0.110 write=0.000
           ->  Index Only Scan Backward using index_pipeline_metadata_on_name_pipeline_id_text_pattern on public.ci_pipeline_metadata  (cost=0.56..140.88 rows=4333 width=8) (actual time=0.171..0.174 rows=20 loops=1)
                 Index Cond: (ci_pipeline_metadata.name = 'Ruby 3.0 merged_result MR pipeline (community contribution)'::text)
                 Heap Fetches: 0
                 Buffers: shared hit=1 read=4
                 I/O Timings: read=0.110 write=0.000
           ->  Index Scan using index_ci_pipelines_on_project_id_and_id_desc on public.ci_pipelines  (cost=0.57..3.60 rows=1 width=361) (actual time=0.020..0.020 rows=1 loops=20)
                 Index Cond: ((ci_pipelines.project_id = 278964) AND (ci_pipelines.id = ci_pipeline_metadata.pipeline_id))
                 Filter: (ci_pipelines.source <> 12)
                 Rows Removed by Filter: 0
                 Buffers: shared hit=103
                 I/O Timings: read=0.000 write=0.000
    Time: 11.237 ms
      - planning: 10.518 ms
      - execution: 0.719 ms
      - I/O read: 0.110 ms
      - I/O write: 0.000 ms
    
    Shared buffers:
      - hits: 104 (~832.00 KiB) from the buffer pool
      - reads: 4 (~32.00 KiB) from the OS file cache, including disk I/O
      - dirtied: 0
      - writes: 0

Migration output:

main: == [advisory_lock_connection] object_id: 182860, pg_backend_pid: 132
main: == 20240115112056 AddIndexOnPipelineMetadata: migrating =======================
main: -- index_name_exists?(:ci_pipeline_metadata, "index_pipeline_metadata_on_name_text_pattern_pipeline_id")
main:    -> 0.0208s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0013s
main: -- index_exists?(:ci_pipeline_metadata, "name text_pattern_ops, pipeline_id", {:name=>"index_pipeline_metadata_on_name_text_pattern_pipeline_id", :algorithm=>:concurrently})
main:    -> 0.0083s
main: -- add_index(:ci_pipeline_metadata, "name text_pattern_ops, pipeline_id", {:name=>"index_pipeline_metadata_on_name_text_pattern_pipeline_id", :algorithm=>:concurrently})
main:    -> 0.0107s
main: == 20240115112056 AddIndexOnPipelineMetadata: migrated (0.0635s) ==============

main: == [advisory_lock_connection] object_id: 182860, pg_backend_pid: 132
ci: == [advisory_lock_connection] object_id: 183100, pg_backend_pid: 134
ci: == 20240115112056 AddIndexOnPipelineMetadata: migrating =======================
ci: -- index_name_exists?(:ci_pipeline_metadata, "index_pipeline_metadata_on_name_text_pattern_pipeline_id")
ci:    -> 0.0025s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci:    -> 0.0011s
ci: -- index_exists?(:ci_pipeline_metadata, "name text_pattern_ops, pipeline_id", {:name=>"index_pipeline_metadata_on_name_text_pattern_pipeline_id", :algorithm=>:concurrently})
ci:    -> 0.0067s
ci: -- add_index(:ci_pipeline_metadata, "name text_pattern_ops, pipeline_id", {:name=>"index_pipeline_metadata_on_name_text_pattern_pipeline_id", :algorithm=>:concurrently})
ci:    -> 0.0087s
ci: == 20240115112056 AddIndexOnPipelineMetadata: migrated (0.0426s) ==============

Down migration:

main: == [advisory_lock_connection] object_id: 182460, pg_backend_pid: 152
main: == 20240115112056 AddIndexOnPipelineMetadata: reverting =======================
main: -- index_name_exists?(:ci_pipeline_metadata, "index_pipeline_metadata_on_name_text_pattern_pipeline_id")
main:    -> 0.0105s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0021s
main: -- indexes(:ci_pipeline_metadata)
main:    -> 0.0088s
main: -- remove_index(:ci_pipeline_metadata, {:algorithm=>:concurrently, :name=>"index_pipeline_metadata_on_name_text_pattern_pipeline_id"})
main:    -> 0.0037s
main: == 20240115112056 AddIndexOnPipelineMetadata: reverted (0.0490s) ==============

ci: == [advisory_lock_connection] object_id: 182500, pg_backend_pid: 169
ci: == 20240115112056 AddIndexOnPipelineMetadata: reverting =======================
ci: -- index_name_exists?(:ci_pipeline_metadata, "index_pipeline_metadata_on_name_text_pattern_pipeline_id")
ci:    -> 0.0077s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci:    -> 0.0013s
ci: -- indexes(:ci_pipeline_metadata)
ci:    -> 0.0079s
ci: -- remove_index(:ci_pipeline_metadata, {:algorithm=>:concurrently, :name=>"index_pipeline_metadata_on_name_text_pattern_pipeline_id"})
ci:    -> 0.0059s
ci: == 20240115112056 AddIndexOnPipelineMetadata: reverted (0.0525s) ==============

#385864 (closed)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Before After

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Edited by Reuben Pereira

Merge request reports