Skip to content

Add root_namespace_id to ci_finished_builds ClickHouse table

What does this MR do and why?

This MR adds a root_namespace_id column to the ci_finished_builds CH table. It also populates it when syncing. The goal is to run these migrations manually on https://clickhouse.cloud as we merge this MR, so that the schema is ready once this hits production.

This MR is the second part of #426113 (closed). Once !140958 (merged) and this MR get merged, the only missing column from the ones suggested in the issue will be the compute_minutes_usage column.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

image

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

n/a

Database query plans

Current query (batch of 500)

https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/25062/commands/79603

SELECT "p_ci_builds"."finished_at", "p_ci_builds"."id", "p_ci_builds"."project_id", "p_ci_builds"."commit_id",
  "p_ci_builds"."status", "p_ci_builds"."name", "p_ci_builds"."stage", "p_ci_builds"."runner_id",
  EXTRACT(epoch FROM p_ci_builds.created_at) AS casted_created_at, EXTRACT(epoch FROM p_ci_builds.queued_at) AS
    casted_queued_at, EXTRACT(epoch FROM p_ci_builds.started_at) AS casted_started_at, EXTRACT(epoch FROM
    p_ci_builds.finished_at) AS casted_finished_at, ci_runners.run_untagged AS runner_run_untagged,
    ci_runners.runner_type AS runner_type, ci_runner_machines.system_xid AS runner_manager_system_xid,
  ci_runner_machines.version AS runner_manager_version, ci_runner_machines.revision AS runner_manager_revision,
    ci_runner_machines.platform AS runner_manager_platform, ci_runner_machines.architecture AS
    runner_manager_architecture
FROM "p_ci_builds"
  LEFT OUTER JOIN "ci_runners" ON "ci_runners"."id" = "p_ci_builds"."runner_id"
  LEFT OUTER JOIN "p_ci_runner_machine_builds" ON "p_ci_runner_machine_builds"."build_id" = "p_ci_builds"."id"
  LEFT OUTER JOIN "ci_runner_machines" ON "ci_runner_machines"."id" = "p_ci_runner_machine_builds"."runner_machine_id"
WHERE "p_ci_builds"."type" = 'Ci::Build'
  AND "p_ci_builds"."id" IN (/* batch of 500 build ids */)
 Gather  (cost=1001.99..9591.22 rows=920 width=232) (actual time=2.773..24.310 rows=500 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=13656
   I/O Timings: read=0.000 write=0.000
   ->  Nested Loop Left Join  (cost=1.99..8499.22 rows=383 width=232) (actual time=0.888..9.245 rows=167 loops=3)
         Buffers: shared hit=13656
         I/O Timings: read=0.000 write=0.000
         ->  Nested Loop Left Join  (cost=1.57..7310.06 rows=383 width=98) (actual time=0.735..7.444 rows=167 loops=3)
               Buffers: shared hit=11678
               I/O Timings: read=0.000 write=0.000
               ->  Nested Loop Left Join  (cost=1.00..4553.71 rows=383 width=90) (actual time=0.616..5.470 rows=167 loops=3)
                     Buffers: shared hit=7180
                     I/O Timings: read=0.000 write=0.000
                     ->  Parallel Append  (cost=0.57..3241.14 rows=384 width=87) (actual time=0.576..3.891 rows=167 loops=3)
                           Buffers: shared hit=5202
                           I/O Timings: read=0.000 write=0.000
                           ->  Parallel Index Scan using ci_builds_pkey on public.ci_builds p_ci_builds_1  (cost=0.71..1776.31 rows=276 width=86) (actual time=1.189..1.189 rows=0 loops=1)
                                 Index Cond: (p_ci_builds_1.id = ANY ('{/* batch of 500 build ids */}'::bigint[]))
                                 Filter: ((p_ci_builds_1.type)::text = 'Ci::Build'::text)
                                 Rows Removed by Filter: 0
                                 Buffers: shared hit=2501
                                 I/O Timings: read=0.000 write=0.000
                           ->  Parallel Index Scan using ci_builds_101_pkey on gitlab_partitions_dynamic.ci_builds_101 p_ci_builds_2  (cost=0.57..1462.90 rows=265 width=88) (actual time=0.177..3.456 rows=167 loops=3)
                                 Index Cond: (p_ci_builds_2.id = ANY ('{/* batch of 500 build ids */}'::bigint[]))
                                 Filter: ((p_ci_builds_2.type)::text = 'Ci::Build'::text)
                                 Rows Removed by Filter: 0
                                 Buffers: shared hit=2701
                                 I/O Timings: read=0.000 write=0.000
                     ->  Index Scan using ci_runners_pkey on public.ci_runners  (cost=0.43..3.42 rows=1 width=7) (actual time=0.009..0.009 rows=1 loops=500)
                           Index Cond: (ci_runners.id = p_ci_builds.runner_id)
                           Buffers: shared hit=1978
                           I/O Timings: read=0.000 write=0.000
               ->  Append  (cost=0.57..7.18 rows=2 width=16) (actual time=0.011..0.011 rows=1 loops=500)
                     Buffers: shared hit=4498
                     I/O Timings: read=0.000 write=0.000
                     ->  Index Scan using ci_runner_machine_builds_100_pkey on gitlab_partitions_dynamic.ci_runner_machine_builds_100 p_ci_runner_machine_builds_1  (cost=0.57..3.59 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=500)
                           Index Cond: (p_ci_runner_machine_builds_1.build_id = p_ci_builds.id)
                           Buffers: shared hit=2002
                           I/O Timings: read=0.000 write=0.000
                     ->  Index Scan using ci_runner_machine_builds_101_pkey on gitlab_partitions_dynamic.ci_runner_machine_builds_101 p_ci_runner_machine_builds_2  (cost=0.56..3.58 rows=1 width=16) (actual time=0.007..0.007 rows=1 loops=500)
                           Index Cond: (p_ci_runner_machine_builds_2.build_id = p_ci_builds.id)
                           Buffers: shared hit=2496
                           I/O Timings: read=0.000 write=0.000
         ->  Index Scan using ci_runner_machines_pkey on public.ci_runner_machines  (cost=0.42..3.09 rows=1 width=46) (actual time=0.008..0.008 rows=1 loops=500)
               Index Cond: (ci_runner_machines.id = p_ci_runner_machine_builds.runner_machine_id)
               Buffers: shared hit=1978
               I/O Timings: read=0.000 write=0.000
New query (batch of 500)

https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/25062/commands/79604

SELECT "p_ci_builds"."finished_at", "p_ci_builds"."id", "p_ci_builds"."project_id", "p_ci_builds"."commit_id",
  "p_ci_builds"."status", "p_ci_builds"."name", "p_ci_builds"."stage", "p_ci_builds"."runner_id",
  EXTRACT(epoch FROM p_ci_builds.created_at) AS casted_created_at, EXTRACT(epoch FROM p_ci_builds.queued_at) AS
    casted_queued_at, EXTRACT(epoch FROM p_ci_builds.started_at) AS casted_started_at, EXTRACT(epoch FROM
    p_ci_builds.finished_at) AS casted_finished_at, ci_namespace_mirrors.traversal_ids[1] AS root_namespace_id,
    ci_runners.run_untagged AS runner_run_untagged, ci_runners.runner_type AS runner_type,
  ci_runner_machines.system_xid AS runner_manager_system_xid, ci_runner_machines.version AS runner_manager_version,
    ci_runner_machines.revision AS runner_manager_revision, ci_runner_machines.platform AS runner_manager_platform,
  ci_runner_machines.architecture AS runner_manager_architecture
FROM "p_ci_builds"
  LEFT OUTER JOIN "ci_runners" ON "ci_runners"."id" = "p_ci_builds"."runner_id"
  LEFT OUTER JOIN "p_ci_runner_machine_builds" ON "p_ci_runner_machine_builds"."build_id" = "p_ci_builds"."id"
  LEFT OUTER JOIN "ci_runner_machines" ON "ci_runner_machines"."id" = "p_ci_runner_machine_builds"."runner_machine_id"
  LEFT OUTER JOIN ci_project_mirrors ON ci_project_mirrors.project_id = p_ci_builds.project_id
  LEFT OUTER JOIN ci_namespace_mirrors ON ci_namespace_mirrors.namespace_id = ci_project_mirrors.namespace_id
WHERE "p_ci_builds"."type" = 'Ci::Build'
  AND "p_ci_builds"."id" IN (/* batch of 500 build ids */)
 Gather  (cost=1003.12..12199.13 rows=920 width=236) (actual time=3.144..27.958 rows=500 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=18778 read=1
   I/O Timings: read=0.905 write=0.000
   ->  Nested Loop Left Join  (cost=3.12..11107.13 rows=383 width=236) (actual time=0.789..13.152 rows=167 loops=3)
         Buffers: shared hit=18778 read=1
         I/O Timings: read=0.905 write=0.000
         ->  Nested Loop Left Join  (cost=2.55..9863.02 rows=383 width=132) (actual time=0.651..10.901 rows=167 loops=3)
               Buffers: shared hit=16276 read=1
               I/O Timings: read=0.905 write=0.000
               ->  Nested Loop Left Join  (cost=1.99..8495.63 rows=383 width=128) (actual time=0.613..9.139 rows=167 loops=3)
                     Buffers: shared hit=13774 read=1
                     I/O Timings: read=0.905 write=0.000
                     ->  Nested Loop Left Join  (cost=1.57..7310.06 rows=383 width=98) (actual time=0.554..7.745 rows=167 loops=3)
                           Buffers: shared hit=11796 read=1
                           I/O Timings: read=0.905 write=0.000
                           ->  Nested Loop Left Join  (cost=1.00..4553.71 rows=383 width=90) (actual time=0.463..5.880 rows=167 loops=3)
                                 Buffers: shared hit=7298 read=1
                                 I/O Timings: read=0.905 write=0.000
                                 ->  Parallel Append  (cost=0.57..3241.14 rows=384 width=87) (actual time=0.418..4.484 rows=167 loops=3)
                                       Buffers: shared hit=5320 read=1
                                       I/O Timings: read=0.905 write=0.000
                                       ->  Parallel Index Scan using ci_builds_pkey on public.ci_builds p_ci_builds_1  (cost=0.71..1776.31 rows=276 width=86) (actual time=1.021..1.021 rows=0 loops=1)
                                             Index Cond: (p_ci_builds_1.id = ANY ('{/* batch of 500 build ids */}'::bigint[]))
                                             Filter: ((p_ci_builds_1.type)::text = 'Ci::Build'::text)
                                             Rows Removed by Filter: 0
                                             Buffers: shared hit=2501
                                             I/O Timings: read=0.000 write=0.000
                                       ->  Parallel Index Scan using ci_builds_101_pkey on gitlab_partitions_dynamic.ci_builds_101 p_ci_builds_2  (cost=0.57..1462.90 rows=265 width=88) (actual time=0.076..4.108 rows=167 loops=3)
                                             Index Cond: (p_ci_builds_2.id = ANY ('{/* batch of 500 build ids */}'::bigint[]))
                                             Filter: ((p_ci_builds_2.type)::text = 'Ci::Build'::text)
                                             Rows Removed by Filter: 0
                                             Buffers: shared hit=2819 read=1
                                             I/O Timings: read=0.905 write=0.000
                                 ->  Index Scan using ci_runners_pkey on public.ci_runners  (cost=0.43..3.42 rows=1 width=7) (actual time=0.008..0.008 rows=1 loops=500)
                                       Index Cond: (ci_runners.id = p_ci_builds.runner_id)
                                       Buffers: shared hit=1978
                                       I/O Timings: read=0.000 write=0.000
                           ->  Append  (cost=0.57..7.18 rows=2 width=16) (actual time=0.010..0.011 rows=1 loops=500)
                                 Buffers: shared hit=4498
                                 I/O Timings: read=0.000 write=0.000
                                 ->  Index Scan using ci_runner_machine_builds_100_pkey on gitlab_partitions_dynamic.ci_runner_machine_builds_100 p_ci_runner_machine_builds_1  (cost=0.57..3.59 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=500)
                                       Index Cond: (p_ci_runner_machine_builds_1.build_id = p_ci_builds.id)
                                       Buffers: shared hit=2002
                                       I/O Timings: read=0.000 write=0.000
                                 ->  Index Scan using ci_runner_machine_builds_101_pkey on gitlab_partitions_dynamic.ci_runner_machine_builds_101 p_ci_runner_machine_builds_2  (cost=0.56..3.58 rows=1 width=16) (actual time=0.007..0.007 rows=1 loops=500)
                                       Index Cond: (p_ci_runner_machine_builds_2.build_id = p_ci_builds.id)
                                       Buffers: shared hit=2496
                                       I/O Timings: read=0.000 write=0.000
                     ->  Index Scan using ci_runner_machines_pkey on public.ci_runner_machines  (cost=0.42..3.10 rows=1 width=46) (actual time=0.008..0.008 rows=1 loops=500)
                           Index Cond: (ci_runner_machines.id = p_ci_runner_machine_builds.runner_machine_id)
                           Buffers: shared hit=1978
                           I/O Timings: read=0.000 write=0.000
               ->  Index Scan using index_ci_project_mirrors_on_project_id on public.ci_project_mirrors  (cost=0.56..3.57 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=500)
                     Index Cond: (ci_project_mirrors.project_id = p_ci_builds.project_id)
                     Buffers: shared hit=2502
                     I/O Timings: read=0.000 write=0.000
         ->  Index Scan using index_ci_namespace_mirrors_on_namespace_id on public.ci_namespace_mirrors  (cost=0.56..3.24 rows=1 width=32) (actual time=0.011..0.011 rows=1 loops=500)
               Index Cond: (ci_namespace_mirrors.namespace_id = ci_project_mirrors.namespace_id)
               Buffers: shared hit=2502
               I/O Timings: read=0.000 write=0.000

Merge request reports