Skip to content

Periodically reconcile ci_runner_versions

What does this MR do and why?

Describe in detail what your merge request does and why.

This MR adds a cron job that runs every hour to ensure that the new ci_runner_versions table is correctly populated based on the distinct values of ci_runners.version and the calculated values of Gitlab::Ci::RunnerUpgradeCheck#check_runner_upgrade_status for each version.

In the future, we may want to react to changes in Ci::Runner#version in order to react more quickly instead of a fixed hourly schedule.

Part of GraphQL: Expose filter on GitLab Runner upgrade... (#358406 - closed)

Related MRs

Screenshots or screen recordings

These are strongly recommended to assist reviewers and reduce the time to merge your change.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

  1. Open the GDK console
  2. Run Ci::Runners::ReconcileExistingRunnerVersionsService.new.execute
  3. You should see the results in the form of {:total_inserted=>0, :total_updated=>0, :total_deleted=>0, :status=>:success}. Make sure that the returned values make sense for the state of your instance. Running a second time should return 0 for all hash total_* values.

Database plans

Check if runners exist (/app/models/concerns/bulk_insert_safe.rb:138)
SELECT 1 AS one
FROM "ci_runners"
LIMIT 1
 Limit  (cost=0.00..0.38 rows=1 width=4) (actual time=12.213..12.216 rows=1 loops=1)
   Buffers: shared read=1 dirtied=1
   I/O Timings: read=9.402 write=0.000
   ->  Seq Scan on public.ci_runners  (cost=0.00..649728.87 rows=1704087 width=4) (actual time=12.210..12.211 rows=1 loops=1)
         Buffers: shared read=1 dirtied=1
         I/O Timings: read=9.402 write=0.000

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10848/commands/38991

Getting distinct versions from ci_runners (/app/models/concerns/bulk_insert_safe.rb:157)
SELECT DISTINCT "ci_runners"."version"
FROM "ci_runners"
 Unique  (cost=0.43..44379.97 rows=726 width=7) (actual time=0.164..842.805 rows=1288 loops=1)
   Buffers: shared hit=73065 read=8226
   I/O Timings: read=213.883 write=0.000
   ->  Index Only Scan using index_ci_runners_on_version on public.ci_runners  (cost=0.43..40520.25 rows=1543885 width=7) (actual time=0.163..558.817 rows=1543885 loops=1)
         Heap Fetches: 108786
         Buffers: shared hit=73065 read=8226
         I/O Timings: read=213.883 write=0.000

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10848/commands/38994

Insert new versions into ci_runner_versions (/app/models/concerns/bulk_insert_safe.rb:163)
INSERT INTO "ci_runner_versions" ("version", "status")
  VALUES ('14.10.0', NULL), ('14.10.0~beta.35.g0cb1a1ef', NULL), ('14.10.1', NULL), ('14.11.0~beta.29.gd0c550e3', NULL), ('development version', NULL)
ON CONFLICT
  DO NOTHING
RETURNING "version"
 ModifyTable on public.ci_runner_versions  (cost=0.00..0.06 rows=5 width=34) (actual time=6.747..6.893 rows=5 loops=1)
   Buffers: shared hit=39 read=3 dirtied=7 written=4
   I/O Timings: read=0.236 write=0.000
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.06 rows=5 width=34) (actual time=0.004..0.012 rows=5 loops=1)
         I/O Timings: read=0.000 write=0.000

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10848/commands/38999

Deleting stale versions (reconcile_existing_runner_versions_service.rb:47)
DELETE FROM "ci_runner_versions"
WHERE "ci_runner_versions"."version" NOT IN ( SELECT DISTINCT "ci_runners"."version"
    FROM "ci_runners")
 ModifyTable on public.ci_runner_versions  (cost=44381.78..44433.88 rows=644 width=6) (actual time=671.158..671.161 rows=0 loops=1)
   Buffers: shared hit=81300
   I/O Timings: read=0.000 write=0.000
   ->  Seq Scan on public.ci_runner_versions  (cost=44381.78..44433.88 rows=644 width=6) (actual time=671.156..671.158 rows=0 loops=1)
         Filter: (NOT (hashed SubPlan 1))
         Rows Removed by Filter: 1288
         Buffers: shared hit=81300
         I/O Timings: read=0.000 write=0.000
         SubPlan 1
           ->  Unique  (cost=0.43..44379.97 rows=726 width=7) (actual time=0.047..668.919 rows=1288 loops=1)
                 Buffers: shared hit=81291
                 I/O Timings: read=0.000 write=0.000
                 ->  Index Only Scan using index_ci_runners_on_version on public.ci_runners  (cost=0.43..40520.25 rows=1543885 width=7) (actual time=0.046..385.292 rows=1543885 loops=1)
                       Heap Fetches: 108786
                       Buffers: shared hit=81291
                       I/O Timings: read=0.000 write=0.000

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10848/commands/39001

NOTE: This query contains a SeqScan, but the table of unique runner versions is bound to be a small table (< 2000 items in .com), so it should not be a problem.

Search for potential stale ci_runner_versions rows - step 1 (/app/models/concerns/each_batch.rb:62)
SELECT "ci_runner_versions"."version"
FROM "ci_runner_versions"
WHERE ("ci_runner_versions"."status" IN (1, 2, 0)
  OR "ci_runner_versions"."status" IS NULL)
ORDER BY "ci_runner_versions"."version" ASC
LIMIT 1
 Limit  (cost=0.28..0.33 rows=1 width=21) (actual time=0.037..0.038 rows=1 loops=1)
   Buffers: shared hit=3
   I/O Timings: read=0.000 write=0.000
   ->  Index Scan using index_ci_runner_versions_on_version on public.ci_runner_versions  (cost=0.28..70.66 rows=1288 width=21) (actual time=0.035..0.035 rows=1 loops=1)
         Filter: ((ci_runner_versions.status = ANY ('{1,2,0}'::integer[])) OR (ci_runner_versions.status IS NULL))
         Rows Removed by Filter: 0
         Buffers: shared hit=3
         I/O Timings: read=0.000 write=0.000

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10848/commands/39002

Search for potential stale ci_runner_versions rows - step 2 (first window of 2000 versions, /app/models/concerns/each_batch.rb:81)
SELECT "ci_runner_versions"."version"
FROM "ci_runner_versions"
WHERE ("ci_runner_versions"."status" IN (1, 2, 0)
  OR "ci_runner_versions"."status" IS NULL)
AND "ci_runner_versions"."version" >= '14.10.1'
ORDER BY "ci_runner_versions"."version" ASC
LIMIT 1 OFFSET 2000
 Limit  (cost=14.52..14.58 rows=1 width=21) (actual time=0.210..0.211 rows=0 loops=1)
   Buffers: shared hit=14
   I/O Timings: read=0.000 write=0.000
   ->  Index Scan using index_ci_runner_versions_on_version on public.ci_runner_versions  (cost=0.28..14.52 rows=256 width=21) (actual time=0.078..0.191 rows=256 loops=1)
         Index Cond: (ci_runner_versions.version >= '14.10.1'::text)
         Filter: ((ci_runner_versions.status = ANY ('{1,2,0}'::integer[])) OR (ci_runner_versions.status IS NULL))
         Rows Removed by Filter: 0
         Buffers: shared hit=14
         I/O Timings: read=0.000 write=0.000

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10848/commands/39003

Search for potential stale ci_runner_versions rows - step 3 (reconcile_existing_runner_versions_service.rb:55)
SELECT "ci_runner_versions".*
FROM "ci_runner_versions"
WHERE ("ci_runner_versions"."status" IN (1, 2, 0)
  OR "ci_runner_versions"."status" IS NULL)
AND "ci_runner_versions"."version" >= '14.10.1'
 Index Scan using index_ci_runner_versions_on_version on public.ci_runner_versions  (cost=0.28..14.52 rows=256 width=23) (actual time=0.086..0.226 rows=256 loops=1)
   Index Cond: (ci_runner_versions.version >= '14.10.1'::text)
   Filter: ((ci_runner_versions.status = ANY ('{1,2,0}'::integer[])) OR (ci_runner_versions.status IS NULL))
   Rows Removed by Filter: 0
   Buffers: shared hit=14
   I/O Timings: read=0.000 write=0.000

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10848/commands/39004

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Merge request reports