Geo: Create geo_ci_job_artifact_verification_summaries table

What does this MR do and why?

Adds the geo_ci_job_artifact_verification_summaries table as the foundation for more scalable verification metrics collection.

Problem

Geo currently collects job artifact verification metrics by scanning the entire ci_job_artifact_states table every minute. This approach doesn't scale for tables with 100M+ rows, causing timeouts and excessive database load.

Solution

This MR introduces a bucketed verification summaries table as the foundation for a more scalable metrics collection approach. Instead of counting the full table, we'll pre-calculate counts per bucket and only recalculate buckets that have changed.

How it works:

  • Job artifacts are assigned to buckets using job_artifact_id % 100,000
  • Each bucket stores pre-calculated total_count, verified_count, and failed_count
  • Buckets have a state machine: clean → dirty → calculating → clean
  • A future DB trigger (MR2/3) will mark buckets as dirty when artifacts change
  • A calculator worker will recalculate only dirty buckets
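The flow above can be sketched in plain Ruby. This is an illustrative sketch, not the actual model API — the names (bucket_for, mark_dirty, recalculate) are hypothetical, and the real implementation will live in the ActiveRecord model and the calculator worker from MR 2/3:

```ruby
BUCKET_COUNT = 100_000

# Assign a job artifact to its summary bucket.
def bucket_for(job_artifact_id)
  job_artifact_id % BUCKET_COUNT
end

# Simplified bucket record following the state machine described above:
# clean -> dirty (artifact changed) -> calculating (worker running) -> clean.
Bucket = Struct.new(:number, :state, :total_count, :verified_count, :failed_count)

# The DB trigger (MR 2/3) would do the equivalent of this when an artifact
# in the bucket changes.
def mark_dirty(bucket)
  bucket.state = :dirty
  bucket
end

# The calculator worker recounts only the rows belonging to this one bucket.
def recalculate(bucket, rows_in_bucket)
  bucket.state = :calculating
  bucket.total_count    = rows_in_bucket.size
  bucket.verified_count = rows_in_bucket.count { |r| r[:verification_state] == :succeeded }
  bucket.failed_count   = rows_in_bucket.count { |r| r[:verification_state] == :failed }
  bucket.state = :clean
  bucket
end
```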

Why 100k buckets:

  • With 100M artifacts, each bucket averages ~1000 rows - fast to recount
  • Modulo distribution spreads new records evenly (unlike range-based which creates hotspots)
  • Bucket count is fixed; no need to add buckets as data grows
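The even-distribution claim is easy to check with a quick plain-Ruby simulation (not project code): sequential ids under modulo bucketing land in every bucket at the same rate, whereas range-based bucketing would concentrate all new inserts in the newest bucket.

```ruby
BUCKET_COUNT = 100_000

# Simulate 1 million sequential artifact ids; with modulo bucketing every
# bucket receives exactly the same number of rows (no hotspots).
counts = Hash.new(0)
(1..1_000_000).each { |id| counts[id % BUCKET_COUNT] += 1 }

puts counts.values.uniq.inspect # => [10] — every bucket holds exactly 10 rows
```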

Why CI database (gitlab_ci_cell_local):

  • The source table ci_job_artifact_states is on the CI database
  • PostgreSQL triggers cannot write across databases
  • Model inherits from Ci::ApplicationRecord to route correctly

This is MR 1 of 3. Follow-up MRs will:

  • MR 2 - add the backfill service & calculator worker (behind a feature flag) - #590853 (closed)
  • MR 3 - add the DB trigger and a feature-flagged switch in the replicator - #590854

Query plans

Seeded data in postgres.ai (CI database):

Created the table, indexes, and sequence matching the migration, then seeded 100k rows across 3 states:

  • 90k clean (state = 0)
  • 5k dirty (state = 1)
  • 5k calculating (state = 2)

Raw SQL and query plans:

Dirty scope:

SELECT "geo_ci_job_artifact_verification_summaries".*
FROM "geo_ci_job_artifact_verification_summaries"
WHERE "geo_ci_job_artifact_verification_summaries"."state" = 1
ORDER BY "geo_ci_job_artifact_verification_summaries"."state_changed_at" ASC;

Plan (1.6ms execution): https://postgres.ai/console/gitlab/gitlab-production-ci/sessions/49285/commands/147088

Calculating scope:

SELECT "geo_ci_job_artifact_verification_summaries".*
FROM "geo_ci_job_artifact_verification_summaries"
WHERE "geo_ci_job_artifact_verification_summaries"."state" = 2;

Plan (0.8ms execution): https://postgres.ai/console/gitlab/gitlab-production-ci/sessions/49285/commands/147089

Table size/growth:

  • Fixed maximum of 100k rows (one per bucket) (data will be backfilled in follow-up issue - #590853 (closed))
  • Reads: periodic queries for dirty/calculating buckets + SUM aggregations for metrics collection - follow-up issue - #590853 (closed)
  • Writes: triggered by verification state changes on ci_job_artifact_states, plus calculator worker updates - follow-up issue - #590854
  • Availability risk: low - small fixed-size table
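With the summaries in place, the metrics read reduces to summing the pre-calculated columns across at most 100k small rows instead of scanning ci_job_artifact_states. A hedged plain-Ruby sketch of that aggregation (in practice this would be a single SQL SUM over the summaries table; the Summary struct and method name here are illustrative only):

```ruby
# Illustrative stand-in for one summaries row (one bucket).
Summary = Struct.new(:total_count, :verified_count, :failed_count)

# Roll up per-bucket counts into the site-wide verification metrics.
def verification_metrics(summaries)
  {
    total:    summaries.sum(&:total_count),
    verified: summaries.sum(&:verified_count),
    failed:   summaries.sum(&:failed_count)
  }
end
```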

How to set up and validate locally

  1. Ensure your GDK is set up and running
  2. Run the migration:
    cd gitlab && bin/rails db:migrate
  3. Verify the table exists:
    bin/rails runner "puts Geo::CiJobArtifactVerificationSummary.table_exists?"
    Expected output: true
  4. Run the specs:
    bin/rspec ee/spec/models/geo/ci_job_artifact_verification_summary_spec.rb
    Expected output: 6 examples, 0 failures

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #590852 (closed)

Edited by Scott Murray
