Use pipelines_id_range to resolve Ci::Pipeline by id
What does this MR do and why?
Follow-up to !235628 (merged) which added the ci_partitions.pipelines_id_range int8range column populated by Ci::Partitions::SyncService. This MR uses that column to make Ci::Pipeline.find_by_id partition-aware without requiring callers to pass a partition_id, addressing the unpruned single-id queries discussed in #593701 (closed).
Introduces Gitlab::Ci::Pipeline::ByIdLookup, a cascading partition-aware lookup that resolves a pipeline by id in three steps:
- The current partition (hot path — most queries land here).
- Partitions whose pipelines_id_range contains the id (handles records in older active partitions).
- A full cross-partition scan as a last resort (kept for safety on partitions whose range column has not been backfilled yet).
Each fallback is logged so we can monitor how often we leave the fast path.
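The three-step cascade described above can be illustrated with a minimal, self-contained sketch. This is not the code of Gitlab::Ci::Pipeline::ByIdLookup: the class name ByIdLookupSketch, the Partition and Pipeline structs, and the log messages are hypothetical stand-ins that replace database queries with in-memory arrays, purely to show the order of the fallbacks and where logging happens.

```ruby
require 'logger'

# Hypothetical in-memory stand-ins for ci_partitions rows and pipeline records.
Partition = Struct.new(:id, :pipelines_id_range)
Pipeline  = Struct.new(:id, :partition_id)

# Illustrative only: mimics the cascade, not the real implementation.
class ByIdLookupSketch
  def initialize(partitions:, pipelines:, current_partition_id:, logger:)
    @partitions = partitions
    @pipelines = pipelines
    @current_partition_id = current_partition_id
    @logger = logger
  end

  def find(id)
    # Step 1: current partition (hot path, no logging).
    found = in_partitions(id, [@current_partition_id])
    return found if found

    # Step 2: partitions whose tracked range contains the id.
    @logger.info("pipeline #{id}: falling back to range lookup")
    ranged = @partitions.select { |p| p.pipelines_id_range&.cover?(id) }.map(&:id)
    found = in_partitions(id, ranged)
    return found if found

    # Step 3: last resort, look at every partition.
    @logger.warn("pipeline #{id}: falling back to full scan")
    @pipelines.find { |p| p.id == id }
  end

  private

  # Restricts the search to the given partition ids (stands in for pruning).
  def in_partitions(id, partition_ids)
    @pipelines.find { |p| p.id == id && partition_ids.include?(p.partition_id) }
  end
end

partitions = [
  Partition.new(99, nil),            # range not backfilled yet
  Partition.new(100, 1...1_000),
  Partition.new(101, 1_000...2_000)  # treated as the current partition below
]
pipelines = [Pipeline.new(5_000, 99), Pipeline.new(42, 100), Pipeline.new(1_500, 101)]

lookup = ByIdLookupSketch.new(
  partitions: partitions, pipelines: pipelines,
  current_partition_id: 101, logger: Logger.new($stdout)
)

lookup.find(1_500) # resolved in step 1
lookup.find(42)    # resolved in step 2, after one log line
lookup.find(5_000) # resolved in step 3, after two log lines
```

In this sketch each step only runs when the previous one misses, so the number of log lines per lookup shows how far a request strayed from the fast path.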
Ci::Pipeline.find_by_id(id) is reduced to a single line that delegates to ByIdLookup. Ci::Pipeline no longer includes Ci::PartitionableFinder — that concern remains in place for CommitStatus, which has its own id space and cannot reuse pipelines_id_range.
Other changes:
- Adds an Arel scope Ci::Partition.containing_pipeline_id that queries pipelines_id_range with the @> operator, leveraging the GiST index added by !235628 (merged).
- Rewrites the Ci::PartitionableFinder concern spec to exercise the mixin against a throw-away model defined in the spec file (since Ci::Pipeline is no longer an includer).
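To make the containment check concrete, here is a small pure-Ruby analogue of the @> test. It is a sketch under assumptions: the RANGES hash and the containing_partition_ids helper are hypothetical and do not exist in the codebase. It relies on the fact that PostgreSQL canonicalizes int8range values to an inclusive lower bound and an exclusive upper bound, which matches Ruby's three-dot ranges.

```ruby
# Hypothetical partition id => tracked pipeline id range
# (lower bound inclusive, upper bound exclusive, like a canonical int8range).
RANGES = {
  100 => (1...1_000),
  101 => (1_000...2_000)
}.freeze

# Ruby analogue of: WHERE pipelines_id_range @> CAST(id AS int8)
def containing_partition_ids(id)
  RANGES.select { |_partition_id, range| range.cover?(id) }.keys
end

containing_partition_ids(999)   # => [100]
containing_partition_ids(1_000) # => [101]  (upper bound of 100 is exclusive)
containing_partition_ids(2_000) # => []     (not tracked by any range)
```

Because the ranges in this sketch do not overlap, an id matches at most one partition; an id that matches none is the case the full cross-partition scan exists for.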
References
- Related issue: #593701 (closed)
- Predecessor MR: !235628 (merged) (added the pipelines_id_range column and SyncService backfill)
Screenshots or screen recordings
gitlabhq_dblab=# explain (analyze, buffers) SELECT "ci_partitions"."id" FROM "ci_partitions" WHERE "ci_partitions"."pipelines_id_range" @> CAST(2292945204 AS int8) ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using check_ci_partitions_pipelines_id_range_no_overlap on ci_partitions (cost=0.12..3.14 rows=1 width=8) (actual time=0.537..0.538 rows=1 loops=1)
Index Cond: (pipelines_id_range @> '2292945204'::bigint)
Buffers: shared hit=1 read=1
I/O Timings: shared read=0.488
Planning:
Buffers: shared hit=3 read=1
I/O Timings: shared read=1.149
Planning Time: 1.278 ms
Execution Time: 0.569 ms
(9 rows)
How to set up and validate locally
In rails console:
# Pick a pipeline whose partition has a populated pipelines_id_range
partition = Ci::Partition.where.not(pipelines_id_range: nil).first
pipeline = Ci::Pipeline.where(partition_id: partition.id).first
raise "Need a pipeline in a range-tracked partition" unless pipeline
# 1. Sanity check: fast path (current partition) returns the pipeline
ActiveRecord::Base.logger = Logger.new($stdout)
result = Ci::Pipeline.find_by_id(pipeline.id)
raise "fast path failed" unless result == pipeline
puts "OK: fast path"
# 2. Simulate current partition miss to exercise the range lookup.
# Stub Ci::Partition.current to point at a non-existent id so step 1 fails.
Ci::Partition.singleton_class.alias_method(:_current_orig, :current)
Ci::Partition.define_singleton_method(:current) { Ci::Partition.new(id: -1) }
begin
  result = Ci::Pipeline.find_by_id(pipeline.id)
  raise "range path failed" unless result == pipeline
  puts "OK: range path"
  # 3. Verify the Arel scope on Ci::Partition
  matched = Ci::Partition.containing_pipeline_id(pipeline.id).to_a
  raise "scope did not match" unless matched.include?(partition)
  puts "OK: containing_pipeline_id scope matches"
ensure
  # Always restore the original Ci::Partition.current, even if a check fails.
  Ci::Partition.singleton_class.alias_method(:current, :_current_orig)
end
# 4. Verify CommitStatus still uses the original mixin (no regression).
build = CommitStatus.where(partition_id: partition.id).first
puts "OK: CommitStatus.find_by_id returns #{CommitStatus.find_by_id(build.id) == build}" if build
Expected output: four OK: lines.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.