Use pipelines_id_range to resolve Ci::Pipeline by id

What does this MR do and why?

Follow-up to !235628 (merged), which added the ci_partitions.pipelines_id_range int8range column populated by Ci::Partitions::SyncService. This MR uses that column to make Ci::Pipeline.find_by_id partition-aware without requiring callers to pass a partition_id. This addresses the unpruned single-id queries discussed in #593701 (closed).

Introduces Gitlab::Ci::Pipeline::ByIdLookup, a cascading partition-aware lookup that resolves a pipeline by id in three steps:

  1. The current partition (hot path — most queries land here).
  2. Partitions whose pipelines_id_range contains the id (handles records in older active partitions).
  3. A full cross-partition scan as a last resort (kept for safety on partitions whose range column has not been backfilled yet).

Each fallback is logged so we can monitor how often we leave the fast path.
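
The three steps above can be sketched in plain Ruby. This is only an illustration of the cascade, not the actual Gitlab::Ci::Pipeline::ByIdLookup: the Partition struct, the ByIdLookupSketch class, the log message format, and the in-memory rows hash are all hypothetical stand-ins for the real models and queries.

```ruby
require "logger"
require "stringio"

# Hypothetical in-memory stand-in for a partition. `id_range` plays the role
# of pipelines_id_range (nil means "not backfilled yet"); `rows` maps a
# pipeline id to its record.
Partition = Struct.new(:id, :id_range, :rows, keyword_init: true)

# Illustrative only: not the real ByIdLookup implementation.
class ByIdLookupSketch
  def initialize(partitions:, current:, logger:)
    @partitions = partitions
    @current = current
    @logger = logger
  end

  def find(id)
    # Step 1: current partition (hot path).
    record = @current.rows[id]
    return record if record

    # Step 2: partitions whose tracked id range contains the id.
    @logger.info("by_id_lookup fallback=range id=#{id}")
    @partitions.each do |partition|
      next unless partition.id_range&.cover?(id)

      record = partition.rows[id]
      return record if record
    end

    # Step 3: last resort, check every partition.
    @logger.info("by_id_lookup fallback=full_scan id=#{id}")
    @partitions.each do |partition|
      record = partition.rows[id]
      return record if record
    end

    nil
  end
end

log = StringIO.new
legacy  = Partition.new(id: 99,  id_range: nil,             rows: { 7 => "pipeline-7" })
older   = Partition.new(id: 100, id_range: (1...1_000),     rows: { 42 => "pipeline-42" })
current = Partition.new(id: 101, id_range: (1_000...2_000), rows: { 1_500 => "pipeline-1500" })

lookup = ByIdLookupSketch.new(
  partitions: [legacy, older, current], current: current, logger: Logger.new(log)
)
lookup.find(1_500) # served by step 1, nothing logged
lookup.find(42)    # served by step 2, range fallback logged
lookup.find(7)     # served by step 3, range and full-scan fallbacks logged
```

Logging at the point where each fallback starts, rather than where it succeeds, means a lookup for a missing id also shows up in the fallback counts.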

Ci::Pipeline.find_by_id(id) is reduced to a single line that delegates to ByIdLookup. Ci::Pipeline no longer includes Ci::PartitionableFinder — that concern remains in place for CommitStatus, which has its own id space and cannot reuse pipelines_id_range.

Other changes:

  • Adds an Arel scope Ci::Partition.containing_pipeline_id that queries pipelines_id_range with the @> operator, leveraging the GiST index added by !235628 (merged).
  • Rewrites the Ci::PartitionableFinder concern spec to exercise the mixin against a throw-away model defined in the spec file (since Ci::Pipeline is no longer an includer).
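
To make the containment check concrete, here is a pure-Ruby analogue of what `pipelines_id_range @> id` tests. It is a sketch of the semantics only, not the Arel scope from this MR: the RANGES constant, the partition ids, the bounds, and the partitions_containing helper are made up for illustration. PostgreSQL stores int8range values in the canonical `[lower, upper)` form, which Ruby's exclusive-end ranges (`...`) mirror.

```ruby
# Hypothetical partition id => pipelines_id_range, written as Ruby ranges.
# `...` excludes the upper bound, matching the `[lower, upper)` form that
# PostgreSQL uses for int8range. nil models a partition with no range yet.
RANGES = {
  100 => (1...1_000),
  101 => (1_000...2_000),
  102 => nil
}.freeze

# Pure-Ruby analogue of `WHERE pipelines_id_range @> id`.
# Returns the ids of the partitions whose range contains pipeline_id.
def partitions_containing(pipeline_id, ranges = RANGES)
  ranges.select { |_, range| range&.cover?(pipeline_id) }.keys
end

partitions_containing(999)   # => [100]
partitions_containing(1_000) # => [101]
partitions_containing(5_000) # => []
```

A partition with a NULL range never matches, which is why the lookup keeps a full cross-partition scan as its last step.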

Screenshots or screen recordings

gitlabhq_dblab=# explain (analyze, buffers) SELECT "ci_partitions"."id" FROM "ci_partitions" WHERE "ci_partitions"."pipelines_id_range" @> CAST(2292945204 AS int8) ;
                                                                           QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using check_ci_partitions_pipelines_id_range_no_overlap on ci_partitions  (cost=0.12..3.14 rows=1 width=8) (actual time=0.537..0.538 rows=1 loops=1)
   Index Cond: (pipelines_id_range @> '2292945204'::bigint)
   Buffers: shared hit=1 read=1
   I/O Timings: shared read=0.488
 Planning:
   Buffers: shared hit=3 read=1
   I/O Timings: shared read=1.149
 Planning Time: 1.278 ms
 Execution Time: 0.569 ms
(9 rows)

How to set up and validate locally

In a Rails console:

# Pick a pipeline whose partition has a populated pipelines_id_range
partition = Ci::Partition.where.not(pipelines_id_range: nil).first
pipeline = Ci::Pipeline.where(partition_id: partition.id).first
raise "Need a pipeline in a range-tracked partition" unless pipeline

# 1. Sanity check: fast path (current partition) returns the pipeline
ActiveRecord::Base.logger = Logger.new($stdout)
result = Ci::Pipeline.find_by_id(pipeline.id)
raise "fast path failed" unless result == pipeline
puts "OK: fast path"

# 2. Simulate current partition miss to exercise the range lookup.
#    Stub Ci::Partition.current to point at a non-existent id so step 1 fails.
Ci::Partition.singleton_class.alias_method(:_current_orig, :current)
Ci::Partition.define_singleton_method(:current) { Ci::Partition.new(id: -1) }

begin
  result = Ci::Pipeline.find_by_id(pipeline.id)
  raise "range path failed" unless result == pipeline
  puts "OK: range path"

  # 3. Verify the Arel scope on Ci::Partition
  matched = Ci::Partition.containing_pipeline_id(pipeline.id).to_a
  raise "scope did not match" unless matched.include?(partition)
  puts "OK: containing_pipeline_id scope matches"
ensure
  Ci::Partition.singleton_class.alias_method(:current, :_current_orig)
end

# 4. Verify CommitStatus still uses the original mixin (no regression).
#    Skipped when the partition has no CommitStatus record.
build = CommitStatus.where(partition_id: partition.id).first
if build
  raise "CommitStatus lookup failed" unless CommitStatus.find_by_id(build.id) == build
  puts "OK: CommitStatus.find_by_id"
end

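Expected output: four lines starting with OK: (the fourth is printed only when the chosen partition contains a CommitStatus record).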

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Marius Bobin
