Try to find pipeline/jobs in the current partition
What does this MR do and why?
This MR introduces a Ci::PartitionableFinder concern that optimizes lookups for CI partitioned records (Ci::Pipeline and CommitStatus/jobs) by first attempting to find them in the current partition before falling back to a full cross-partition scan.
Problem
Several high-frequency Sidekiq workers query CI partitioned tables (p_ci_pipelines, p_ci_builds) without a partition_id, causing full cross-partition scans. As the number of CI partitions grows, these non-pruned queries become progressively more expensive. This was identified as a primary contributor to LockManager LWLock saturation on CI replicas (see incident INC-8367).
Solution
Introduces a new Ci::PartitionableFinder concern with a find_by_id class method that:
- Resolves the current partition ID via Ci::Partition.current
- First queries with partition_id scoped to the current partition (enabling partition pruning and avoiding cross-partition scans)
- Falls back to a full table scan only if the record is not found in the current partition (e.g. for older records in previous partitions)
This concern is included in both Ci::Pipeline and CommitStatus (the base class for all CI jobs/builds).
The behavior is gated behind the ci_partitionable_finder feature flag (gitlab_com_derisk type, disabled by default).
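The lookup order described above can be illustrated with a minimal plain-Ruby sketch. This is not the code in this MR: the real concern works against ActiveRecord and checks the ci_partitionable_finder flag itself. Here PartitionableFinderSketch, FakePipelineStore, lookup, and the current_partition_id: and flag_enabled: keyword arguments are all hypothetical stand-ins, and an in-memory array plays the role of the partitioned table.

```ruby
# Hypothetical stand-in for the concern; only the lookup order mirrors the MR.
module PartitionableFinderSketch
  # Tries the current partition first, then falls back to all partitions.
  # current_partition_id stands in for the value of Ci::Partition.current.
  def find_by_id(id, current_partition_id:, flag_enabled: true)
    if flag_enabled && current_partition_id
      record = lookup(id: id, partition_id: current_partition_id)
      return record if record
    end

    # Fallback: no partition_id, so every partition has to be considered.
    lookup(id: id)
  end
end

# In-memory fake that records which kind of query each lookup would issue.
class FakePipelineStore
  include PartitionableFinderSketch

  Record = Struct.new(:id, :partition_id)

  attr_reader :queries

  def initialize(records)
    @records = records
    @queries = []
  end

  # Stands in for a database lookup with or without a partition_id filter.
  def lookup(id:, partition_id: nil)
    @queries << (partition_id ? :pruned : :cross_partition)
    @records.find do |r|
      r.id == id && (partition_id.nil? || r.partition_id == partition_id)
    end
  end
end

store = FakePipelineStore.new([
  FakePipelineStore::Record.new(1, 100), # older partition
  FakePipelineStore::Record.new(2, 101)  # current partition
])

store.find_by_id(2, current_partition_id: 101) # happy path: 1 pruned query
store.find_by_id(1, current_partition_id: 101) # fallback: pruned, then cross-partition
p store.queries # prints [:pruned, :pruned, :cross_partition]
```

With flag_enabled: false the sketch skips the optimistic attempt and issues only the cross-partition lookup, which corresponds to the behavior while the feature flag is disabled.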
Performance impact
- Happy path (record in current partition): 1 query with partition pruning instead of a full cross-partition scan
- Fallback path (record in an older partition): 2 queries, the optimistic pruned lookup followed by the same cross-partition scan as before
Changelog: fixed
References
- Related issue: #593701
- Rollout issue: #593873
- Incident review: https://gitlab.com/gitlab-com/gl-infra/production/-/work_items/21534#note_3158996366
- CI partitioning design doc: https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/ci_data_decay/pipeline_partitioning/
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.