Spike: Deduplicate ci_build_needs
Problem
The `ci_build_needs` table stores a record for every needs entry of every job. There is huge potential in deduplicating this data.
Proposal
Evaluate whether needs could be stored in `Ci::JobDefinition`, but take into consideration that job definitions could be deleted for archived pipelines/partitions.
Maybe we need a similar model for intrinsic (long-term) data that is immutable and deduplicatable. The need for this model has been identified already. See Spike: Deduplicate `ci_build_sources` (#565806) for example.
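To make the deduplication idea concrete, here is a minimal sketch of content-addressed storage for needs, assuming identical needs lists can be shared between jobs. `NeedsStore`, `intern`, and the `name`/`artifacts` attributes are hypothetical names used only for illustration, and the in-memory hash stands in for a real table. Treating the order of needs entries as irrelevant is also an assumption that the spike would have to confirm.

```ruby
require 'digest'
require 'json'

# Hypothetical in-memory stand-in for a deduplicated needs table.
# This is not GitLab's actual model; it only illustrates the technique.
class NeedsStore
  attr_reader :rows

  def initialize
    @rows = {} # checksum => normalized needs list
  end

  # Stores a needs list once per unique content and returns its checksum.
  # A job would reference this checksum instead of owning its own rows.
  def intern(needs)
    # Normalize key type, key order, and entry order so that equivalent
    # lists serialize identically (assumption: order carries no meaning).
    normalized = needs
      .map { |need| need.transform_keys(&:to_s).sort.to_h }
      .sort_by { |need| need['name'].to_s }

    checksum = Digest::SHA256.hexdigest(JSON.generate(normalized))
    @rows[checksum] ||= normalized
    checksum
  end
end

store = NeedsStore.new
job_a = store.intern([{ name: 'build', artifacts: true }, { name: 'lint', artifacts: false }])
job_b = store.intern([{ artifacts: false, name: 'lint' }, { name: 'build', artifacts: true }])
job_c = store.intern([{ name: 'build', artifacts: false }])
```

In this sketch `job_a` and `job_b` resolve to the same checksum, so only two rows are stored for three jobs. Because rows are immutable and keyed by content, such a model would also need its own retention rules if it must outlive job definitions that are deleted for archived pipelines/partitions.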
Investigation
- Assess if there are any columns we should move into a new model `Ci::JobInfo`. Do they have to be indexed? Are they immutable? Can they be part of a jsonb column like `p_ci_job_definitions.config`?
- Assess the cost of refactoring in terms of complexity and risks.
Expected outcomes
- Investigation results outlining why we cannot/should not deduplicate this data, OR
- A POC MR with the proposed implementation. We already have a POC for `Ci::JobInfo` (Draft: POC Deduplicate intrinsic immutable data... (!211540)), so we could build on it for this spike issue.