Spike: Deduplicate ci_build_names into one of the deduplicated tables

Problem

ci_build_names is used today for filtering jobs by name. Since this data is duplicated and high-growth (because it is associated with ci_builds), we should look for ways to deduplicate it while keeping queries efficient.

Proposal

If the build name is something we can delete after pipeline archival, Ci::JobDefinition is a good place for it. Today we have a retention policy for ci_build_names, but this requirement may change in the future.

Otherwise, we need to use Ci::JobInfo (a table similar to job definition) for intrinsic data: Create `Ci::JobInfo` model for intrinsic dedupl... (#567709 - closed)
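
To make the idea concrete, here is a minimal in-memory sketch of deduplicating names into a shared row and still filtering jobs by name. It is not the actual Ci::JobDefinition or Ci::JobInfo implementation: `JobInfoStore`, its methods, the `(project_id, checksum)` key, and the sample data are all hypothetical and exist only for illustration.

```ruby
require "digest"

# Hypothetical stand-in for a deduplicated table. One row is stored per
# distinct (project_id, name) pair, keyed by a checksum of the name.
class JobInfoStore
  Row = Struct.new(:id, :project_id, :checksum, :name)

  def initialize
    @rows_by_key = {}
    @next_id = 1
  end

  # Returns the existing row for this name, or creates it once.
  def find_or_create(project_id:, name:)
    key = key_for(project_id, name)
    @rows_by_key[key] ||= begin
      row = Row.new(@next_id, project_id, key.last, name)
      @next_id += 1
      row
    end
  end

  # Looks up a row without creating it; returns nil when absent.
  def find(project_id:, name:)
    @rows_by_key[key_for(project_id, name)]
  end

  def row_count
    @rows_by_key.size
  end

  private

  def key_for(project_id, name)
    [project_id, Digest::SHA256.hexdigest(name)]
  end
end

# Filtering by name becomes: resolve the name to one row, then select the
# builds that reference that row.
def build_ids_named(store, build_to_info, project_id:, name:)
  info = store.find(project_id: project_id, name: name)
  return [] unless info

  build_to_info.select { |_build_id, info_id| info_id == info.id }.keys
end

# Sample data (made up): four builds, three distinct (project, name) pairs.
builds = [
  { id: 1, project_id: 10, name: "rspec" },
  { id: 2, project_id: 10, name: "rspec" },
  { id: 3, project_id: 10, name: "rubocop" },
  { id: 4, project_id: 11, name: "rspec" }
]

store = JobInfoStore.new
build_to_info = builds.to_h do |build|
  row = store.find_or_create(project_id: build[:project_id], name: build[:name])
  [build[:id], row.id]
end
```

In this sketch the four builds produce only three stored rows, and a name filter needs one lookup by checksum plus a match on the referencing id. Whether the real table would need an index on the name itself (for example, for partial-match search) is one of the open questions of this spike.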

Investigation

  1. Assess whether there are any columns we should move into a new model, Ci::JobInfo. Do they have to be indexed? Are they immutable? Can they be part of a jsonb column like p_ci_job_definitions.config?

  2. Assess the cost of refactoring in terms of complexity and risks.

Expected outcomes

  1. Investigation results outlining why we cannot/should not deduplicate this data, OR

  2. A POC MR with the proposed implementation. We already have a POC for Ci::JobInfo (Draft: POC Deduplicate intrinsic immutable data... (!211540)), so we could build on it for this spike issue.

Edited by Leaminn Ma