Packages changes to support the ci_* database decomposition

Summary

This is the resulting issue from the analysis of the impact of the database composition in the Package "codebase".

Before the decomposition (eg. the ci_* tables will be in a different table), we will need 3 things:

  1. Remove the foreign keys from the build info tables to the ci_pipelines table
    • For future quick reference, the foreign keys are defined:
      ALTER TABLE ONLY packages_build_infos
          ADD CONSTRAINT fk_rails_17a9a0dffc FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE SET NULL;
      ALTER TABLE ONLY packages_package_file_build_infos
          ADD CONSTRAINT fk_rails_3e3f630188 FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE SET NULL;
  2. When a pipeline is deleted a the pipeline id in the build infos table is "nullified" (that's the ON DELETE SET NULL part on the FKs above).
    • To have that behavior back, we will need two has_many X, dependent: :nullify in the Ci::Pipeline model. Why two? because we have two build infos objects: one at the package level and one at the package file level.
  3. remove the has_many :pipelines, through: :build_infos statement.
    • Replace it with a custom function that will break the join query in two:
         def pipelines
           Ci::Pipeline.id_in(build_infos.pluck(:pipeline_id))
         end
    • Investigate a bit if we really need to return all pipelines or if the first one is enough.
    • Alternative (from #338660 (closed)): we can ask rails to disable the join query when dealing with a has_X through: association (see !67834 (comment 647234358)). The resulting SQL is similar to the custom pipelines idea but would more readable.
  4. Update the .includes to .preload in order to avoid rails selecting using a LEFT OUTER JOIN that can happen with .includes. See #336166 (comment 655164671).

Improvements

By removing the "direct" dependencies to the ci_pipeline, the Package code will not be impacted by the CI database decomposition.

Risks

Build infos objects are used in:

The risks are thus non trivial, although they should not impact the Package managers API.

Those risks are mitigated by the extensive test suite we have.

Involved components

Optional: Intended side effects

  • Fetching all pipelines from a package object should trigger 2 queries (pluck + pipelines load) instead of a single one.
Edited by Steve Abrams