Packages changes to support the ci_* database decomposition
Summary
This is the resulting issue from the analysis of the impact of the database composition in the Package "codebase".
Before the decomposition (eg. the ci_* tables will be in a different table), we will need 3 things:
-
Remove the foreign keys from the build info tables to the ci_pipelinestable- For future quick reference, the foreign keys are defined:
ALTER TABLE ONLY packages_build_infos ADD CONSTRAINT fk_rails_17a9a0dffc FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE SET NULL;ALTER TABLE ONLY packages_package_file_build_infos ADD CONSTRAINT fk_rails_3e3f630188 FOREIGN KEY (pipeline_id) REFERENCES ci_pipelines(id) ON DELETE SET NULL; -
When a pipeline is deleted a the pipeline id in the build infos table is "nullified" (that's the ON DELETE SET NULLpart on the FKs above).- To have that behavior back, we will need two
has_many X, dependent: :nullifyin theCi::Pipelinemodel. Why two? because we have two build infos objects: one at the package level and one at the package file level.
- To have that behavior back, we will need two
-
remove the has_many :pipelines, through: :build_infosstatement.- Replace it with a custom function that will break the join query in two:
def pipelines Ci::Pipeline.id_in(build_infos.pluck(:pipeline_id)) end - Investigate a bit if we really need to return all pipelines or if the first one is enough.
- Alternative (from #338660 (closed)): we can ask rails to disable the join query when dealing with a
has_X through:association (see !67834 (comment 647234358)). The resulting SQL is similar to the custompipelinesidea but would more readable.
- Replace it with a custom function that will break the join query in two:
-
Update the .includesto.preloadin order to avoid rails selecting using aLEFT OUTER JOINthat can happen with.includes. See #336166 (comment 655164671).
Improvements
By removing the "direct" dependencies to the ci_pipeline, the Package code will not be impacted by the CI database decomposition.
Risks
Build infos objects are used in:
The risks are thus non trivial, although they should not impact the Package managers API.
Those risks are mitigated by the extensive test suite we have.
Involved components
- https://gitlab.com/gitlab-org/gitlab/-/blob/1ad294be8b96c45060aa6097375b78b33fce8018/app/models/packages/package.rb
- https://gitlab.com/gitlab-org/gitlab/-/blob/1ad294be8b96c45060aa6097375b78b33fce8018/app/models/packages/package_file.rb
- https://gitlab.com/gitlab-org/gitlab/-/blob/1ad294be8b96c45060aa6097375b78b33fce8018/app/models/ci/pipeline.rb
Optional: Intended side effects
- Fetching all pipelines from a package object should trigger 2 queries (pluck + pipelines load) instead of a single one.
Edited by Steve Abrams