Fix Security feature ::Ci::Pipeline.with_vulnerabilities cross-joining between CI and non-CI tables
From !62092 (closed) we discovered ::Ci::Pipeline.with_vulnerabilities
joins between non-CI and CI tables. This will not be possible when we move all ci_*
tables to another database.
The query in question is:
SELECT "ci_pipelines".* FROM "ci_pipelines"
WHERE (EXISTS (
SELECT 1 FROM "vulnerability_occurrence_pipelines"
WHERE (ci_pipelines.id=vulnerability_occurrence_pipelines.pipeline_id)))
Options
- Is it possible to rename
vulnerability_occurrence_pipelines
toci_vulnerability_occurrence_pipelines
since it seems closely coupled toci
stuff? Is this table joined elsewhere outside ofci_*
? If not it may be easier to just rename it and it will move with all other CI tables - Is it possible to remove or somehow refactor this code so that this isn't necessary?
- When this scope is used is it possible to pluck all the previously scoped pipeline ids into memory so that we don't need to actually do a join?
- Is there a new table we could create to store the relevant data we need to construct these queries? It will be de-normalized and storing redundant info that is on the other tables, can be written to only after the data needs to be stored (eg. after a build finishes) and can be written in sidekiq and then it will allow much faster querying for the specific data we need for the feature.
Sharding team recommended option
See #336590 (closed) as it looks like we may very well just use the same solution for several similar problems.
Implementation plan:
-
Remove ::Ci::Pipeline.with_vulnerabilities
scope and related specs
Edited by Subashis Chakraborty