Skip to content

Fix Security feature ::Ci::Pipeline.with_vulnerabilities cross-joining between CI and non-CI tables

From !62092 (closed) we discovered ::Ci::Pipeline.with_vulnerabilities joins between non-CI and CI tables. This will not be possible when we move all ci_* tables to another database.

The query in question is:

SELECT "ci_pipelines".* FROM "ci_pipelines"
WHERE (EXISTS (
  SELECT 1 FROM "vulnerability_occurrence_pipelines"
  WHERE (ci_pipelines.id=vulnerability_occurrence_pipelines.pipeline_id)))

Options

  1. Is it possible to rename vulnerability_occurrence_pipelines to ci_vulnerability_occurrence_pipelines since it seems closely coupled to ci stuff? Is this table joined elsewhere outside of ci_*? If not it may be easier to just rename it and it will move with all other CI tables
  2. Is it possible to remove or somehow refactor this code so that this isn't necessary?
  3. When this scope is used is it possible to pluck all the previously scoped pipeline ids into memory so that we don't need to actually do a join?
  4. Is there a new table we could create to store the relevant data we need to construct these queries? It will be de-normalized and storing redundant info that is on the other tables, can be written to only after the data needs to be stored (eg. after a build finishes) and can be written in sidekiq and then it will allow much faster querying for the specific data we need for the feature.

Sharding team recommended option

See #336590 (closed) as it looks like we may very well just use the same solution for several similar problems.

Implementation plan:

Edited by Subashis Chakraborty