Decouple `Sbom::Ingestion::Tasks::IngestOccurrencesVulnerabilities` and the `vulnerability_finding_pipelines` table
Background Context
As part of the epic to delete the `vulnerability_finding_pipelines` table, we need to migrate any application code using that table to a new query.
This issue
The `Sbom::Ingestion::Tasks::IngestOccurrencesVulnerabilities` task ends up using this table when it calls `occurrence_map.vulnerability_ids`. That method returns an array of vulnerability IDs, which are provided by this query.
The `Sbom::OccurrencesVulnerability` being populated in this task is ultimately only used in this API call.
Implementation Plan
We considered just updating the API call to directly query `vulnerability_occurrences`, similar to what we did in MR 162713.
However, that change will take longer and is a bit tangential to the task at hand (dropping the `vulnerability_finding_pipelines` table).
So, we will instead add an `id` aggregation in the occurrence ingestion task:
@@ -124,6 +130,7 @@ def build_vulnerabilities_info
occurrence_maps.name,
occurrence_maps.version,
occurrence_maps.path,
+ array_agg(vulnerability_occurrences.vulnerability_id) as vulnerability_ids,
MAX(vulnerability_occurrences.severity) as highest_severity,
COUNT(vulnerability_occurrences.id) as vulnerability_count
SQL
That will end up giving us `string` values that look like this in the query result:
'{1,2,3,4,5}'
We can parse that back out to Ruby integers via:
'{1,2,3,4,5}'
.gsub(/[{}]/, '')
.split(',')
.map(&:to_i)
NOTE: This assumes we will not have `NULL` values or non-integer values in the result.
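If that assumption turns out not to hold, a slightly more defensive parse may be worth considering. The sketch below is only an illustration: `parse_pg_int_array` is a hypothetical helper name, not something that exists in the codebase. It drops `NULL` entries (which `array_agg` does include in its output when the aggregated column is null) and raises on anything that is not an integer rather than silently coercing it to `0` the way `to_i` would.

```ruby
# Hypothetical helper (not part of the existing code): parses a PostgreSQL
# integer-array literal such as '{1,2,3}' into an array of Ruby integers.
def parse_pg_int_array(literal)
  # Treat a missing value as "no vulnerabilities".
  return [] if literal.nil?

  literal
    .delete('{}')                                     # strip the surrounding braces
    .split(',')                                       # one token per array element
    .map(&:strip)
    .reject { |token| token.casecmp?('NULL') }        # skip NULL elements
    .map { |token| Integer(token, 10) }               # raises ArgumentError on non-integers
end
```

Whether to skip `NULL` entries or treat them as an error is a design choice; skipping is shown here only because a vulnerability ID list with gaps is still usable downstream.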
We can then modify the `occurrence_map` class to make `vulnerability_ids` a full `attr_accessor` (as opposed to the current `attr_reader`) and then pass these parsed values in:
@@ -72,6 +78,8 @@ def attributes
project
)
+ occurrence_map.vulnerability_ids = vulnerability_data.vulnerability_ids
+
new_attributes = {
project_id: project.id,
pipeline_id: pipeline.id,
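To make the accessor change concrete, here is a minimal, self-contained sketch of the idea. `OccurrenceMapSketch` and `VulnerabilityDataSketch` are hypothetical stand-ins invented for illustration; the real classes carry far more state, and the real assignment happens inside the `attributes` method shown in the diff above.

```ruby
# Hypothetical stand-in for the real occurrence map class; only the
# vulnerability_ids attribute is modelled here.
class OccurrenceMapSketch
  # attr_accessor (rather than attr_reader) lets an earlier ingestion task
  # write the IDs that a later task will read.
  attr_accessor :vulnerability_ids

  def initialize
    @vulnerability_ids = []
  end
end

# Hypothetical stand-in for the per-occurrence row built from the SQL query,
# with the array literal already parsed into integers.
VulnerabilityDataSketch = Struct.new(:vulnerability_ids, keyword_init: true)

# Mirrors the assignment added in the diff: copy the parsed IDs onto the map.
def assign_vulnerability_ids(occurrence_map, vulnerability_data)
  occurrence_map.vulnerability_ids = vulnerability_data.vulnerability_ids
  occurrence_map
end
```

A later task can then read `occurrence_map.vulnerability_ids` without touching the table being removed, provided the populating task has already run.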
The downstream[^1] `Sbom::Ingestion::Tasks::IngestOccurrencesVulnerabilities` task will then work properly, regardless of the state of the `deprecate_vulnerability_occurrence_pipelines` feature flag.
/cc @bwill @nmccorrison
[^1]: By "downstream" here I mean its code gets executed after the task where we will now be populating the data. See the task order in `Sbom::Ingestion::IngestReportSliceService`. This increase in inter-task coupling and order dependence is part of what makes this solution a bit janky.