Ingest vulnerabilities from multiple projects at once

Problem to solve

During the first iteration of Continuous Vulnerability Scans (&9534 (closed)), it was pointed out that the existing logic for creating Vulnerabilities was optimized for Security Report ingestion and does not fit well with the workflow of continuous scans.

Indeed, the current workflow is based on the intent of creating several vulnerabilities for the same pipeline and the same project when ingesting reports. In the context of Continuous Vulnerability Scans, we instead need to create the same vulnerabilities for many projects at the same time.

To limit the scope of change and deliver quickly, we worked within this constraint, but the TI team pointed out (see relevant links) that there could be a performance issue. We decided to proceed with the launch as an Experiment and to validate the approach by performance testing this service for CVS.

The performance testing concluded that the vulnerability creation rate is a limiting factor: it prevents enabling CVS globally for all GitLab Ultimate projects and blocks making CVS generally available.

Impact of not doing this change

Using the existing service as-is could potentially have the following impact:

CVS scanning jobs (Sidekiq) could run for too long and time out (the risk is also higher for container scanning than for dependency scanning).
Sidekiq jobs could be enqueued at a higher rate than they are completed, making the CVS feature useless.
The pressure on the vulnerability-related tables would increase and likely cause more locks, which would impact the performance of other ingestion services (like the regular ingestion of security reports).

Domain(s) involved

This work requires domain knowledge specific to vulnerability ingestion and the related DB structure. As pointed out in our delineation doc, this domain belongs to group::threat insights.

The Composition Analysis team understands the work to be done but has limited knowledge of the data structure and the components that must be updated to achieve this goal. We are also less knowledgeable about the potential impact of the changes we are making and thus we heavily rely on TI engineers when doing them. This has been pointed out multiple times during the design and reviews of the first iteration of CVS.

Additional details

When a new advisory is ingested by the backend and when it's recent (i.e. published less than 14 days ago), this triggers an AdvisoryScanWorker job for the advisory. The worker delegates to the VulnerabilityScanning::AdvisoryScanner which collects affected SBOM occurrences, and calls VulnerabilityScanning::CreateVulnerabilityService for each of them. Right now CreateVulnerabilityService directly delegates to the Security::Ingestion::IngestReportSliceService.

However, IngestReportSliceService operates on a single pipeline. Because of this limitation, it's called for every single vulnerability detected by an AdvisoryScanWorker job. A single job might need to create thousands of vulnerabilities that way, at a very low pace (i.e. hundreds of advisories per minute).
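The call pattern described above can be sketched as follows. This is only an illustration: Occurrence, SinglePipelineIngestion, and scan_advisory are hypothetical stand-ins, not the actual GitLab classes, and the sketch only counts service invocations rather than touching a database.

```ruby
# Hypothetical stand-ins; these are not the actual GitLab classes.
Occurrence = Struct.new(:project_id, :pipeline_id, :component)

# Stand-in for a slice service that is bound to a single pipeline.
class SinglePipelineIngestion
  @calls = 0

  class << self
    attr_reader :calls

    # Counts invocations and pretends every finding map becomes a vulnerability.
    def execute(pipeline_id, finding_maps)
      @calls += 1
      finding_maps.size
    end

    def reset!
      @calls = 0
    end
  end
end

# Mirrors the flow described above: one service call per affected occurrence.
def scan_advisory(occurrences)
  occurrences.sum do |occurrence|
    finding_map = { project_id: occurrence.project_id, component: occurrence.component }
    SinglePipelineIngestion.execute(occurrence.pipeline_id, [finding_map])
  end
end

occurrences = Array.new(1_000) { |i| Occurrence.new(i, i, "rack") }
created = scan_advisory(occurrences)
puts "created=#{created} service_calls=#{SinglePipelineIngestion.calls}"
# prints "created=1000 service_calls=1000"
```

In this sketch the number of service calls grows linearly with the number of affected occurrences, so any fixed per-call overhead is paid once per vulnerability.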

Internally, IngestReportSliceService relies on many ingestion tasks that inherit from AbstractTask, which is initialized with a single pipeline.

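module Security
  module Ingestion
    class AbstractTask
      # Convenience entry point: builds the task and runs it.
      def self.execute(pipeline, finding_maps)
        new(pipeline, finding_maps).execute
      end

      # Every task is bound to exactly one pipeline for all finding maps.
      def initialize(pipeline, finding_maps)
        @pipeline = pipeline
        @finding_maps = finding_maps
      end

      # ... (rest of the class omitted)
    end
  end
end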

To be efficient, we need a new ingestion service that considers the project and pipeline specific to the finding map being ingested.

The VulnerabilityScanning::AdvisoryScanner and the VulnerabilityScanning::CreateVulnerabilityService need to be adjusted.

NOTE: CVS uses its own VulnerabilityScanning::FindingMap class, which behaves like a finding map.
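One possible shape for such a task is sketched below, assuming each finding map carries its own pipeline (and, through it, its project). Pipeline, FindingMap, and MultiProjectTask are hypothetical names introduced for illustration only; the returned array of row attributes stands in for a single bulk insert and is not how the real ingestion tasks are implemented.

```ruby
# Hypothetical sketch; names and structure are assumptions, not GitLab code.
Pipeline = Struct.new(:id, :project_id)

FindingMap = Struct.new(:pipeline, :uuid) do
  # Each finding map resolves its own project through its pipeline.
  def project_id
    pipeline.project_id
  end
end

class MultiProjectTask
  def self.execute(finding_maps)
    new(finding_maps).execute
  end

  # No pipeline argument: the context is read from each finding map.
  def initialize(finding_maps)
    @finding_maps = finding_maps
  end

  # Builds one batch of row attributes spanning every project,
  # standing in for a single bulk insert.
  def execute
    @finding_maps.map do |finding_map|
      {
        uuid: finding_map.uuid,
        project_id: finding_map.project_id,
        pipeline_id: finding_map.pipeline.id
      }
    end
  end
end

finding_maps = [
  FindingMap.new(Pipeline.new(10, 1), "uuid-a"),
  FindingMap.new(Pipeline.new(20, 2), "uuid-b")
]
rows = MultiProjectTask.execute(finding_maps)
puts rows.map { |row| row[:project_id] }.inspect
# prints "[1, 2]"
```

Compared with a task initialized with one pipeline, this shape lets a single call cover finding maps from many projects, which is the property the new service needs.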

Relevant links

Requirements

Implement a new ingestion service that takes pairs of pipelines and finding maps
The service must create vulnerabilities at nearly the same rate as when ingesting security reports
Update the service implemented in Add service to match new advisory against the S... (#371065 - closed)

Verification steps

Repeat some of the verification steps performed in Add service to match new advisory against the S... (#371065 - closed).

/cc @minac @hacks4oats

Edited Oct 24, 2023 by Fabien Catteau