CVS on SBOM change

Introduction

Currently, Continuous Vulnerability Scanning (CVS) is triggered only when a new advisory is detected. The Rails backend finds projects with potentially affected packages and, if their versions are indeed affected, creates the corresponding vulnerabilities.

However, we do not create a vulnerability when:

  • the version of a package changes and the new version is affected by existing advisories
  • a new package is added that is affected by existing advisories

Proposal

Vulnerabilities are added to or removed from a project whenever its SBOMs change. We can achieve this by triggering CVS on SBOM changes.

Current SBOM ingestion flow

Currently, when a pipeline transitions to one of the completed_with_manual_statuses ([:success, :failed, :canceled, :skipped, :manual]) and the conditions shown in the diagram are fulfilled, ::Sbom::IngestReportsWorker schedules ::Sbom::Ingestion::IngestReportsService for execution.

flowchart TD
    A[EE::Ci::Pipeline] -->|<b>Transition</b>: completed_with_manual_statuses \nif it's not a child pipeline && \ncan store security reports && \nis the default branch && \ncan ingest sbom reports| B(IngestReportsWorker)
    B --> | Schedules for execution | C[::Sbom::Ingestion::IngestReportsService]

Desired SBOM ingestion flow including CVS

diff

Please note that the following diff is only a first iteration. We could be more clever here: before ingesting the report, fetch the existing occurrences (we are probably interested in sbom_component_version_id from the sbom_occurrences table), then ingest the new SBOM report and keep the ingested_ids. The ingested_ids map to the id column of the sbom_occurrences table. We can then compare the previous occurrences with the new ones, keep only the ids of occurrences that actually changed, and include only those in the Sbom::IngestedSbomEvent. This would minimize the amount of data that needs to be processed.
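The comparison described above could be sketched as follows. This is a minimal, in-memory illustration and not the ingestion code itself: the Occurrence struct and the changed_occurrence_ids helper are hypothetical names, and the sketch assumes that an occurrence counts as changed when its sbom_component_version_id was not present before ingestion.

```ruby
require 'set'

# Hypothetical, simplified stand-in for a row of the sbom_occurrences table.
Occurrence = Struct.new(:id, :component_version_id, keyword_init: true)

# previous_version_ids: component version ids recorded before ingestion.
# ingested: occurrences returned by the new ingestion (the ingested_ids rows).
# Returns the ids of occurrences whose component version was not seen before.
def changed_occurrence_ids(previous_version_ids, ingested)
  seen = previous_version_ids.to_set
  ingested.reject { |occurrence| seen.include?(occurrence.component_version_id) }
          .map(&:id)
end

ingested = [
  Occurrence.new(id: 1, component_version_id: 100), # unchanged
  Occurrence.new(id: 2, component_version_id: 111), # version changed
  Occurrence.new(id: 3, component_version_id: 120)  # newly added package
]

changed_occurrence_ids([100, 110], ingested) # => [2, 3]
```

Only the returned ids would then need to be carried in the event payload.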

Please also note that we should probably publish the Sbom::IngestedSbomEvent if and only if CVS is enabled. We probably also need to check whether the related feature flag is enabled.
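A rough sketch of that guard is shown below. ToyEventStore is a toy stand-in and not the Gitlab::EventStore API, and cvs_enabled and feature_enabled are hypothetical booleans standing in for the CVS setting and the feature flag check, whose real names are not specified here.

```ruby
# Toy stand-in for an event store; it only records what was published.
class ToyEventStore
  attr_reader :published

  def initialize
    @published = []
  end

  def publish(event)
    @published << event
  end
end

# Hypothetical event payload carrying the pipeline id.
IngestedSbomEvent = Struct.new(:pipeline_id, keyword_init: true)

# Publishes the event only when both conditions hold.
# Returns true if the event was published, false otherwise.
def publish_ingested_sbom_event(store, pipeline_id:, cvs_enabled:, feature_enabled:)
  return false unless cvs_enabled && feature_enabled

  store.publish(IngestedSbomEvent.new(pipeline_id: pipeline_id))
  true
end
```

An alternative would be to publish unconditionally and put the condition on the subscription, as the elasticsearch_indexing? subscription in the diff below does with its if: option.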

SbomIngestionScanWorker knows the components and their versions because the pipeline id is in the event (we can connect the pipeline id to sbom_occurrences). Based on that, it can find which components are affected by existing advisories by querying pm_affected_packages for rows that match the package name and purl type and whose affected_range includes the version. Finally, it can create the vulnerabilities.

flowchart TD
    A[EE::Ci::Pipeline] -->|<b>2.Transition</b>: completed_with_manual_statuses \nif it's not a child pipeline && \ncan store security reports && \nis the default branch && \ncan ingest sbom reports| B(IngestReportsWorker)
    B --> | <b>3.</b>Schedules for execution | C[::Sbom::Ingestion::IngestReportsService]
    C --> | <b>4.</b>Ingests SBOM report\n<b>5.</b>Publishes event SbomIngestedEvent | D[Gitlab::EventStore]
    E[SbomIngestionScanWorker]--> | <b>1.</b>Subscribes to SbomIngestedEvent| D
    D --> |<b>6.</b>triggers| E

Related issues and resources

Match SBOM components to known advisories (#371055 - closed)

Sbom Related tables

Implementation plan

Subscription
diff --git a/ee/lib/ee/gitlab/event_store.rb b/ee/lib/ee/gitlab/event_store.rb
index cf1d9512ae37..0c94238f5ad4 100644
--- a/ee/lib/ee/gitlab/event_store.rb
+++ b/ee/lib/ee/gitlab/event_store.rb
@@ -42,6 +42,7 @@ def configure!(store)
             if: ->(_) { ::Gitlab::CurrentSettings.elasticsearch_indexing? }
           store.subscribe ::Search::Zoekt::DefaultBranchChangedWorker, to: ::Repositories::DefaultBranchChangedEvent
           store.subscribe ::PackageMetadata::GlobalAdvisoryScanWorker, to: ::PackageMetadata::IngestedAdvisoryEvent
+          store.subscribe ::Sbom::SbomChangeWorker, to: ::Sbom::IngestedSbomEvent
           store.subscribe ::Llm::NamespaceAccessCacheResetWorker, to: ::NamespaceSettings::AiRelatedSettingsChangedEvent
           store.subscribe ::Llm::NamespaceAccessCacheResetWorker, to: ::Members::MembersAddedEvent
           store.subscribe ::Security::RefreshProjectPoliciesWorker,

Copied from Match SBOM components to known advisories (#371055 - closed)

  • Add Gitlab::VulnerabilityScanning::PackageAdvisories class. This class will be called by the newly introduced service.
    • Input: Array of objects that respond to purl_type, name, and version.
      • Names include the namespace.
      • Names are normalized.
    • Fetch PackageMetadata::AffectedPackage models matching the purl_type and name.
      • Preload the advisory field to prevent N+1 queries.
    • Filter out advisories whose affected range excludes the version.
    • Output: Array of objects with purl_type, name, version, and advisories.
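The plan above could be sketched in plain Ruby as follows. This is an in-memory illustration under stated assumptions, not the class from the MR: Component, AffectedPackage, range_includes? and match_advisories are hypothetical names, the advisory id is made up, and Gem::Requirement is used only as a stand-in for the real affected range matching, which may follow different version semantics per purl type.

```ruby
require 'rubygems'

# Hypothetical stand-ins for the input objects and for PackageMetadata::AffectedPackage.
Component = Struct.new(:purl_type, :name, :version, keyword_init: true)
AffectedPackage = Struct.new(:purl_type, :package_name, :affected_range, :advisory,
                             keyword_init: true)

# Assumes a comma-separated range such as ">= 4.0.0, < 4.17.21".
# Gem::Version raises ArgumentError on malformed version strings.
def range_includes?(affected_range, version)
  requirements = affected_range.split(',').map(&:strip)
  Gem::Requirement.new(*requirements).satisfied_by?(Gem::Version.new(version))
end

# Returns one hash per component with purl_type, name, version and advisories.
def match_advisories(components, affected_packages)
  # Index once by [purl_type, name] so each component is a single lookup.
  index = affected_packages.group_by { |pkg| [pkg.purl_type, pkg.package_name] }

  components.map do |component|
    candidates = index.fetch([component.purl_type, component.name], [])
    advisories = candidates
      .select { |pkg| range_includes?(pkg.affected_range, component.version) }
      .map(&:advisory)

    { purl_type: component.purl_type, name: component.name,
      version: component.version, advisories: advisories }
  end
end

affected = [
  AffectedPackage.new(purl_type: 'npm', package_name: 'lodash',
                      affected_range: '>= 4.0.0, < 4.17.21', advisory: 'ADV-EXAMPLE-1')
]
components = [
  Component.new(purl_type: 'npm', name: 'lodash', version: '4.17.20'),
  Component.new(purl_type: 'npm', name: 'lodash', version: '4.17.21')
]

match_advisories(components, affected).map { |result| result[:advisories] }
# => [["ADV-EXAMPLE-1"], []]
```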

The above plan was implemented in Draft: Add service to match SBOM components and... (!126954 - closed) • Adam Cohen • 16.7. However, we had to postpone that MR due to efficiency concerns.

The crux of the efficiency concern is that a consumer calling Gitlab::VulnerabilityScanning::PackageAdvisories#fetch ends up fetching all of the advisory data at once, with no way of iterating through the results, which could easily lead to a DB query timeout.

In order to solve this, we'll need to change Gitlab::VulnerabilityScanning::PackageAdvisories#fetch from the MR Draft: Add service to match SBOM components and... (!126954 - closed) • Adam Cohen • 16.7 to use each_batch, similar to how this was implemented in Sbom::PossiblyAffectedOccurrencesFinder#execute_in_batches. This will allow consumers of Gitlab::VulnerabilityScanning::PackageAdvisories#fetch to iterate through the result set in batches, thereby reducing the possibility of a DB timeout.
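To illustrate the batching idea, here is a small in-memory sketch. It is not the each_batch implementation used in the codebase: each_batch_of is a hypothetical helper that walks rows in id order and yields fixed-size slices, so that a consumer never holds the full result set at once.

```ruby
# Yields rows in ascending id order, batch_size rows at a time.
# rows is an array of hashes with an :id key, standing in for a table.
def each_batch_of(rows, batch_size:)
  last_id = 0

  loop do
    # Equivalent in spirit to: WHERE id > last_id ORDER BY id LIMIT batch_size
    batch = rows.select { |row| row[:id] > last_id }
                .sort_by { |row| row[:id] }
                .first(batch_size)
    break if batch.empty?

    yield batch
    last_id = batch.last[:id]
  end
end

rows = (1..5).map { |id| { id: id } }
each_batch_of(rows, batch_size: 2) do |batch|
  puts batch.map { |row| row[:id] }.inspect
end
# Prints [1, 2], then [3, 4], then [5]
```

Each iteration corresponds to one bounded query, which is what keeps the individual queries short.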

To the developer who picks up this issue: please start by re-opening Draft: Add service to match SBOM components and... (!126954 - closed) • Adam Cohen • 16.7.

Verification steps

  1. Verify that the performance of the query is acceptable when used in production. See Improve performance of package license query to... (#398679 - closed) and the documentation for examples of optimizations that can further scope the query and make efficient use of the IN operator.

  2. Verify whether any additional change is required to support the existing security tab.

  3. Verify whether any additional change is required to support the existing MR widget feature.

  4. Verify whether duplicates occur when vulnerabilities are also ingested from the dependency scanning report. /cc @zmartins

Edited by Zamir Martins