CVS on SBOM change
Introduction
Currently, CVS is triggered when a new advisory is detected. The Rails backend finds projects with potentially affected packages and, if their versions are indeed affected, creates the respective vulnerabilities.
However, we do not create a vulnerability when:
- the version of a package is changed and hence affected by existing advisories
- a new package is added that is affected by existing advisories
Proposal
Vulnerabilities are added to or removed from a project whenever its SBOMs change. We can achieve this by triggering CVS on SBOM changes.
Current SBOM ingestion flow
Currently, when a pipeline transitions to one of the completed_with_manual_statuses ([:success, :failed, :canceled, :skipped, :manual]) and the conditions shown in the diagram are fulfilled, the ::Sbom::IngestReportsWorker schedules the ::Sbom::Ingestion::IngestReportsService for execution, which in turn will:
- ingest valid SBOM reports and return the ingested ids
- delete SBOM component occurrences that are not present in the SBOM report ingestion (based on the ingested ids)
- set the latest_ingested_sbom_pipeline_id for the particular project in Redis.
flowchart TD
A[EE::Ci::Pipeline] -->|<b>Transition</b>: completed_with_manual_statuses \nif it's not a child pipeline && \ncan store security reports && \nis the default branch && \ncan ingest sbom reports| B(IngestReportsWorker)
B --> | Schedules for execution | C[::Sbom::Ingestion::IngestReportsService]
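The three steps performed by the ingestion service can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: `IngestionSketch`, the `Occurrence` struct, the in-memory `redis` hash, and the Redis key format are hypothetical stand-ins, not the actual implementation of ::Sbom::Ingestion::IngestReportsService.

```ruby
require 'set'

# Hypothetical in-memory stand-in for a row of the sbom_occurrences table.
Occurrence = Struct.new(:id, :project_id, :component, :version)

# Hypothetical sketch of the ingestion steps; not the real service.
class IngestionSketch
  attr_reader :occurrences, :redis

  def initialize(occurrences)
    @occurrences = occurrences
    @redis = {} # plain Hash standing in for Redis
    @next_id = (occurrences.map(&:id).max || 0) + 1
  end

  # report: array of { component:, version: } hashes for one project.
  def execute(project_id:, pipeline_id:, report:)
    ingested_ids = ingest(project_id, report)
    delete_not_present(project_id, ingested_ids)
    # The key format is an assumption made for this sketch.
    redis["latest_ingested_sbom_pipeline_id:#{project_id}"] = pipeline_id
    ingested_ids
  end

  private

  # Step 1: reuse an existing occurrence or create a new one, returning ids.
  def ingest(project_id, report)
    report.map do |entry|
      existing = occurrences.find do |o|
        o.project_id == project_id &&
          o.component == entry[:component] &&
          o.version == entry[:version]
      end
      next existing.id if existing

      occurrence = Occurrence.new(@next_id, project_id, entry[:component], entry[:version])
      @next_id += 1
      occurrences << occurrence
      occurrence.id
    end
  end

  # Step 2: drop the project's occurrences that were not part of this ingestion.
  def delete_not_present(project_id, ingested_ids)
    keep = ingested_ids.to_set
    occurrences.reject! { |o| o.project_id == project_id && !keep.include?(o.id) }
  end
end
```

Occurrences belonging to other projects are left untouched, since the deletion is scoped by both project and ingested ids.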
Desired SBOM ingestion flow including CVS
- SbomIngestionScanWorker is a worker responsible for performing CVS for one project due to an SBOM ingestion event.
- SbomIngestedEvent is an event published by the ::Sbom::Ingestion::IngestReportsService once the SBOM ingestion is completed. More info can be found in Create an event when an SBOM is ingested that w... (#465544 - closed).
diff
Please note that the following diff could be a very first iteration. We could be more clever here: before ingesting the report, fetch the existing report (we are probably interested in sbom_component_version_id from the sbom_occurrences table), then ingest the new SBOM report and keep the ingested_ids. ingested_ids maps to the id column of the sbom_occurrences table. Hence we can compare the previous occurrences with the new ones and keep only the ids of the occurrences that have actually changed. Then we can include only those in the Sbom::IngestedSbomEvent. This would minimize the amount of data that we need to process.
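A minimal sketch of that comparison, assuming the state before and after ingestion can each be captured as a hash of occurrence id to sbom_component_version_id. The method name `changed_occurrence_ids` and the hash shape are hypothetical, chosen only for illustration.

```ruby
# previous: { occurrence_id => sbom_component_version_id } captured before ingestion
# current:  the same mapping for the ingested_ids after ingestion
#
# Returns the ids of occurrences that are new or whose component version changed.
def changed_occurrence_ids(previous, current)
  current.select { |id, component_version_id| previous[id] != component_version_id }.keys
end
```

Occurrences that are unchanged between the two snapshots are skipped, so only new or modified ones would need to be carried in the event.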
Please also note that we probably need to publish the Sbom::IngestedSbomEvent if and only if CVS is enabled. We probably also need to check whether the related feature flag is enabled.
SbomIngestionScanWorker knows the components and their versions since the pipeline id is in the event (we can connect the pipeline id to sbom_occurrences). Based on that, it can find which components are affected by existing advisories by querying pm_affected_packages for the package name and purl type, and checking whether the version is in the affected_range. Finally, it can create the vulnerabilities.
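The matching step could look roughly like the sketch below. It is not the real query: the structs are in-memory stand-ins for sbom_occurrences and pm_affected_packages rows, and the affected_range is assumed to be a comma-separated list of RubyGems-style constraints so that the standard `Gem::Requirement` class can evaluate it. The actual range syntax depends on the package ecosystem.

```ruby
# Hypothetical stand-ins for rows of pm_affected_packages and SBOM components.
AffectedPackage = Struct.new(:purl_type, :package_name, :affected_range, :advisory)
Component = Struct.new(:purl_type, :name, :version)

# Assumption: affected_range looks like ">= 2.0.0, < 2.2.4" (RubyGems syntax).
def affected?(component, affected_package)
  constraints = affected_package.affected_range.split(",").map(&:strip)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(component.version))
end

# Returns { component => [advisories] } for components hit by at least one advisory.
def advisories_for(components, affected_packages)
  # Index by (purl_type, name) so each component only checks relevant candidates.
  index = affected_packages.group_by { |pkg| [pkg.purl_type, pkg.package_name] }

  components.each_with_object({}) do |component, result|
    candidates = index.fetch([component.purl_type, component.name], [])
    matches = candidates.select { |pkg| affected?(component, pkg) }.map(&:advisory)
    result[component] = matches unless matches.empty?
  end
end
```

Each entry in the result would then correspond to one or more vulnerabilities to create for the project.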
flowchart TD
A[EE::Ci::Pipeline] -->|<b>2.Transition</b>: completed_with_manual_statuses \nif it's not a child pipeline && \ncan store security reports && \nis the default branch && \ncan ingest sbom reports| B(IngestReportsWorker)
B --> | <b>3.</b>Schedules for execution | C[::Sbom::Ingestion::IngestReportsService]
C --> | <b>4.</b>Ingests SBOM report\n<b>5.</b>Publishes event SbomIngestedEvent | D[Gitlab::EventStore]
E[SbomIngestionScanWorker]--> | <b>1.</b>Subscribes to SbomIngestedEvent| D
D --> |<b>6.</b>triggers| E
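The publish/subscribe flow in the diagram can be illustrated with a small in-memory sketch. `SketchEventStore` is a hypothetical simplification, not Gitlab::EventStore: it dispatches synchronously, whereas real subscribers run as background jobs. The `cvs_enabled` argument stands in for whatever setting or feature flag check ends up gating the event.

```ruby
# Hypothetical, synchronous stand-in for an event store.
class SketchEventStore
  def initialize
    @subscriptions = Hash.new { |hash, key| hash[key] = [] }
  end

  # Step 1: a worker subscribes to an event class.
  def subscribe(worker, to:)
    @subscriptions[to] << worker
  end

  # Step 6: every subscriber of the event's class is triggered.
  def publish(event)
    @subscriptions[event.class].each { |worker| worker.handle_event(event) }
  end
end

SbomIngestedEvent = Struct.new(:pipeline_id)

class SbomIngestionScanWorker
  attr_reader :scanned_pipeline_ids

  def initialize
    @scanned_pipeline_ids = []
  end

  # A real worker would run CVS here; the sketch only records the pipeline id.
  def handle_event(event)
    scanned_pipeline_ids << event.pipeline_id
  end
end

# Steps 4 and 5: ingest the report (omitted), then publish only when CVS is enabled.
def ingest_and_publish(store, pipeline_id:, cvs_enabled:)
  store.publish(SbomIngestedEvent.new(pipeline_id)) if cvs_enabled
end
```

Gating the publish call rather than the worker keeps disabled projects from enqueueing any work at all, which is one possible reading of the "if and only if CVS is enabled" note above.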
Related issues and resources
Match SBOM components to known advisories (#371055 - closed)
Implementation plan
- Prior to committing to a specific implementation path, provide performance stats for each approach (e.g., using a single pipeline id, filtering out sbom_components based on the latest pipeline id).
- Create an Sbom::SbomChangeWorker that subscribes via the EventStore to the IngestedSbomEvent introduced by Create an event when an SBOM is ingested that w... (#465544 - closed). This worker will then call a service that fetches the SBOM components from the sbom_occurrences table based on the pipeline_id in the event.
- Add FF into https://docs.gitlab.com/ee/user/feature_flags.html#gitlab-community-edition-and-enterprise-edition
Subscription
diff --git a/ee/lib/ee/gitlab/event_store.rb b/ee/lib/ee/gitlab/event_store.rb
index cf1d9512ae37..0c94238f5ad4 100644
--- a/ee/lib/ee/gitlab/event_store.rb
+++ b/ee/lib/ee/gitlab/event_store.rb
@@ -42,6 +42,7 @@ def configure!(store)
if: ->(_) { ::Gitlab::CurrentSettings.elasticsearch_indexing? }
store.subscribe ::Search::Zoekt::DefaultBranchChangedWorker, to: ::Repositories::DefaultBranchChangedEvent
store.subscribe ::PackageMetadata::GlobalAdvisoryScanWorker, to: ::PackageMetadata::IngestedAdvisoryEvent
+ store.subscribe ::Sbom::SbomChangeWorker, to: ::Sbom::IngestedSbomEvent
store.subscribe ::Llm::NamespaceAccessCacheResetWorker, to: ::NamespaceSettings::AiRelatedSettingsChangedEvent
store.subscribe ::Llm::NamespaceAccessCacheResetWorker, to: ::Members::MembersAddedEvent
store.subscribe ::Security::RefreshProjectPoliciesWorker,
Copied from Match SBOM components to known advisories (#371055 - closed)
- Add Gitlab::VulnerabilityScanning::PackageAdvisories class. This class will be called by the newly introduced service.
  - Input: Array of objects that respond to purl_type, name, and version.
    - Names include the namespace.
    - Names are normalized.
  - Fetch PackageMetadata::AffectedPackage models matching the purl_type and name.
    - Preload the advisory field to prevent N+1 queries.
  - Filter out advisories whose affected range excludes the version.
    - Use class implemented in #371995 (closed).
  - Output: Array of objects with purl_type, name, version, and advisories.
The above plan was implemented in Draft: Add service to match SBOM components and... (!126954 - closed) • Adam Cohen • 16.7. However, we had to postpone that MR due to efficiency concerns.
The crux of the efficiency concern is that a consumer calling Gitlab::VulnerabilityScanning::PackageAdvisories#fetch will end up fetching all of the advisory data at once, with no way of iterating through this information, which could easily lead to a DB query timeout.
In order to solve this, we'll need to change Gitlab::VulnerabilityScanning::PackageAdvisories#fetch from the MR Draft: Add service to match SBOM components and... (!126954 - closed) • Adam Cohen • 16.7 to use each_batch, similar to how this was implemented in Sbom::PossiblyAffectedOccurrencesFinder#execute_in_batches. This will allow consumers of Gitlab::VulnerabilityScanning::PackageAdvisories#fetch to iterate through the result set in batches, thereby reducing the possibility of a DB timeout.
So to the developer that picks up this issue - please start by re-opening Draft: Add service to match SBOM components and... (!126954 - closed) • Adam Cohen • 16.7.
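To illustrate the batching idea, here is a minimal sketch that yields results in fixed-size batches instead of returning everything at once. It uses `Enumerable#each_slice` over an in-memory array as a stand-in for each_batch over a database relation. `BatchedAdvisoryFetcher` and `fetch_in_batches` are hypothetical names, not the API from !126954.

```ruby
# Hypothetical sketch: iterate over advisory data batch by batch.
class BatchedAdvisoryFetcher
  def initialize(affected_packages, batch_size: 1000)
    @affected_packages = affected_packages
    @batch_size = batch_size
  end

  # Yields arrays of at most batch_size items. Without a block, returns an
  # Enumerator so callers can chain or consume lazily.
  def fetch_in_batches
    return enum_for(:fetch_in_batches) unless block_given?

    # each_slice stands in for each_batch; a real implementation would issue
    # one bounded query per batch instead of slicing a loaded array.
    @affected_packages.each_slice(@batch_size) { |batch| yield batch }
  end
end
```

A consumer would process one batch at a time, for example `fetcher.fetch_in_batches { |batch| process(batch) }`, so no single query has to return the full result set.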
Verification steps
- Verify that the performance of the query is acceptable when used in production. See Improve performance of package license query to... (#398679 - closed) and the documentation for examples of optimizations that can further scope the query and make efficient use of the IN operator.
- Verify if any additional change is required in order to support the existing security tab.
- Verify if any additional change is required in order to support the existing MR widget related feature.
- Verify if duplications are occurring when vulnerabilities are also ingested by the dependency scanning report.

/cc @zmartins