Improve DB schema to better support CVS and the Dependency List

Problem to solve

The sbom_* tables and related Sbom::* models were introduced so that SBOM ingestion persists all the data needed to support two distinct features:

  • Continuous Vulnerability Scanning
  • Dependency List (visualization and export)

(As of today License Scanning uses SBOM report artifacts directly.)

However, over time, SBOM ingestion and the sbom_* tables were optimized for one feature or the other.

This negatively impacts both features:

  • Dependency List
    • Components with a type or PURL type not supported by CVS are simply ignored. The SBOM might be incomplete.
    • Component names don't reflect what's in the SBOM. The SBOM might be inaccurate.
  • CVS
    • The schema can't be easily optimized for CVS queries. TODO: provide examples.
    • Versions accurately reflect what's in the SBOM, but they should be sanitized so that CVS can compare them efficiently against the advisory DB. Also, storing raw version strings is inconsistent with storing normalized and sanitized component names.

Proposal

Change the DB schema to achieve the following:

  • The Dependency List and the CycloneDX SBOM are accurate and complete.
  • DB tables and indexes can be optimized for CVS without impacting the Dependency List negatively, and the other way around.
  • Sbom::Occurrence can accurately track a binary package and its source package without introducing any extra cost. #427095 (closed)

Proposal A

During SBOM ingestion, normalize and persist properties needed for CVS in a dedicated table, and link them to the corresponding Sbom::Occurrence model using a foreign key. SBOM components not supported by CVS would have records in the existing sbom_occurrences table, but not in the new table dedicated to CVS queries.
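
Proposal A can be sketched as follows. This is only an illustration using Python and an in-memory SQLite database, not the actual GitLab schema: the table name sbom_cvs_components, its columns, the set of supported PURL types, and the normalization rules are all assumptions made for the example.

```python
import sqlite3

# Hypothetical set of PURL types that CVS can scan.
CVS_SUPPORTED_PURL_TYPES = {"npm", "pypi"}

SCHEMA = """
-- Existing table (simplified): keeps raw values exactly as found in the SBOM.
CREATE TABLE sbom_occurrences (
    id INTEGER PRIMARY KEY,
    component_name TEXT NOT NULL,
    purl_type TEXT,
    version TEXT
);
-- Hypothetical new table: normalized values used only by CVS queries.
CREATE TABLE sbom_cvs_components (
    id INTEGER PRIMARY KEY,
    sbom_occurrence_id INTEGER NOT NULL UNIQUE
        REFERENCES sbom_occurrences (id) ON DELETE CASCADE,
    purl_type TEXT NOT NULL,
    normalized_name TEXT NOT NULL,
    normalized_version TEXT NOT NULL
);
-- This index can be tuned for CVS without touching the Dependency List table.
CREATE INDEX idx_cvs_lookup ON sbom_cvs_components (purl_type, normalized_name);
"""


def ingest_component(conn, name, purl_type, version):
    """Persist the raw component, plus a normalized CVS row when supported."""
    cur = conn.execute(
        "INSERT INTO sbom_occurrences (component_name, purl_type, version) "
        "VALUES (?, ?, ?)",
        (name, purl_type, version),
    )
    occurrence_id = cur.lastrowid
    if purl_type in CVS_SUPPORTED_PURL_TYPES:
        # Illustrative normalization only; real rules depend on the PURL type.
        conn.execute(
            "INSERT INTO sbom_cvs_components "
            "(sbom_occurrence_id, purl_type, normalized_name, normalized_version) "
            "VALUES (?, ?, ?, ?)",
            (occurrence_id, purl_type, name.lower(), version.strip().removeprefix("v")),
        )
    return occurrence_id


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

ingest_component(conn, "Django", "pypi", "v4.2.0")  # supported by CVS
ingest_component(conn, "libfoo", "generic", "1.0")  # not supported by CVS

occurrence_count = conn.execute("SELECT COUNT(*) FROM sbom_occurrences").fetchone()[0]
cvs_rows = conn.execute(
    "SELECT purl_type, normalized_name, normalized_version FROM sbom_cvs_components"
).fetchall()
```

In this sketch both components appear in sbom_occurrences with their raw values, while only the supported one gets a normalized row in the CVS table.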

Proposal B

  • Share PackageMetadata::Package between license data and advisory data. We assume that license data covers packages that might get advisories in the future.
  • Add a relation table to link Sbom::Occurrence models to PackageMetadata::Package, and create relations during SBOM ingestion.
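
A rough sketch of Proposal B, again using Python and an in-memory SQLite database rather than the real schema. The table names pm_packages and sbom_occurrence_packages, their columns, and the lowercase name matching are assumptions made for the example.

```python
import sqlite3

SCHEMA_B = """
-- Stand-in for PackageMetadata::Package, shared by license and advisory data.
CREATE TABLE pm_packages (
    id INTEGER PRIMARY KEY,
    purl_type TEXT NOT NULL,
    name TEXT NOT NULL,
    UNIQUE (purl_type, name)
);
-- Existing table (simplified).
CREATE TABLE sbom_occurrences (
    id INTEGER PRIMARY KEY,
    component_name TEXT NOT NULL,
    purl_type TEXT,
    version TEXT
);
-- Hypothetical relation table created during SBOM ingestion.
CREATE TABLE sbom_occurrence_packages (
    sbom_occurrence_id INTEGER NOT NULL REFERENCES sbom_occurrences (id),
    pm_package_id INTEGER NOT NULL REFERENCES pm_packages (id),
    PRIMARY KEY (sbom_occurrence_id, pm_package_id)
);
"""


def ingest_and_link(db, name, purl_type, version):
    """Store the occurrence; link it to a package only if that package exists."""
    occurrence_id = db.execute(
        "INSERT INTO sbom_occurrences (component_name, purl_type, version) "
        "VALUES (?, ?, ?)",
        (name, purl_type, version),
    ).lastrowid
    package = db.execute(
        "SELECT id FROM pm_packages WHERE purl_type = ? AND name = ?",
        (purl_type, name.lower()),
    ).fetchone()
    if package is None:
        return False  # no package record yet, so no relation can be created
    db.execute(
        "INSERT INTO sbom_occurrence_packages (sbom_occurrence_id, pm_package_id) "
        "VALUES (?, ?)",
        (occurrence_id, package[0]),
    )
    return True


def occurrences_for_package(db, purl_type, name):
    """Return the IDs of the occurrences linked to the given package."""
    rows = db.execute(
        "SELECT rel.sbom_occurrence_id FROM sbom_occurrence_packages AS rel "
        "JOIN pm_packages AS pkg ON pkg.id = rel.pm_package_id "
        "WHERE pkg.purl_type = ? AND pkg.name = ? "
        "ORDER BY rel.sbom_occurrence_id",
        (purl_type, name),
    ).fetchall()
    return [row[0] for row in rows]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA_B)

# A package already known from license data.
db.execute("INSERT INTO pm_packages (purl_type, name) VALUES ('npm', 'lodash')")

linked_lodash = ingest_and_link(db, "lodash", "npm", "4.17.21")
linked_left_pad = ingest_and_link(db, "left-pad", "npm", "1.3.0")

lodash_occurrences = occurrences_for_package(db, "npm", "lodash")
left_pad_occurrences = occurrences_for_package(db, "npm", "left-pad")
```

Here the occurrence of left-pad is stored but never linked, because no matching package record exists yet. An advisory lookup for that package would find nothing.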

Pros compared to proposal A

  • Save on storage.

Cons compared to proposal A

  • It introduces some coupling between license data and advisory data.
  • During advisory ingestion, CVS can't create vulnerabilities for an Sbom::Occurrence unless the corresponding PackageMetadata::Package already exists.
  • This seems to go against the isolation between package metadata and SBOM data discussed in gitlab_schema for package metadata used by Vuln... (#378261 - closed).
Edited by Fabien Catteau