CVS on GLAD changes for language packages detected by Container Scanning

Release notes

Problem to solve

When Container Scanning (CS) is configured to report language-specific findings, it generates a CycloneDX (CDX) SBOM that lists language packages (AKA application-level dependencies) similar to the ones reported by Dependency Scanning (DS), like npm packages or Ruby gems.

However,

  1. "Language packages" detected by Container Scanning aren't scanned by Continuous Vulnerability Scanning (CVS) when the GitLab Advisory Database (GLAD) gets new security advisories.
  2. Also, they don't show up in the Dependency List.
  3. If they were scanned by CVS, we would likely end up with duplicate vulnerabilities.

This issue focuses on item 1, that is, scanning "language packages" detected by GitLab Container Scanning or similar tools.

The Python component of the CI/CD catalog has a dependency check job based on Trivy. Right now this job is disabled by default.

Further details

Limitations in the backend prevent "language components" reported by CS from being ingested and scanned:

  • It can't handle hybrid CycloneDX SBOMs that list both OS packages and "language packages".
  • For "language packages", it requires the SBOM properties specific to GitLab Dependency Scanning. Trivy has equivalent properties but they have different names.
  • It only supports properties defined at the root level, whereas Trivy defines these at the component level (which has many advantages).

These limitations apply to any container scanning tool capable of tracking "language packages" in a CycloneDX SBOM (along with OS packages).
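To illustrate what "hybrid" means here, the following is a minimal sketch that splits the components of a CycloneDX SBOM into OS packages and "language packages" based on the purl type. The `OS_PURL_TYPES` set and the function names are assumptions made for this example; they are not the list or the logic used by the GitLab backend.

```python
# Hypothetical set of purl types treated as OS packages (assumption for this sketch).
OS_PURL_TYPES = {"deb", "rpm", "apk"}


def purl_type(purl):
    """Return the type segment of a package URL, e.g. 'npm' for 'pkg:npm/...'."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a package URL: {purl!r}")
    return purl[len("pkg:"):].split("/", 1)[0].lower()


def split_components(components):
    """Split SBOM components into (os_packages, language_packages)."""
    os_packages, language_packages = [], []
    for component in components:
        purl = component.get("purl")
        if not purl:
            continue  # components without a purl can't be classified here
        if purl_type(purl) in OS_PURL_TYPES:
            os_packages.append(component)
        else:
            language_packages.append(component)
    return os_packages, language_packages
```

A hybrid SBOM is one where both lists come back non-empty for the same report.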

Proposal

@fcatteau: In my opinion we should implement proposal B to support "language packages" detected by Trivy out of the box.

  • It's not tied to GitLab Container Scanning, and works with other integrations of Trivy, like the py-trivy job of the Python CI component.
  • It paves the way for supporting other container scanning generators the same way.
  • Users might need to track these SBOM components for compliance.

However, we would also move forward with proposal C and discourage users from enabling "language findings" in GitLab Container Scanning. This way, the number of vulnerabilities to be triaged doesn't change by default.

Proposal A

Add gitlab:dependency_scanning properties to the SBOM generated by CS when it contains language packages. The SBOM ingestion logic is updated to handle hybrid SBOMs that combine OS packages with language packages.

This has been discussed in !137590 (comment 1679163827).

Pros

  • Vulnerabilities created by CVS have the same UUID regardless of the tool that detected the affected SBOM component. As a result, running CS and DS in the same pipeline doesn't create duplicate vulnerabilities.

Cons

  • We might have to split the SBOM generated by CS: it includes SBOM components of all supported package managers, but gitlab:dependency_scanning properties assume one SBOM per package manager detected in a directory.
  • It might be difficult and error-prone to infer the gitlab:dependency_scanning:input_file (ex: package.json) from the aquasecurity:trivy:FilePath (ex: node_modules/@babel/code-frame/package.json). Also, we lose accuracy by doing that.
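The loss of accuracy can be seen with a naive heuristic for npm. This is only a sketch: `infer_input_file` is a hypothetical helper, and the rule it applies (take the package.json next to the first node_modules directory) is an assumption, not something CS or the backend implements.

```python
def infer_input_file(trivy_file_path):
    """Guess the project-level input file from a Trivy FilePath (npm only).

    Hypothetical heuristic: use the package.json that sits next to the
    first node_modules directory found in the path.
    """
    parts = trivy_file_path.split("/")
    if "node_modules" not in parts:
        return trivy_file_path  # nothing to infer, keep the path as reported
    project_dir = parts[:parts.index("node_modules")]
    return "/".join(project_dir + ["package.json"])
```

Every package installed under the same node_modules directory collapses to the same input file, so the exact location reported by Trivy is lost, and other package managers would each need their own rule.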

Proposal B

UPDATE: This is implemented in part in Store package manager and input file path Sbom:... (#432146 - closed) and Remove the single source per CycloneDX report r... (#439664).

Update SBOM ingestion and CVS to support the aquasecurity:trivy:FilePath property of SBOM components.
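As a rough sketch of what reading this property involves, the helper below looks up a named property on a single CycloneDX component, which is where Trivy puts it. The function name is hypothetical, and the actual ingestion code in the backend is not shown in this issue.

```python
TRIVY_FILE_PATH = "aquasecurity:trivy:FilePath"


def component_property(component, name):
    """Return the value of a component-level CycloneDX property, or None."""
    for prop in component.get("properties", []):
        if prop.get("name") == name:
            return prop.get("value")
    return None
```

For the component listed under "SBOM component reported by Trivy (JSON)", `component_property(component, TRIVY_FILE_PATH)` would return `node_modules/@babel/code-frame/package.json`.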

Pros

  • The vulnerability has the exact location of the affected SBOM component detected by CS.
  • CS and its CDX SBOM report don't need to change.
  • This paves the way for supporting other SBOM generators, and for merging all the SBOMs generated by gemnasium (when multiple projects or package managers are detected). Related issue: Spike: Replace Gemnasium with open source nativ... (#434143)

Cons

  • SBOM ingestion needs to be changed to support these new properties, and to store properties specific to each Sbom::Occurrence. We could have one Sbom::Source per occurrence to achieve that. (CVS wouldn't change.) #434361 (comment 1713868597)
  • We have duplicate vulnerabilities when CVS scans SBOM components reported by DS and the ones reported by CS. That said, this isn't a problem if GitLab can group these vulnerabilities automatically.
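One way to picture such grouping is sketched below. The finding fields (`purl`, `advisory_id`, `scanner`) and the grouping key are assumptions made for this example; they don't describe how GitLab stores or groups vulnerabilities.

```python
from collections import defaultdict


def group_findings(findings):
    """Group findings that affect the same package and advisory.

    Each finding is a dict with hypothetical keys: 'purl', 'advisory_id',
    and 'scanner'.
    """
    groups = defaultdict(list)
    for finding in findings:
        groups[(finding["purl"], finding["advisory_id"])].append(finding)
    return dict(groups)
```

With this key, a vulnerability found through a DS component and the same one found through a CS component end up in one group, while each finding keeps its own location.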

Proposal C

Along with proposal A or B, discourage users from enabling "language findings" in Container Scanning because they might result in duplicates and more triage work. However, "language findings" would still be supported for compliance, and to support other container scanning tools.

See #434361 (comment 1710671791)

Related issues

Further details

SBOM component reported by Trivy (JSON)
{
  "bom-ref": "pkg:npm/%40babel/code-frame@7.0.0?file_path=node_modules%2F@babel%2Fcode-frame%2Fpackage.json",
  "type": "library",
  "name": "@babel/code-frame",
  "version": "7.0.0",
  "licenses": [
    {
      "license": {
        "name": "MIT"
      }
    }
  ],
  "purl": "pkg:npm/%40babel/code-frame@7.0.0",
  "properties": [
    {
      "name": "aquasecurity:trivy:FilePath",
      "value": "node_modules/@babel/code-frame/package.json"
    },
    {
      "name": "aquasecurity:trivy:LayerDiffID",
      "value": "sha256:5a204c63bef5f7f334ef5ca3a980abd9ea652fcc5b52a970b4821ec2c5f5af88"
    },
    {
      "name": "aquasecurity:trivy:LayerDigest",
      "value": "sha256:553a6d0b24b6762fa3307d2a997bea283ff4a07c038d29d55e3651724036f9ac"
    },
    {
      "name": "aquasecurity:trivy:PkgID",
      "value": "@babel/code-frame@7.0.0"
    },
    {
      "name": "aquasecurity:trivy:PkgType",
      "value": "node-pkg"
    }
  ]
}

Full CDX SBOM generated by Trivy: gl-sbom-report.cdx.json

Intended users

Feature Usage Metrics

Does this feature require an audit event?

/cc @johncrowley @thiagocsf @hacks4oats @gonzoyumo

Edited by Fabien Catteau