
Generate new in-memory Dependency Scanning report from advisories affecting SBOM components

Why are we doing this work

To begin continuously scanning components, we will need a scanner class that does the following:

  1. Generates an empty dependency scanning report with the appropriate scanner and scan data.
  2. Fetches advisories using the finder/service that will be implemented in "Match SBOM components to known advisories" (#371055 - closed).
  3. Normalizes advisory data and uses the data to add a finding in the dependency scanning report.

Specifically, this issue focuses on points 1 and 3 to maximize the work that can be done in parallel.

Relevant links

Non-functional requirements

  • Documentation:
  • Feature flag: No
  • Performance:
  • Testing: Specs should be added for the functionality added.

Implementation plan

Report builder classes

  • Create new report builder classes, e.g. SecurityReportBuilder, ContainerScanningReportBuilder, and DependencyScanningReportBuilder. These classes will be responsible for building the report programmatically for each report type. The last two will extend the SecurityReportBuilder base class.
  • The report builders should establish a method for adding components and their advisories. The method add_component_advisories that takes a Gitlab::Ci::Reports::Sbom::Component and Array<PackageMetadata::Advisory> as arguments is proposed.
  • The report builders might require the following private methods:
    • #build_scanner_data: calls ::Gitlab::Ci::Reports::Security::Scanner.new to create a standard scanner class. See this snippet for an example on how to do this. Must be added to the report using #add_scanner.
    • #build_scan: calls Security::Scan.new to create a new scan object. See this snippet for an example of how to do this. Must be added to the constructor when creating a new instance of ::Gitlab::Ci::Reports::Security::Finding.
    • #build_finding_data: creates a hash that holds all necessary data that surfaces to a user. See this snippet for an example of how to do this.
    • #build_identifiers: creates identifiers from the advisory. See this snippet for an example of how to do this. Historically, the fields would be generated using the functionality provided by the gitlab.com/gitlab-org/security-products/analyzers/report/v3 library, so this is also a recommended reference.
    • #build_links: See this snippet for an example of how to do this. The input is an array of strings in the example provided.
    • #build_location: See this snippet for an example of how to do this.
    • #build_severity: See snippet.
    • #build_confidence: Deprecated. Use 'unknown' consistently for now.
    • #build_uuid: See snippet on how this is done for normal security reports.
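
The builder responsibilities listed above could be sketched roughly as follows. This is a minimal, self-contained sketch and not the GitLab implementation: Component and Advisory are hypothetical stand-ins for Gitlab::Ci::Reports::Sbom::Component and PackageMetadata::Advisory, findings are plain hashes rather than ::Gitlab::Ci::Reports::Security::Finding instances, and #build_uuid uses a SHA-256 digest as an assumed placeholder for whatever UUID scheme the normal security reports use.

```ruby
require 'digest'

# Hypothetical stand-ins for the real GitLab classes named in this issue.
Component = Struct.new(:purl_type, :name, :version, keyword_init: true)
Advisory  = Struct.new(:advisory_id, :title, :severity, :identifiers, :urls, keyword_init: true)

class SecurityReportBuilder
  KNOWN_SEVERITIES = %w[critical high medium low info unknown].freeze

  attr_reader :findings

  def initialize
    @findings = []
  end

  # Adds one finding per advisory affecting the given component.
  def add_component_advisories(component, advisories)
    advisories.each { |advisory| @findings << build_finding_data(component, advisory) }
    self
  end

  private

  # Assumed shape of the finding data; the real finding has more fields.
  def build_finding_data(component, advisory)
    {
      uuid: build_uuid(component, advisory),
      name: advisory.title,
      severity: build_severity(advisory),
      confidence: 'unknown', # per the plan: use 'unknown' consistently for now
      identifiers: build_identifiers(advisory),
      links: build_links(advisory.urls),
      package: { name: component.name, version: component.version }
    }
  end

  def build_identifiers(advisory)
    Array(advisory.identifiers).map do |identifier|
      { type: identifier.fetch(:type), name: identifier.fetch(:name), value: identifier.fetch(:value) }
    end
  end

  # The input is an array of strings, as in the referenced example.
  def build_links(urls)
    Array(urls).map { |url| { url: url } }
  end

  def build_severity(advisory)
    severity = advisory.severity.to_s.downcase
    KNOWN_SEVERITIES.include?(severity) ? severity : 'unknown'
  end

  # Deterministic fingerprint: the same component/advisory pair always
  # produces the same value, which is what deduplication relies on.
  def build_uuid(component, advisory)
    parts = [component.purl_type, component.name, component.version, advisory.advisory_id]
    Digest::SHA256.hexdigest(parts.join('|'))
  end
end

class DependencyScanningReportBuilder < SecurityReportBuilder; end
class ContainerScanningReportBuilder < SecurityReportBuilder; end
```

Calling DependencyScanningReportBuilder.new.add_component_advisories(component, advisories) and then reading #findings yields one hash per advisory. Returning self from add_component_advisories is a design choice of this sketch so that calls can be chained.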

Scanner

  • Create a new scanner class e.g. ee/lib/gitlab/vulnerability_scanning/sbom_scanner.rb.
  • Implement initializer.
    • The scanner class is initialized with an instance of a build and an SBOM report.
    • On initialization, the SBOM is validated: is it valid, and does it contain enough data to perform the scan?
  • Implement #report method.
    • Build a report using the report builder.
    • Convert SBOM components to objects that respond to purl_type, name, and version.
      • name includes the PURL namespace, and it's normalized.
      • We might refactor and implement this in Sbom::Component instead of repeating the code we already have in LicenseScanning::PipelineComponents.
    • Get advisories for components using the PackageAdvisories class introduced in #371055 (closed).
    • For each advisory of each affected component, add findings using #add_component_advisories.
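
The scanner steps above could be sketched as follows. This is a sketch under stated assumptions, not the proposed class itself: SbomComponent, SbomReport, and NormalizedComponent are hypothetical structs, advisory_source is a hypothetical callable standing in for the PackageAdvisories class from #371055 (whose interface is not described here), and the normalization rule (namespace and name joined with "/" and lowercased) is an assumption, since real normalization depends on the PURL type.

```ruby
# Hypothetical stand-ins; the real inputs are a CI build and an SBOM report.
SbomComponent = Struct.new(:purl_type, :namespace, :name, :version, keyword_init: true)
SbomReport = Struct.new(:components, keyword_init: true)
NormalizedComponent = Struct.new(:purl_type, :name, :version, keyword_init: true)

class SbomScanner
  class InvalidSbomError < StandardError; end

  # advisory_source: responds to #call(components) and returns a Hash of
  #   component => advisories (assumed interface).
  # report_builder: responds to #add_component_advisories(component, advisories).
  def initialize(build, sbom_report, advisory_source:, report_builder:)
    @build = build
    @sbom_report = sbom_report
    @advisory_source = advisory_source
    @report_builder = report_builder
    validate!
  end

  # Normalizes components, looks up their advisories, and adds findings.
  def report
    components = @sbom_report.components.map { |component| normalize(component) }
    @advisory_source.call(components).each do |component, advisories|
      @report_builder.add_component_advisories(component, advisories)
    end
    @report_builder
  end

  private

  # Rejects SBOMs that are empty or that lack the data needed to match advisories.
  def validate!
    components = @sbom_report&.components
    raise InvalidSbomError, 'SBOM report has no components' if components.nil? || components.empty?

    incomplete = components.find do |component|
      [component.purl_type, component.name, component.version].any? { |value| value.to_s.empty? }
    end
    raise InvalidSbomError, "component lacks purl_type, name, or version: #{incomplete.inspect}" if incomplete
  end

  # Assumed rule: the name includes the PURL namespace and is lowercased.
  def normalize(component)
    full_name = [component.namespace, component.name].compact.reject(&:empty?).join('/')
    NormalizedComponent.new(purl_type: component.purl_type, name: full_name.downcase, version: component.version)
  end
end
```

Injecting the advisory source and the report builder keeps the sketch testable without the database; the real class would more likely instantiate both itself.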

Dependency Scanning vs Container Scanning

The following differences between the dependency scanning and container scanning classes highlight where we should leverage the base class logic and where we should override it.

Vulnerabilities - represented by Gitlab::Ci::Reports::Security::Finding:

  • Dependency Scanning includes details, cve and name inside the vulnerability object. The cve field is always empty.

Scan - represented by Gitlab::Ci::Reports::Security::Scanner:

  • The scan.analyzer.url field is exclusive to dependency scanning (although I think it would make sense to add it to container scanning as well).
# Dependency Scanning - vendor is `GitLab`
╭─────────┬─────────────────────────────────────────────────────────────────────╮
│ id      │ gemnasium                                                           │
│ name    │ Gemnasium                                                           │
│ url     │ https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium │
│ vendor  │ {record 1 field}                                                    │
│ version │ 4.0.3                                                               │
╰─────────┴─────────────────────────────────────────────────────────────────────╯
# Container Scanning - vendor is `GitLab`
╭─────────┬───────────────────────────╮
│ id      │ gcs                       │
│ name    │ GitLab Container Scanning │
│ vendor  │ {record 1 field}          │
│ version │ 6.1.1                     │
╰─────────┴───────────────────────────╯

Remediations:

  • The remediations field is exclusive to container scanning.

Dependency Files:

  • The dependency_files field is exclusive to dependency scanning.

Method Analysis:

Method                                | Shared? | Purpose
build_security_report                 |         | Build the security report. The report builder can hold a report type and use it here for reuse.
build_scanner                         |         | Builds the scanner that originated the SBoM: either GitLab Container Scanning or Gemnasium.
build_uuid                            |         | Used to deduplicate findings.
build_location                        |         | Location of the source for the vulnerability. Exclusive to dependency scanning.
build_details                         |         | Details of vulnerable package. Exclusive to dependency scanning.
build_links + build_link              |         | Links for all related advisories.
build_original_data                   |         | A hash that contains a representation of the vulnerability JSON data.
build_finding_name                    |         | The title/name of the vulnerability.
build_identifiers + build_identifier  |         | Identifiers are the advisories related to the finding.
build_findings + build_finding        |         | Creates a finding from the reported advisory.
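
One way to express this split is a template-method layout, where the base class assembles the finding and subclasses override only the type-specific hooks. The sketch below is illustrative only: components, advisories, and findings are plain hashes, the :input_file key is hypothetical, the shape of the details hash is an assumption, and the scanner values are copied from the tables above. Following the method analysis, build_location and build_details are treated as dependency scanning only.

```ruby
# Sketch only: hashes stand in for the real report, scanner, and finding objects.
class SecurityReportBuilder
  # Shared logic: assembles a finding and lets subclasses fill in the hooks.
  def build_finding(component, advisory)
    finding = { name: build_finding_name(advisory), scanner: build_scanner }
    location = build_location(component)
    details = build_details(component)
    finding[:location] = location if location
    finding[:details] = details if details
    finding
  end

  private

  def build_finding_name(advisory)
    advisory.fetch(:title)
  end

  def build_scanner
    raise NotImplementedError, "#{self.class} must define #build_scanner"
  end

  # Optional hooks: the default is "field not present".
  def build_location(_component)
    nil
  end

  def build_details(_component)
    nil
  end
end

class DependencyScanningReportBuilder < SecurityReportBuilder
  private

  def build_scanner
    {
      id: 'gemnasium',
      name: 'Gemnasium',
      url: 'https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium',
      vendor: { name: 'GitLab' }
    }
  end

  def build_location(component)
    {
      file: component.fetch(:input_file), # hypothetical key for the scanned file
      dependency: { package: { name: component.fetch(:name) }, version: component.fetch(:version) }
    }
  end

  # Assumed shape for the "details of vulnerable package" entry.
  def build_details(component)
    { vulnerable_package: { type: 'text', value: "#{component.fetch(:name)}:#{component.fetch(:version)}" } }
  end
end

class ContainerScanningReportBuilder < SecurityReportBuilder
  private

  # No url: in the tables above, the url field appears only for dependency scanning.
  def build_scanner
    { id: 'gcs', name: 'GitLab Container Scanning', vendor: { name: 'GitLab' } }
  end
end
```

With this layout, adding the analyzer URL to container scanning later would be a one-line change in ContainerScanningReportBuilder#build_scanner.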

Other proposals considered

The following proposals were also considered when selecting an implementation plan.

Generate the report as JSON and re-use parser

This proposal requires us to convert the components and vulnerabilities to JSON objects inside of a security report. The JSON report is then parsed by the Gitlab::Ci::Parsers::Security::Common.parse! method.
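
As a rough illustration of this alternative, the sketch below serializes components and advisories into a report-shaped JSON string and then parses it back. It is a sketch under assumptions: build_report_json is a hypothetical helper, the field names are a simplified guess at the report layout, schema_version is supplied by the caller, and JSON.parse stands in for Gitlab::Ci::Parsers::Security::Common.parse!, whose signature is not shown in this issue.

```ruby
require 'json'

# Hypothetical helper: converts component => advisories pairs into a
# report-shaped JSON string. Field names are simplified assumptions.
def build_report_json(components_with_advisories, schema_version:)
  vulnerabilities = components_with_advisories.flat_map do |component, advisories|
    advisories.map do |advisory|
      {
        name: advisory.fetch(:title),
        severity: advisory.fetch(:severity),
        location: {
          file: component.fetch(:input_file),
          dependency: {
            package: { name: component.fetch(:name) },
            version: component.fetch(:version)
          }
        }
      }
    end
  end

  JSON.generate({ version: schema_version, vulnerabilities: vulnerabilities })
end

component = { name: 'nokogiri', version: '1.13.0', input_file: 'Gemfile.lock' }
advisory = { title: 'Example advisory', severity: 'High' }

# Marshal to JSON, then immediately unmarshal: this round trip is the
# overhead listed under "Cons".
json = build_report_json({ component => [advisory] }, schema_version: 'example-version')
parsed = JSON.parse(json)
puts parsed['vulnerabilities'].first['name']
```

Because the conversion is a pure function of its inputs, its output can be compared directly against a report produced by an analyzer for the same SBoM, which is the testing advantage described under "Pros".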

Pros

  • Schema validation
  • Easier to understand - the JSON objects produced are human-readable.
  • Easier to test - we can take an SBoM and the specific version of the GLAD used for a DS job as inputs. If our conversion is a pure function, the output should match the DS report that was generated by gemnasium alongside the SBoM.
  • We can re-create the gl-dependency-scanning-report.json, and in the future the gl-container-scanning-report.json, artifacts for users to download if required.
  • Code reuse

Cons

  • Performance might suffer if we're marshalling JSON objects only for this to be undone.

Verification steps

None.

Edited by Fabien Catteau