[PROMOTED] License Scanning using License DB and SBOM components

Why are we doing this work

The LicenseScanning service implemented in !103574 (merged) needs to be changed in order to replace the License Scanning CI job, as captured by &8072 (closed).

  • Legacy behavior: It parses legacy License Scanning JSON artifacts.
  • New behavior: It matches SBOM components detected in the project branch with license data imported from the License DB.

Further details

In the new behavior, License Scanning does the following:

  1. Get SBOM components for the given pipeline or the default branch.
  2. Query the DB to fetch the licenses of these components. #373163 (closed) defines how that data is stored.
  3. Build a new Report using its methods, and return it.

It returns a Report model, just like the legacy behavior implemented in Add license scanning report class (!103574 - merged).
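
The three steps above can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: `Component`, `Report#add_license`, `LICENSE_DATA`, and `scan` are hypothetical names, and an in-memory hash stands in for the license data imported from the License DB. It is not the actual GitLab implementation.

```ruby
# Hypothetical value object for an SBOM component (illustrative name).
Component = Struct.new(:purl_type, :name, :version)

# Hypothetical, simplified stand-in for the Report model.
class Report
  attr_reader :licenses

  def initialize
    # Maps an SPDX identifier to the names of the dependencies that use it.
    @licenses = Hash.new { |hash, spdx_id| hash[spdx_id] = [] }
  end

  # Records that a dependency is covered by the given SPDX license.
  def add_license(spdx_id, dependency_name)
    @licenses[spdx_id] << dependency_name
  end
end

# Assumed stand-in for license data imported from the License DB,
# keyed by [purl_type, name, version].
LICENSE_DATA = {
  ["npm", "lodash", "4.17.21"] => ["MIT"],
  ["gem", "rails", "7.0.4"] => ["MIT"],
  ["pypi", "numpy", "1.23.5"] => ["BSD-3-Clause"]
}.freeze

# Step 1 is assumed to have produced `components` already.
def scan(components, license_data = LICENSE_DATA)
  report = Report.new
  components.each do |component|
    key = [component.purl_type, component.name, component.version]
    # Step 2: fetch the licenses of the component; unknown packages yield none.
    license_data.fetch(key, []).each do |spdx_id|
      # Step 3: build the report through its own methods.
      report.add_license(spdx_id, component.name)
    end
  end
  report
end

components = [
  Component.new("npm", "lodash", "4.17.21"),
  Component.new("pypi", "numpy", "1.23.5"),
  Component.new("gem", "unknown-gem", "0.0.1")
]
report = scan(components)
```

In this sketch a component without license data simply contributes nothing to the report; how the real service treats such components is not specified here.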

The new behavior is behind a feature flag.

Also, License Scanning falls back to the legacy behavior when no SBOM data is available.
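
As a rough sketch of how the feature flag and the fallback could combine, the following plain Ruby proxy picks a scanner at call time. All class names and the `flag_enabled:` / `sbom_components:` parameters are assumptions made for illustration; the real service, flag name, and report objects are not shown in this issue.

```ruby
# Hypothetical legacy scanner: stands in for parsing License Scanning JSON artifacts.
class LegacyScanner
  def report
    { source: :legacy_artifact }
  end
end

# Hypothetical new scanner: stands in for matching SBOM components with License DB data.
class SbomScanner
  def initialize(components)
    @components = components
  end

  def report
    { source: :sbom, component_count: @components.size }
  end
end

# Proxy that exposes a single #report method and delegates to one of the scanners.
class LicenseScanningProxy
  def initialize(flag_enabled:, sbom_components:)
    @flag_enabled = flag_enabled
    @sbom_components = sbom_components
  end

  def report
    scanner.report
  end

  private

  # The new behavior is used only when the flag is on AND SBOM data exists;
  # otherwise the proxy falls back to the legacy behavior.
  def scanner
    if @flag_enabled && @sbom_components.any?
      SbomScanner.new(@sbom_components)
    else
      LegacyScanner.new
    end
  end
end

components = [["npm", "lodash", "4.17.21"]]
new_report = LicenseScanningProxy.new(flag_enabled: true, sbom_components: components).report
fallback_report = LicenseScanningProxy.new(flag_enabled: true, sbom_components: []).report
```

Keeping the decision inside the proxy means callers never need to know which behavior produced the report.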

SBOM components

In order to get SBOM components of a branch or pipeline, the License Scanning service delegates to one of these:

See &8532 (comment 1160236461) and following comments.

Names of licenses

In this new implementation, the Report model doesn't provide the names of the SPDX licenses that have been detected; #license_names returns nothing. However, this should be fully compatible with the current implementation of license policies. That's because the Managed API reuses existing records of software_licenses, and that table is in sync with the SPDX License List. See #379137 (comment 1160165965).

Relevant links

Non-functional requirements

  • Documentation:
  • Feature flag:
  • Performance:
  • Testing:

Implementation plan

The following steps can be implemented in separate MRs:

  1. Create a finder that gets the SBOM components.
    • It takes a project branch or CI pipeline.
    • It returns tuples of PURL type, package name, and package version.
  2. Create a finder that gets licenses of package versions.
  3. Create a new License Scanning service class that implements the new behavior. It uses the two aforementioned finders.
  4. Move the existing License Scanning service (legacy behavior) to a separate class, and change the original License Scanning service so that it delegates to it, acting as a simple proxy.
  5. Change the License Scanning proxy to switch to the new behavior when it's enabled by a feature flag.
  6. Change the License Scanning proxy to fall back to the legacy behavior when the project isn't compatible with the new behavior, even though the feature flag enables it.
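
To make the first finder in this plan more concrete, here is a minimal sketch in plain Ruby. It assumes SBOM occurrences are available as in-memory records tagged with a pipeline ID and a branch name; `Occurrence`, `ComponentsFinder`, and the keyword arguments are hypothetical, and a real finder would query the database instead.

```ruby
# Hypothetical record of a component detected in an SBOM report.
Occurrence = Struct.new(:purl_type, :name, :version, :pipeline_id, :branch)

class ComponentsFinder
  def initialize(occurrences)
    @occurrences = occurrences
  end

  # Takes either a pipeline ID or a branch name and returns unique
  # [purl_type, name, version] tuples.
  def execute(pipeline_id: nil, branch: nil)
    unless pipeline_id.nil? ^ branch.nil?
      raise ArgumentError, "pass exactly one of pipeline_id: or branch:"
    end

    selected = @occurrences.select do |occurrence|
      pipeline_id ? occurrence.pipeline_id == pipeline_id : occurrence.branch == branch
    end

    # Several SBOM reports may list the same package version, so deduplicate.
    selected.map { |o| [o.purl_type, o.name, o.version] }.uniq
  end
end

occurrences = [
  Occurrence.new("npm", "lodash", "4.17.21", 1, "main"),
  Occurrence.new("npm", "lodash", "4.17.21", 1, "main"),
  Occurrence.new("gem", "rails", "7.0.4", 2, "feature")
]
finder = ComponentsFinder.new(occurrences)
finder.execute(branch: "main") # => [["npm", "lodash", "4.17.21"]]
```

Returning plain tuples keeps the second finder independent of how components are stored: it only needs the PURL type, package name, and package version to look up licenses.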

Verification steps

Check new behavior:

  1. Enable feature flag.
  2. Set up a project supported by SBOM generators, and using package types supported by License DB.
  3. Add corresponding SBOM generators to the CI config.
  4. Trigger a pipeline.
  5. Check licenses in License Compliance page.

Check fallback to legacy behavior:

  1. Enable feature flag.
  2. Set up a project supported by legacy License Scanning job.
  3. Include legacy CI template for License Scanning.
  4. Trigger a pipeline.
  5. Check licenses in License Compliance page.

/cc @ifrenkel @hacks4oats

Edited by Fabien Catteau