[PROMOTED] License Scanning using License DB and SBOM components
Why are we doing this work
The LicenseScanning service implemented in !103574 (merged) needs to be changed in order to replace the License Scanning CI job, as captured by &8072 (closed).
- Legacy behavior: It parses legacy License Scanning JSON artifacts.
- New behavior: It matches SBOM components detected in the project branch with license data imported from the License DB.
Further details
In the new behavior, License Scanning does the following:
- Get SBOM components for the given pipeline or the default branch.
- Query the DB to fetch licenses of these components. #373163 (closed) defines how that data is stored.
- Build a new Report using its methods, and return it.
It returns a Report model, just like the legacy behavior implemented in Add license scanning report class (!103574 - merged).
The new behavior is behind a feature flag.
Also, License Scanning falls back to the legacy behavior when no SBOM data is available.
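The three steps of the new behavior can be sketched as follows. This is a minimal illustration, not the actual implementation: `Component`, `Report`, `add_license`, `scan_licenses`, and the in-memory `LICENSE_DB` hash are hypothetical stand-ins for the SBOM finder, the Report model, and the License DB tables, and the package names are made-up sample data.

```ruby
# Hypothetical stand-ins; none of these names come from the GitLab codebase.
Component = Struct.new(:purl_type, :name, :version)

class Report
  attr_reader :licenses

  def initialize
    # Maps an SPDX identifier to the components that use that license.
    @licenses = Hash.new { |hash, key| hash[key] = [] }
  end

  def add_license(spdx_id, component)
    @licenses[spdx_id] << component
  end
end

# Stand-in for the License DB: (PURL type, name, version) => SPDX identifiers.
LICENSE_DB = {
  ["gem", "example-gem", "1.0.0"] => ["MIT"],
  ["pypi", "example-lib", "2.3.4"] => ["Apache-2.0"]
}.freeze

def scan_licenses(components, license_db: LICENSE_DB)
  report = Report.new
  components.each do |component|
    key = [component.purl_type, component.name, component.version]
    # Components unknown to the License DB contribute no licenses.
    license_db.fetch(key, []).each do |spdx_id|
      report.add_license(spdx_id, component)
    end
  end
  report
end

# Step 1: SBOM components (hard-coded here instead of coming from a finder).
components = [
  Component.new("gem", "example-gem", "1.0.0"),
  Component.new("pypi", "example-lib", "2.3.4"),
  Component.new("npm", "unknown-package", "0.0.1")
]

# Steps 2 and 3: look up licenses and build the report.
report = scan_licenses(components)
puts report.licenses.keys.sort.inspect
```

In the real service the lookup would presumably be a batched DB query rather than a per-component hash lookup; the hash is only there to keep the sketch self-contained.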
SBOM components
In order to get SBOM components of a branch or pipeline, the License Scanning service delegates to one of these:
- if available, an existing finder implemented by group::threat insights as part of Use database for project dependency list (&8293 - closed)
- a simplified finder implemented by group::composition analysis, which uses ActiveRecord relations; it doesn't cover edge cases, and it will be replaced by a complete implementation as group::threat insights moves forward with &8293 (closed)
See &8532 (comment 1160236461) and following comments.
Names of licenses
In this new implementation, the Report model doesn't provide the names of the SPDX licenses that have been detected; #license_names returns nothing. However, this should be fully compatible with the current implementation of license policies. That's because the Managed API reuses existing records of software_licenses, and that table is in sync with the SPDX License List. See #379137 (comment 1160165965).
Relevant links
- Update DB schema to store data imported from th... (#373163 - closed)
- Use License Scanning service (&8532 - closed)
Non-functional requirements
- Documentation:
- Feature flag:
- Performance:
- Testing:
Implementation plan
The following steps can be implemented in separate MRs:
- Create a finder that gets the SBOM components.
- It takes a project branch or CI pipeline.
- It returns tuples of PURL type, package name, and package version.
- Create a finder that gets licenses of package versions.
- It queries the DB tables implemented in Update DB schema to store data imported from th... (#373163 - closed).
- It takes tuples of PURL type, package name, and package version.
- It returns the corresponding licenses.
- Create a new License Scanning service class that implements the new behavior. It uses the two aforementioned finders.
- Move the existing License Scanning service (legacy behavior), and change the existing License Scanning service so that it delegates to it, acting as a simple proxy.
- Change the License Scanning proxy to switch to the new behavior when it's enabled by a Feature Flag.
- Change the License Scanning proxy to fall back to the legacy behavior when the project isn't compatible with the new behavior, even though the FF enables the new behavior.
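The proxy described in the last three steps could look roughly like the sketch below. All names here (`LicenseScanningProxy`, `SbomScanner`, `LegacyScanner`, `Project`, `feature_enabled:`) are hypothetical, a plain boolean stands in for the feature flag check, and "compatible with the new behavior" is assumed to mean "has SBOM components".

```ruby
# Hypothetical names throughout; this only illustrates the delegation logic.
Project = Struct.new(:name, :sbom_components)

class LegacyScanner
  def initialize(project)
    @project = project
  end

  # Placeholder for parsing legacy License Scanning JSON artifacts.
  def report
    :legacy_report
  end
end

class SbomScanner
  def initialize(project)
    @project = project
  end

  # Placeholder for matching SBOM components against License DB data.
  def report
    :sbom_report
  end
end

class LicenseScanningProxy
  def initialize(project, feature_enabled:)
    @project = project
    @feature_enabled = feature_enabled
  end

  def report
    scanner.report
  end

  private

  # Uses the new behavior only when the flag is on AND SBOM data exists;
  # otherwise falls back to the legacy behavior.
  def scanner
    if @feature_enabled && sbom_data?
      SbomScanner.new(@project)
    else
      LegacyScanner.new(@project)
    end
  end

  def sbom_data?
    !@project.sbom_components.empty?
  end
end

with_sbom = Project.new("with-sbom", [["gem", "example-gem", "1.0.0"]])
legacy_only = Project.new("legacy-only", [])

puts LicenseScanningProxy.new(with_sbom, feature_enabled: true).report
puts LicenseScanningProxy.new(legacy_only, feature_enabled: true).report
puts LicenseScanningProxy.new(with_sbom, feature_enabled: false).report
```

Keeping the selection logic in one private method means callers keep using a single entry point while the flag is rolled out, and the legacy class can be removed later without touching them.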
Verification steps
Check new behavior:
- Enable feature flag.
- Set up a project supported by SBOM generators, and using package types supported by License DB.
- Add corresponding SBOM generators to the CI config.
- Trigger a pipeline.
- Check licenses in the License Compliance page.
Check fallback to legacy behavior:
- Enable feature flag.
- Set up a project supported by legacy License Scanning job.
- Include legacy CI template for License Scanning.
- Trigger a pipeline.
- Check licenses in the License Compliance page.
/cc @ifrenkel @hacks4oats