Add dependency graph information to Gemnasium's CycloneDX SBOMs
Problem to solve
Support for outputting SBOM files was added to gemnasium in 1 - Update gemnasium to output CycloneDX SBOMs (#350509 - closed).
However, we don't currently support outputting dependency graph information.
Right now, graph information is only available in the dependency_files field of the Dependency Scanning report, in the form of dependency paths.
This prevents us from removing the dependency_files field (#396376 (closed)) or the Dependency Scanning report (as part of &11219).
Dependency paths are supported for the following package managers:
Proposal
Implement the Dependency graph spec of the CycloneDX format in Gemnasium.
This approach has several upsides:
- This makes it possible to leverage existing tools that already report this data in this format, rather than building it from scratch in Gemnasium. For example, Trivy already provides that information by following this spec. This will be closely related to the outcome of Spike: Replace Gemnasium with open source nativ... (#434143).
- This gets us closer to supporting SBOM reports generated by 3rd parties and other custom jobs (there are other blockers for this though).
- This helps support future features based on the dependency graph information (e.g. dependency graph visualization, showing other ancestor paths, identifying critical components that are heavily depended upon within a project or group/company, etc.).
- While this format allows the backend to generate a full dependency graph and do a lot of things, the first iteration could be simpler and only on par with the currently provided dependency_path, which is actually one arbitrarily selected "shortest path" of ancestors. The Rails platform will be free to evolve at its own pace without requiring changes to the SBOM report.
- The full dependency graph could be generated asynchronously, after the SBOM ingestion. This would limit the impact on performance during SBOM ingestion and provide more flexibility in what we want to store. For instance, today only vulnerable dependencies have the dependency_path displayed. By generating the dependency graph asynchronously, we can know which components have associated vulnerabilities and keep the scope limited to those (to limit storage usage, if this is still a concern).
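To illustrate how a consumer of the graph could derive something on par with today's dependency_path, here is a minimal Go sketch of a breadth-first search that returns one shortest chain from a root to a given component. It assumes the graph has already been loaded as a map from a component ref to the refs it depends on (mirroring ref and dependsOn in the CycloneDX spec). The function name shortestPath and the sample refs are hypothetical; this is not code from Gemnasium or from the Rails platform.

```go
package main

import "fmt"

// shortestPath returns one shortest chain of refs from any of the roots down
// to target, using breadth-first search over a ref -> dependsOn map.
// When several shortest chains exist, the one found first wins, which depends
// on the order of the dependsOn slices. It returns nil if target is unreachable.
func shortestPath(graph map[string][]string, roots []string, target string) []string {
	parent := map[string]string{} // discovered ref -> ref it was reached from
	visited := map[string]bool{}
	queue := []string{}
	for _, r := range roots {
		if !visited[r] {
			visited[r] = true
			queue = append(queue, r)
		}
	}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if cur == target {
			// Walk back to a root (roots have no parent entry), then reverse.
			path := []string{cur}
			for {
				p, ok := parent[cur]
				if !ok {
					break
				}
				path = append(path, p)
				cur = p
			}
			for i, j := 0, len(path)-1; i < j; i, j = i+1, j-1 {
				path[i], path[j] = path[j], path[i]
			}
			return path
		}
		for _, next := range graph[cur] {
			if !visited[next] {
				visited[next] = true
				parent[next] = cur
				queue = append(queue, next)
			}
		}
	}
	return nil
}

func main() {
	// Hypothetical graph: a depends on b and c, which both depend on d.
	graph := map[string][]string{
		"pkg:npm/a@1.0.0": {"pkg:npm/b@1.0.0", "pkg:npm/c@1.0.0"},
		"pkg:npm/b@1.0.0": {"pkg:npm/d@1.0.0"},
		"pkg:npm/c@1.0.0": {"pkg:npm/d@1.0.0"},
	}
	fmt.Println(shortestPath(graph, []string{"pkg:npm/a@1.0.0"}, "pkg:npm/d@1.0.0"))
}
```

Because d is reachable through both b and c at the same depth, the result is one arbitrary shortest path, which matches the behavior described for dependency_path above.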
Implementation plan
Dependency graph information can be generated directly using scanner.File.Dependencies because the cyclonedx package can generate a purl and a sbom-ref directly from these.
There is no conflict because Gemnasium generates one CDX SBOM per input scanner.File.
It isn't useful to process the internal dependency graph that's currently used to generate the .dependency_files field of the Dependency Scanning report.
- Update Gemnasium.
  - Update cyclonedx.ToSBOMs function to generate the dependencies field.
    - Group scanner.File.Dependencies by dependent.
    - For each dependent package,
      - Skip top-level dependencies, that is dependencies where parser.Dependency.From is nil.
      - Convert parser.Dependency.From to .dependencies[].ref using the purl function.
      - Convert parser.Dependency.To to .dependencies[].dependsOn item using the purl function.
  - Update unit tests.
  - Update expected CDX SBOMs, or create missing ones.
    - https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/blob/dfd8e571fe9fb44a28419209e024431db000329a/qa/expect/c-conan/default/gl-sbom-conan-conan.cdx.json
    - https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/blob/dfd8e571fe9fb44a28419209e024431db000329a/qa/expect/csharp-nuget-dotnetcore/default/src/web.api/gl-sbom-nuget-nuget.cdx.json
    - https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/tree/dfd8e571fe9fb44a28419209e024431db000329a/qa/expect/js-yarn (multiple SBOMs)
    - https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/tree/dfd8e571fe9fb44a28419209e024431db000329a/qa/expect/scala-sbt (missing)
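The grouping steps in this plan could look roughly like the Go sketch below. It is only an illustration under stated assumptions: Package and Dependency are simplified stand-ins for Gemnasium's parser types, GraphEntry and buildGraph are hypothetical names, and the purl helper here is a naive formatter rather than the real purl function (which has to handle namespaces and escaping per package type). Sorting the output is an extra assumption, added so that generated SBOMs stay stable when compared against expected files.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Package and Dependency are simplified stand-ins for Gemnasium's parser types.
type Package struct {
	Name    string
	Version string
}

type Dependency struct {
	From *Package // nil for top-level dependencies
	To   *Package
}

// GraphEntry mirrors one item of the CycloneDX "dependencies" array.
type GraphEntry struct {
	Ref       string   `json:"ref"`
	DependsOn []string `json:"dependsOn"`
}

// purl is a hypothetical, naive reference builder used only for this sketch.
func purl(pkgType string, p *Package) string {
	return fmt.Sprintf("pkg:%s/%s@%s", pkgType, p.Name, p.Version)
}

// buildGraph groups dependencies by dependent and returns one entry per
// dependent package, sorted by ref for deterministic output.
func buildGraph(pkgType string, deps []Dependency) []GraphEntry {
	grouped := map[string][]string{}
	for _, d := range deps {
		if d.From == nil || d.To == nil {
			continue // skip top-level dependencies: they have no dependent
		}
		ref := purl(pkgType, d.From)
		grouped[ref] = append(grouped[ref], purl(pkgType, d.To))
	}
	entries := make([]GraphEntry, 0, len(grouped))
	for ref, dependsOn := range grouped {
		sort.Strings(dependsOn)
		entries = append(entries, GraphEntry{Ref: ref, DependsOn: dependsOn})
	}
	sort.Slice(entries, func(i, j int) bool { return entries[i].Ref < entries[j].Ref })
	return entries
}

func main() {
	a := &Package{Name: "a", Version: "1.0.0"}
	b := &Package{Name: "b", Version: "2.0.0"}
	c := &Package{Name: "c", Version: "3.0.0"}
	deps := []Dependency{
		{From: nil, To: a}, // top-level, skipped
		{From: a, To: c},
		{From: a, To: b},
		{From: b, To: c},
	}
	out, err := json.MarshalIndent(buildGraph("npm", deps), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In this sketch, packages that never appear as a dependent (such as c) get no entry of their own. Whether leaf components should be emitted with an empty dependsOn list is a separate decision that the plan above does not cover.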
TBD: Introduce a new CI/env variable that enables (or disables) the feature.
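As a starting point for that discussion, a toggle could be read along the lines of the Go sketch below. The variable name DS_EXAMPLE_SBOM_DEPENDENCY_GRAPH and the function names are purely hypothetical placeholders, since the actual name and default are still to be decided. The sketch falls back to a default when the variable is unset or cannot be parsed as a boolean.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envDependencyGraph is a placeholder name; the real variable is TBD.
const envDependencyGraph = "DS_EXAMPLE_SBOM_DEPENDENCY_GRAPH"

// flagValue interprets a raw variable value. It returns def when the
// variable is not set or when the value is not a valid boolean.
func flagValue(raw string, set bool, def bool) bool {
	if !set {
		return def
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return def
	}
	return enabled
}

// dependencyGraphEnabled reads the placeholder variable from the environment.
func dependencyGraphEnabled(def bool) bool {
	raw, set := os.LookupEnv(envDependencyGraph)
	return flagValue(raw, set, def)
}

func main() {
	fmt.Println("dependency graph enabled:", dependencyGraphEnabled(false))
}
```

Keeping the parsing in a pure function makes the default behavior easy to unit test, whichever way the feature ends up being toggled.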