Store package manager and input file path Sbom::Occurrence fields when components are detected by Trivy

Why are we doing this work

As discussed in #421041 (comment 1652527636), we need access to some Trivy-specific properties to populate dependency attributes such as the package manager and location. To retrieve these fields later, we need to store them in the database.

Relevant links

Non-functional requirements

  • Documentation: The supported Trivy properties should be documented.
  • Feature flag:
  • Performance:
  • Testing:

Proposals

Parse and ingest components[].properties[] field into database

Pros

  • 👍 Properties can be output in SBOM output to recreate the original SBOM

Cons

  • 👎 Increased complexity when querying and indexing a field in the jsonb column
  • 👎 Increased space usage
  • 👎 Database review is required

Only parse components[].properties[] field and use the data to store the packager and input_file_path fields (chosen)

Pros

  • 👍 No increase in space usage per row in sbom_occurrences
  • 👍 No need to add a separate feature flag to stop the ingestion of the props in the database
  • 👍 No increase in query complexity or need to add an index
  • 👍 Faster MR review due to lack of database review step

Cons

  • 👎 We cannot trace where the property came from once it's in the database
  • 👎 The sbom_source properties are not in sync with those in the occurrence entity. While not ideal, we can fix this in the future by ingesting the properties into a component's own SBOM source, which would synchronize the data on both ends.

Implementation plan

  • Update the Gitlab::Ci::Reports::Sbom::SourceHelper module's #packager and #input_file_path methods so that, when the source type is :trivy, they resolve the package manager from the aquasecurity:trivy:PkgType value mapping and the input file path from the aquasecurity:trivy:FilePath value.
  • Update the ingestion code so that the Sbom::Occurrence model can access these fields. The following are the steps required, working from the beginning of the ingestion pipeline to the end (where the upsert happens).
    • Update the Sbom::Ingestion::OccurrenceMap class
      • Update the #packager and #input_file_path methods so that they first attempt to get the related fields from the report component's #properties, and then fall back to the #source attributes.
    • Update the Sbom::Ingestion::Tasks::IngestOccurrences class
      • The package_manager and input_file_path attributes should come from the occurrence map and not from the source.
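
The lookup order described in the plan above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the GitLab implementation: the Property, Component, and Source structs, the TrivyPropertySketch module, the OccurrenceMapSketch class, and the contents of the PkgType mapping are all hypothetical stand-ins. Only the two aquasecurity:trivy:* property names come from this issue.

```ruby
# Illustrative stand-ins; the real classes live in the GitLab codebase.
Property  = Struct.new(:name, :value, keyword_init: true)
Component = Struct.new(:name, :properties, keyword_init: true)
Source    = Struct.new(:packager, :input_file_path, keyword_init: true)

module TrivyPropertySketch
  PKG_TYPE_KEY  = 'aquasecurity:trivy:PkgType'
  FILE_PATH_KEY = 'aquasecurity:trivy:FilePath'

  # Hypothetical mapping; the real PkgType-to-package-manager table
  # is not given in this issue.
  PACKAGER_BY_PKG_TYPE = {
    'npm' => 'npm',
    'bundler' => 'bundler',
    'gomod' => 'go'
  }.freeze

  # Returns the value of the first property with the given name, or nil.
  def self.property_value(component, key)
    Array(component.properties).find { |property| property.name == key }&.value
  end

  # Maps the Trivy PkgType to a package manager name; nil when unmapped.
  def self.packager(component)
    PACKAGER_BY_PKG_TYPE[property_value(component, PKG_TYPE_KEY)]
  end

  def self.input_file_path(component)
    property_value(component, FILE_PATH_KEY)
  end
end

# Prefers values from the component's properties, then falls back to the source.
class OccurrenceMapSketch
  def initialize(component, source)
    @component = component
    @source = source
  end

  def packager
    TrivyPropertySketch.packager(@component) || @source&.packager
  end

  def input_file_path
    TrivyPropertySketch.input_file_path(@component) || @source&.input_file_path
  end
end

component = Component.new(
  name: 'lodash',
  properties: [
    Property.new(name: 'aquasecurity:trivy:PkgType', value: 'npm'),
    Property.new(name: 'aquasecurity:trivy:FilePath', value: 'package-lock.json')
  ]
)
source = Source.new(packager: 'yarn', input_file_path: 'yarn.lock')
map = OccurrenceMapSketch.new(component, source)

puts map.packager        # npm
puts map.input_file_path # package-lock.json
```

In this sketch an unmapped or missing PkgType falls through to the source's packager. Whether the real code should fall back or return nil in that case is a design decision this issue does not settle.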

Verification steps

  1. Enable the ingest_trivy_cdx_properties feature flag globally in your GDK.
  2. Add a Trivy SBOM with the supported properties to a project.
  3. Query the components for the project, and verify that calling properties on the components returns all the Trivy properties that were included.
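
For step 2, a small fixture can be generated by hand if no real Trivy report is available. The script below is a sketch only: the component, its version, and the file path are invented for illustration, the document is not captured from an actual Trivy run, and this issue does not confirm that ingestion accepts this exact shape. The two aquasecurity:trivy:* property names are the ones listed in the implementation plan.

```ruby
require 'json'

# Hand-written CycloneDX fixture; values are illustrative, not real Trivy output.
sbom = {
  bomFormat: 'CycloneDX',
  specVersion: '1.4',
  version: 1,
  metadata: {
    # Assumed tool entry identifying Trivy as the generator.
    tools: [{ vendor: 'aquasecurity', name: 'trivy' }]
  },
  components: [
    {
      type: 'library',
      name: 'lodash',
      version: '4.17.21',
      purl: 'pkg:npm/lodash@4.17.21',
      properties: [
        { name: 'aquasecurity:trivy:PkgType', value: 'npm' },
        { name: 'aquasecurity:trivy:FilePath', value: 'package-lock.json' }
      ]
    }
  ]
}

fixture_json = JSON.pretty_generate(sbom)
puts fixture_json
```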
Edited by Oscar Tovar