Ingest source package name from Trivy SBOM component properties
Proposal
As discussed here, when looking up advisories for a package, trivy first uses the source package, if available, and falls back to the package name. For example, the package libperl5.38 has Source: perl listed in the dpkg manifest:
$ docker run -it --rm registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium/tmp/python:5db727fd3df8c65d8d85ed470ee79624d728217c bash
root@547bb47b6a06:/# grep -A 9 'Package: libperl5.38' /var/lib/dpkg/status
Package: libperl5.38
Status: install ok installed
Priority: optional
Section: libs
Installed-Size: 29325
Maintainer: Niko Tyni <ntyni@debian.org>
Architecture: amd64
Multi-Arch: same
Source: perl <--------------------------------------- SOURCE PACKAGE IS `perl`
Version: 5.38.0-2
As such, the trivy-db that we use as the source of advisories does not contain vulnerability information for the package libperl5.38; it instead contains advisory information for the source package perl.
When trivy scans an image, if a source package has a vulnerability, trivy considers all packages that have the same source package as being vulnerable.
For example, if perl <= 5.38.0-2 is vulnerable to a particular CVE, then the following packages are also vulnerable, because they all list perl as the source package:
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | libperl5.38 | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | perl | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | perl-base | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
| Unapproved | High | perl-modules-5.38 | 5.38.0-2 | CPAN.pm before 2.35 does not verify TLS certificates when downloading |
| | | | | distributions over HTTPS. |
+------------+--------------+---------------------------+-------------------+------------------------------------------------------------------------+
(see this job for details)
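The lookup rule described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not trivy's actual implementation: the `Package` struct, the `advisory_lookup_name` and `vulnerable_packages` helpers, and the `unrelated-example` package are hypothetical, and version range comparison is left out entirely.

```ruby
# Hypothetical stand-in for an installed OS package; not a trivy or GitLab type.
Package = Struct.new(:name, :source_name, :version, keyword_init: true)

# Advisories are looked up by source package name when one is known,
# otherwise by the package's own name.
def advisory_lookup_name(package)
  package.source_name || package.name
end

# Returns every installed package whose lookup name appears in the advisories.
# Version range checks are deliberately omitted from this sketch.
def vulnerable_packages(installed, advisory_names)
  installed.select { |pkg| advisory_names.include?(advisory_lookup_name(pkg)) }
end

installed = [
  Package.new(name: 'libperl5.38', source_name: 'perl', version: '5.38.0-2'),
  Package.new(name: 'perl', source_name: nil, version: '5.38.0-2'),
  Package.new(name: 'perl-base', source_name: 'perl', version: '5.38.0-2'),
  # Hypothetical package with no advisory, included for contrast.
  Package.new(name: 'unrelated-example', source_name: nil, version: '1.0')
]

affected = vulnerable_packages(installed, ['perl']).map(&:name)
puts affected.inspect
```

A single advisory entry for perl flags libperl5.38, perl, and perl-base, while the unrelated package is left alone.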
When we ingest an SBOM for Container Scanning, we currently only store the following fields:
- type
- name
- purl
- version
For example, for the package libperl5.38, we have the following fields and values:
| Field | Value |
|---|---|
| type | library |
| name | libperl5.38 |
| purl | pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1 |
| version | 5.38.0-2 |
This presents a problem. As stated earlier, the trivy-db does not contain affected package information for libperl5.38, only for the source package perl, and we currently have no way of correlating libperl5.38 with perl from the above details alone.
The source SBOM does contain this information in the properties field; we just don't currently ingest it.
For example, trivy produces an SBOM with the source package perl in the aquasecurity:trivy:SrcName property:
{
  "components": [
    {
      "bom-ref": "pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1",
      "type": "library",
      "name": "libperl5.38",
      "version": "5.38.0-2",
      "purl": "pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1",
      "properties": [
        {
          "name": "aquasecurity:trivy:SrcName",
          "value": "perl"
        }
      ]
    }
  ]
}
And, syft produces an SBOM with the source package perl in the syft:metadata:source property:
{
  "components": [
    {
      "bom-ref": "pkg:deb/debian/libperl5.38@5.38.0-2?arch=amd64&upstream=perl&distro=debian-12&package-id=c2dfca7103136fcb",
      "type": "library",
      "publisher": "Niko Tyni <ntyni@debian.org>",
      "name": "libperl5.38",
      "version": "5.38.0-2",
      "cpe": "cpe:2.3:a:libperl5.38:libperl5.38:5.38.0-2:*:*:*:*:*:*:*",
      "purl": "pkg:deb/debian/libperl5.38@5.38.0-2?arch=amd64&upstream=perl&distro=debian-12",
      "properties": [
        {
          "name": "syft:metadata:source",
          "value": "perl"
        }
      ]
    }
  ]
}
To match packages such as libperl5.38 against advisories in the trivy-db for the source package perl, we need to update the SBOM ingestion code in the Rails monolith so that it also stores the source package name from component.properties. This issue covers trivy-produced SBOMs only.
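As a rough illustration of the extraction step, the following standalone Ruby sketch reads the aquasecurity:trivy:SrcName property from a parsed CycloneDX component. It is not the monolith's parser: the `source_package_name` helper, the `TRIVY_SOURCE_NAME_PROPERTY` constant, and the `no-properties-example` component are hypothetical names introduced here, and only the JSON standard library is assumed.

```ruby
require 'json'

# Property name trivy uses for the source package (taken from the SBOM above).
TRIVY_SOURCE_NAME_PROPERTY = 'aquasecurity:trivy:SrcName'

# Returns the source package name for one CycloneDX component hash,
# or nil when the component has no properties or no matching property.
def source_package_name(component)
  properties = component['properties'] || []
  property = properties.find { |prop| prop['name'] == TRIVY_SOURCE_NAME_PROPERTY }
  property && property['value']
end

sbom = JSON.parse(<<~SBOM)
  {
    "components": [
      {
        "type": "library",
        "name": "libperl5.38",
        "version": "5.38.0-2",
        "purl": "pkg:deb/debian/libperl5.38@5.38.0-2?distro=debian-12.1",
        "properties": [
          { "name": "aquasecurity:trivy:SrcName", "value": "perl" }
        ]
      },
      { "type": "library", "name": "no-properties-example", "version": "1.0" }
    ]
  }
SBOM

# Map each component name to its source package name (nil when absent).
names = sbom['components'].to_h { |c| [c['name'], source_package_name(c)] }
puts names.inspect
```

Returning nil for components without properties keeps the caller free to skip them, which matters because not every component carries a source package.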
Proposals
Previous implementation plan
- Add a new source_package_name field to Gitlab::Ci::Reports::Sbom::Component.
- Add a new source_package_name field to the Sbom::ComponentVersion model:
  - Create a migration to add source_package_name to the sbom_component_versions table.
  - Add a new index to the sbom_component_versions table:

    Note: the previous implementation plan was about adding a field to sbom_components; please see this thread.

    Original index suggestion, which doesn't work:

    "index_sbom_components_on_component_type_source_package_name_and_purl_type" UNIQUE, btree (source_package_name, purl_type, component_type)

    Note: there's a problem with this index due to the UNIQUE keyword, as explained here. Because of this, we'll need to remove the UNIQUE keyword, as shown in the revised index below.

    Revised index (as discussed here):

    "index_sbom_components_on_component_type_source_package_name_and_purl_type" btree (source_package_name, purl_type, component_type)
- Update Gitlab::Ci::Parsers::Sbom::Cyclonedx#parse_components to ingest the components[].properties[].aquasecurity:trivy:SrcName value and store it in sbom_components.source_package_name.
- Add unit tests.
Implementation Plan
- Add a new sbom_source_packages table:
  - Add sbom_source_packages table (!140539 - merged) • Adam Cohen • 16.8
  - Add timestamps to the sbom_source_packages table to enable the ingestion process; the ingestion framework requires the table to have timestamps. Add timestamp for sbom_source_packages (!142006 - merged) • Tetiana Chupryna • 16.9
- Add a source_package_name method to the Sbom::SourceHelper module. It returns the value of data['SrcName'].
- Delegate the source_package_name method to the properties and allow nil (components may not have any properties):

      delegate :source_package_name, to: :properties, allow_nil: true
- Update the Sbom::Ingestion::OccurrenceMap class so that it includes a source_package_id accessor. Update the #to_h method so that it outputs source_package_id: source_package_id in the resulting hash. Delegate #source_package_name to the :report_component.
- Add a new task to the Sbom::Ingestion::Tasks namespace. This task will include the Gitlab::Ingestion::BulkInsertableTask module.
  - Name the task IngestSourcePackageNames.
  - Set self.model to Sbom::SourcePackage.
  - Set self.uses to %i[name purl_type id].freeze. The :id will be used to set the source_package_id column, and the :name and :purl_type are used as the key for the :id value in a @maps_grouped_by_uniq_attrs hash map.
  - Set self.unique_by to %i[name purl_type].freeze.
  - Add an #attributes method that returns an array of hashes like so:

        occurrence_maps.filter(&:source_package_name).map do |occurrence_map|
          { name: occurrence_map.source_package_name, purl_type: occurrence_map.purl_type }
        end
  - Add an after_ingest method that sets the returned id value as the source_package_id using the values from @maps_grouped_by_uniq_attrs. See Sbom::Ingestion::Tasks::IngestComponents for an example implementation.
- Update the IngestReportSliceService::TASKS array. Add the newly created IngestSourcePackageNames before the IngestOccurrences task.
- Update the Sbom::Ingestion::Tasks::IngestOccurrences attributes so that they include source_package_id: occurrence_map.source_package_id in the hash output.
- Ensure that the related specs are updated. The following files in ee/spec/services/sbom/ingestion/ will be affected:
  - occurrence_map_spec.rb - test that the source_package_id is assigned in "when ids are assigned" and that it delegates the source_package_name correctly.
  - tasks/ingest_occurrences_spec.rb - ensure that the #attributes method sets the source_package_id attribute correctly when it's nil and when it's not nil.
  - tasks/ingest_source_packages_spec.rb - ensure that it is idempotent, that the unique_by constraints are used, that the correct attributes are used (nil source package names are removed), and that the expected attributes are set after ingest.
    - For example, you could verify that the perl and perl-base components both have the same source_package_id set because they both belong to the perl source package.
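The flow of the new ingestion task can be sketched outside Rails as follows. This is a simplified illustration, not the real task: `OccurrenceMapSketch`, `FakeSourcePackageStore`, and `IngestSourcePackageNamesSketch` are hypothetical stand-ins, and the in-memory store only imitates an upsert that is unique on (name, purl_type). The actual Gitlab::Ingestion::BulkInsertableTask behaviour is assumed, not reproduced.

```ruby
# Hypothetical, simplified stand-in for Sbom::Ingestion::OccurrenceMap.
OccurrenceMapSketch = Struct.new(:name, :purl_type, :source_package_name,
                                 :source_package_id, keyword_init: true)

# Hypothetical in-memory replacement for a bulk upsert that is unique on
# (name, purl_type). Returns one row, including :id, per distinct input hash.
class FakeSourcePackageStore
  def initialize
    @rows = {}
  end

  def upsert_all(attribute_hashes)
    attribute_hashes.uniq.map do |attrs|
      key = [attrs[:name], attrs[:purl_type]]
      @rows[key] ||= attrs.merge(id: @rows.size + 1)
    end
  end

  def count
    @rows.size
  end
end

class IngestSourcePackageNamesSketch
  def initialize(occurrence_maps, store)
    @occurrence_maps = occurrence_maps
    @store = store
  end

  def execute
    after_ingest(@store.upsert_all(attributes))
  end

  private

  # Occurrences without a source package name are skipped.
  def attributes
    @occurrence_maps.filter(&:source_package_name).map do |occurrence_map|
      { name: occurrence_map.source_package_name, purl_type: occurrence_map.purl_type }
    end
  end

  # Copies each returned id onto every occurrence map sharing the same key.
  def after_ingest(rows)
    ids_by_key = rows.to_h { |row| [[row[:name], row[:purl_type]], row[:id]] }
    @occurrence_maps.each do |occurrence_map|
      key = [occurrence_map.source_package_name, occurrence_map.purl_type]
      occurrence_map.source_package_id = ids_by_key[key]
    end
  end
end

maps = [
  OccurrenceMapSketch.new(name: 'perl', purl_type: 'deb', source_package_name: 'perl'),
  OccurrenceMapSketch.new(name: 'perl-base', purl_type: 'deb', source_package_name: 'perl'),
  OccurrenceMapSketch.new(name: 'no-source-example', purl_type: 'deb')
]
store = FakeSourcePackageStore.new
# Running twice mimics re-ingesting the same report.
2.times { IngestSourcePackageNamesSketch.new(maps, store).execute }
```

After both runs the store holds a single perl source package, perl and perl-base share its id, and the component without a source package keeps a nil source_package_id. These are the same properties the specs listed above are meant to check.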
Validation testing
- Validate Update PossiblyAffectedOccurrencesFinder to wor... (#428681 - closed).
- Create a project with the following content:
.gitlab-ci.yml

    variables:
      CS_IMAGE: 'golang:1.20-alpine'
    include:
      - template: Jobs/Container-Scanning.gitlab-ci.yml
- Run a pipeline and make sure that the container_scanning:cyclonedx report is created.
GDK
In the Rails console, run:

    Sbom::ComponentVersion.where(component: Sbom::Component.find_by(name: 'alpine-baselayout-data'))

Check that the source_package_name field is equal to alpine-baselayout.
GitLab.com
After deployment, validate that no new errors are logged and that there is no regression in the Group Dependency List.
/cc @gonzoyumo @smeadzinger @fcatteau