
Advisories stored in vulnerability_advisories apply to all package types (collision)

Problem to solve

Security advisories stored in the DB table vulnerability_advisories don't specify the PURL type of the affected package, so an advisory matches every package with the same name regardless of package type (collision).

vulnerability_advisories will be used to perform continuous vulnerability scans. See &9534 (closed)

Further details

Right now vulnerability_advisories.component_name is the only DB column that references the component affected by a vulnerability. However, this is not sufficient to uniquely identify a component, and advisories should also have a PURL type to avoid collisions.
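The collision can be illustrated with a minimal sketch. It uses an in-memory SQLite table as a simplified stand-in for vulnerability_advisories; the advisory_id column, the sample package name, and the helper functions are hypothetical, and purl_type is stored as text here only for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the current table: component_name is the only
# column that identifies the affected package (advisory_id is hypothetical).
conn.execute(
    "CREATE TABLE vulnerability_advisories (advisory_id TEXT, component_name TEXT)"
)
# This advisory is meant for the PyPI package "requests" only.
conn.execute("INSERT INTO vulnerability_advisories VALUES ('ADV-1', 'requests')")


def advisories_by_name(name):
    """Look up advisories the way the current schema allows: by name only."""
    rows = conn.execute(
        "SELECT advisory_id FROM vulnerability_advisories WHERE component_name = ?",
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


# Adding a PURL type column makes the lookup unambiguous.
conn.execute("ALTER TABLE vulnerability_advisories ADD COLUMN purl_type TEXT")
conn.execute("UPDATE vulnerability_advisories SET purl_type = 'pypi'")


def advisories_by_type_and_name(purl_type, name):
    """Look up advisories by (purl_type, component_name)."""
    rows = conn.execute(
        "SELECT advisory_id FROM vulnerability_advisories "
        "WHERE purl_type = ? AND component_name = ?",
        (purl_type, name),
    ).fetchall()
    return [row[0] for row in rows]
```

With the name-only lookup, an npm component that happens to be called "requests" would match ADV-1 as well; the lookup by type and name returns ADV-1 only for the pypi package.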

If the DB already contains at least one affected version when an advisory is added, we can use sbom_vulnerable_component_versions (relation table) to get the ID of the affected component. However, when ingesting a vulnerability advisory there's no guarantee that the DB already contains affected versions. (Also, advisories specify affected ranges, from which it might not be possible to infer any concrete affected version number.) It might thus be impossible to link the advisory being ingested to the affected component via sbom_vulnerable_component_versions, even if we're able to insert into sbom_components using the raw ingested data.

Proposal

We need to pick one of the following proposals.

  1. Add DB column vulnerability_advisories.purl_type to store the PURL type.

    Combined with vulnerability_advisories.component_name, purl_type identifies the affected package without collision. This has been proposed as part of #364576 (closed) (component_type column). NOTE: We might have to normalize package names when comparing vulnerability_advisories.component_name to sbom_components.name.

  2. Add DB column vulnerability_advisories.component_id to reference rows of sbom_components (FK), and drop vulnerability_advisories.component_name.

    A vulnerability advisory applies to exactly one SBOM component. SBOM components need to be upserted when adding new vulnerability advisories.

  3. Add DB column vulnerability_advisories.pm_package_id to reference rows of pm_packages (FK), drop vulnerability_advisories.component_name, and rename table to pm_vulnerability_advisories.

    A vulnerability advisory applies to exactly one package referenced by License DB. Packages need to be upserted when adding new vulnerability advisories, unless the vulnerability advisories are also imported from License DB. See Spike: How do we sync the backend with a source... (#394723 - closed). The vulnerability advisories move to the package metadata tables, and are thus renamed with the pm_ prefix.
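The name normalization mentioned in the note on proposal 1 could look roughly like the sketch below. It is only an illustration: the function names are hypothetical, the pypi rule follows the PEP 503 name normalization, and leaving every other PURL type unchanged is an assumption that would need to be checked per package type.

```python
import re


def normalize_package_name(purl_type: str, name: str) -> str:
    """Return a comparable form of a package name for the given PURL type."""
    if purl_type == "pypi":
        # PEP 503: runs of "-", "_" and "." are equivalent; names are case-insensitive.
        return re.sub(r"[-_.]+", "-", name).lower()
    # Assumption: other package types are compared verbatim.
    return name


def same_package(advisory: tuple, component: tuple) -> bool:
    """Compare (purl_type, name) pairs from an advisory and an SBOM component."""
    advisory_type, advisory_name = advisory
    component_type, component_name = component
    if advisory_type != component_type:
        return False
    return normalize_package_name(advisory_type, advisory_name) == normalize_package_name(
        component_type, component_name
    )
```

For example, same_package(("pypi", "Zope.Interface"), ("pypi", "zope-interface")) is true, while the same name under two different PURL types never matches.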

Implementation Plan

After some research and discussion it appears that the best way forward is to split advisory storage into 2 tables (see research spike).

It's easier to re-create the vulnerability_advisories columns as the relevant fields in the 2 new tables and copy the data over than to rename the existing table and columns in place.

  • migration to create 2 tables
    • pm_advisories (storing generic advisory information)
    • pm_affected_packages (foreign key to pm_advisories and storing data on how a specific package is affected by an advisory)
    • ensure constraints reflect data in the GitLab Advisory Database (see #406596 (closed))
    • add overridden_advisory_fields column to pm_affected_packages (e.g. an affected package has a description different from the parent advisory)
  • copy and break up Vulnerabilities::Advisory model into 2 models for the tables above
  • add more complex validation
  • migration to drop unused tables vulnerability_advisories and sbom_vulnerable_component_versions
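As a rough sketch of the two-table split, the example below models pm_advisories and pm_affected_packages in an in-memory SQLite database. Only the table names, the foreign key, and overridden_advisory_fields come from the plan above; every other column name, the unique constraint, the sample data, and the effective_advisories helper are assumptions made for illustration, not the actual GitLab schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE pm_advisories (            -- generic advisory information
        id INTEGER PRIMARY KEY,
        advisory_xid TEXT NOT NULL UNIQUE,  -- hypothetical external identifier
        title TEXT NOT NULL,
        description TEXT NOT NULL
    );
    CREATE TABLE pm_affected_packages (     -- how one package is affected
        id INTEGER PRIMARY KEY,
        pm_advisory_id INTEGER NOT NULL REFERENCES pm_advisories(id),
        purl_type TEXT NOT NULL,            -- text here for readability
        package_name TEXT NOT NULL,
        affected_range TEXT NOT NULL,
        overridden_advisory_fields TEXT NOT NULL DEFAULT '{}',  -- JSON object
        UNIQUE (pm_advisory_id, purl_type, package_name)
    );
    """
)

advisory_id = conn.execute(
    "INSERT INTO pm_advisories (advisory_xid, title, description) VALUES (?, ?, ?)",
    ("ADV-1", "Example issue", "Generic description"),
).lastrowid
conn.executemany(
    "INSERT INTO pm_affected_packages (pm_advisory_id, purl_type, package_name,"
    " affected_range, overridden_advisory_fields) VALUES (?, ?, ?, ?, ?)",
    [
        (advisory_id, "pypi", "example-lib", "<1.2.0", "{}"),
        (advisory_id, "npm", "example-lib", "<3.0.0",
         json.dumps({"description": "npm-specific description"})),
    ],
)


def effective_advisories(purl_type, package_name):
    """Return advisory fields for a package, with per-package overrides applied."""
    rows = conn.execute(
        "SELECT a.title, a.description, p.affected_range, p.overridden_advisory_fields"
        " FROM pm_affected_packages p"
        " JOIN pm_advisories a ON a.id = p.pm_advisory_id"
        " WHERE p.purl_type = ? AND p.package_name = ?",
        (purl_type, package_name),
    ).fetchall()
    results = []
    for title, description, affected_range, overrides in rows:
        fields = {"title": title, "description": description,
                  "affected_range": affected_range}
        fields.update(json.loads(overrides))  # overridden fields win over the parent
        results.append(fields)
    return results
```

In this sketch one advisory row is shared by the pypi and npm packages, and only the npm row carries a description that differs from the parent advisory.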

Old plan

Proposal 1 is chosen because it avoids package discrepancies (e.g. missing packages) between the various data sources (see discussion thread for details).

  • create migration on vulnerability_advisories
    • add purl_type column (smallint)
    • rename table to pm_vulnerability_advisories
    • rename component_name column to package_name
Edited by Igor Frenkel