Generation expectations for version matching using Gemnasium and GLAD

Problem to solve

Continuous Vulnerability Scanning (CVS) uses SemverDialects for version matching. However, there are unsupported edge cases, and SemverDialects isn't fully consistent with Gemnasium (which runs in the Dependency Scanning job) and its vrange library. This might result in false positives and false negatives in CVS.

Proposal

See #386072 (comment 1695853444)

Create a tool that generates expectations for version matching using the following components:

  • Gemnasium's vrange package
  • gemnasium-db AKA the GitLab Advisory Database (GLAD)
  • CSV exports of licenses produced by the license-exporter; they're used to get package versions

The tool is implemented in Go, and does the following:

  1. Go through all the YAML files of gemnasium-db, and collect the following information:
    • package type
    • package name
    • affected range
  2. Go through the CSV files of the license DB, and collect all the versions of the packages collected in step 1.
  3. For each affected range,
    • For each package version,
      • Do version matching using the vrange library.
      • Output the result:
        • package type
        • version range
        • version
        • result of version matching
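
Step 3 could be sketched as follows. This is only an illustration under assumptions: the Advisory and Expectation types, the Matcher signature, and the "type/name" map key are all hypothetical, and toyMatcher is a stand-in that only understands ranges like "<X.Y.Z". The real tool would call Gemnasium's vrange package instead, whose API is not reproduced here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Advisory holds what step 1 collects from one gemnasium-db YAML file.
// Field names are hypothetical.
type Advisory struct {
	PackageType   string
	PackageName   string
	AffectedRange string
}

// Expectation is one output record of step 3.
type Expectation struct {
	PackageType string
	Range       string
	Version     string
	InRange     bool
}

// Matcher stands in for the vrange library (assumed signature, not the real API).
type Matcher func(pkgType, affectedRange, version string) (bool, error)

// Generate matches every collected version of a package against every
// affected range of that package. versions is keyed by "type/name".
func Generate(advisories []Advisory, versions map[string][]string, match Matcher) ([]Expectation, error) {
	var out []Expectation
	for _, adv := range advisories {
		key := adv.PackageType + "/" + adv.PackageName
		for _, v := range versions[key] {
			ok, err := match(adv.PackageType, adv.AffectedRange, v)
			if err != nil {
				return nil, fmt.Errorf("matching %s against %q: %w", v, adv.AffectedRange, err)
			}
			out = append(out, Expectation{
				PackageType: adv.PackageType,
				Range:       adv.AffectedRange,
				Version:     v,
				InRange:     ok,
			})
		}
	}
	return out, nil
}

// parseVersion parses a plain "X.Y.Z" version into three integers.
func parseVersion(s string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected X.Y.Z, got %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, err
		}
		out[i] = n
	}
	return out, nil
}

// toyMatcher is a placeholder for vrange: it only handles "<X.Y.Z" ranges.
func toyMatcher(_, affectedRange, version string) (bool, error) {
	if !strings.HasPrefix(affectedRange, "<") {
		return false, fmt.Errorf("unsupported range %q", affectedRange)
	}
	bound, err := parseVersion(strings.TrimPrefix(affectedRange, "<"))
	if err != nil {
		return false, err
	}
	v, err := parseVersion(version)
	if err != nil {
		return false, err
	}
	for i := range v {
		if v[i] != bound[i] {
			return v[i] < bound[i], nil
		}
	}
	return false, nil // equal to the bound, so not strictly below it
}

func main() {
	advisories := []Advisory{{PackageType: "npm", PackageName: "example", AffectedRange: "<1.2.0"}}
	versions := map[string][]string{"npm/example": {"1.1.9", "1.2.0", "2.0.0"}}
	exps, err := Generate(advisories, versions, toyMatcher)
	if err != nil {
		panic(err)
	}
	for _, e := range exps {
		fmt.Printf("%s %s %s %t\n", e.PackageType, e.Range, e.Version, e.InRange)
	}
}
```

Keeping the matcher behind a function type means the loop can be tested without the vrange CLIs installed, and the real matcher can be swapped in later.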

Output format might be CSV or NDJSON.

If NDJSON is used, we might group versions by all the other fields (type, range, result). The procedure would then become:

  • For each affected range,
    • For each package version,
      • Add vrange query.
    • Run vrange queries.
    • Output the result:
      • package type
      • version range
      • versions in range
      • versions NOT in range

Implementation plan

  • Create a Go project that implements the proposal.
  • Document usage and/or provide automation.
  • Document output format.
  • Upload files.
Edited by Fabien Catteau