Add version parsing errors to CVS metrics

Problem to solve

Metrics for Continuous Vulnerability Scanning (advisory scans) count the following:

  • "possibly affected" projects and SBOM occurrences; these match the affected package
  • "known affected" projects and SBOM occurrences; these also match the affected version

However, we don't know how many projects and SBOM occurrences are not considered affected because their version can't be compared to the affected range of versions. In particular, we don't have metrics for the following cases:

  • Version can't be parsed.
  • Version is not comparable.
  • There's no version.

As a consequence, we don't know how important it is to warn users about versions that are missing or can't be compared. We can't measure the need for Highlight invalid and missing versions in Depen... (#464007).

We also can't measure whether version parsing errors improve or regress when the backend is upgraded to a new version of semver_dialects, which is used to compare a version to an affected range of versions. See Upgrade to semver_dialects v3 (#462857 - closed) for instance.

Further details

AdvisoryScanner logs version parsing errors, but it doesn't count them.

AdvisoryScanner relies on the PossiblyAffectedOccurrencesFinder, which excludes SBOM occurrences that don't have a version.

Some versions, such as dev-master (common in PHP projects), can't be compared. See Component version can't be compared to affected... (#442027).

Proposal

The first step would be to count version parsing errors and to add this to the counts object of CVS metrics.
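A minimal sketch of this first step, written in Python for illustration only. Every name here is hypothetical: `VersionError`, `version_in_range`, `scan`, the `version_errors` key, and the shape of the occurrence dicts are assumptions, not the actual AdvisoryScanner or semver_dialects API. The toy comparator only understands dotted integer versions, so anything else (for example dev-master) is counted as an error instead of being silently skipped.

```python
from collections import Counter


class VersionError(Exception):
    """Hypothetical stand-in for a version that can't be parsed or compared."""


def version_in_range(version, affected_range):
    """Toy comparator: dotted integer versions against an inclusive (low, high) range."""
    try:
        parsed = tuple(int(part) for part in version.split("."))
    except ValueError as exc:
        raise VersionError(f"cannot parse {version!r}") from exc
    low, high = affected_range
    return low <= parsed <= high


def scan(occurrences, affected_range):
    """Count possibly affected, known affected, and erroring occurrences."""
    counts = Counter()
    for occurrence in occurrences:
        # Every occurrence passed in already matches the affected package.
        counts["possibly_affected_sbom_occurrences"] += 1
        try:
            if version_in_range(occurrence["version"], affected_range):
                counts["known_affected_sbom_occurrences"] += 1
        except VersionError:
            # Count the error rather than only logging it.
            counts["version_errors"] += 1
    return counts
```

Occurrences that have no version at all are not covered by this sketch, since they are excluded before the comparison step.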

Implementation

This has been implemented in Add metric to track when CVS cannot scan a comp... (!166928 - merged).

NOTE: The new counter is not yet displayed in any Tableau dashboard. We need to expose it there, and trigger an alert when it exceeds a threshold.
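The threshold check could look roughly like the sketch below, again in Python for illustration only. The function names, the `version_errors` key, and the 1% default threshold are all assumptions; the issue does not specify a threshold or where the alert would be evaluated.

```python
def version_error_rate(counts):
    """Share of possibly affected SBOM occurrences that hit a version error."""
    possibly_affected = counts.get("possibly_affected_sbom_occurrences", 0)
    if possibly_affected == 0:
        # Nothing was scanned, so there is no meaningful rate to report.
        return 0.0
    return counts.get("version_errors", 0) / possibly_affected


def should_alert(counts, threshold=0.01):
    """Return True when the error rate exceeds the (hypothetical) threshold."""
    return version_error_rate(counts) > threshold
```

Expressing the counter as a rate rather than an absolute number keeps the alert stable when the volume of scanned occurrences changes.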

Results

It turns out that SemverDialects errors are negligible, so there's no need to distinguish between the various errors at this point.

See #465865 (comment 2150566045)

Over the past 7 days, only 0.01% of possibly_affected_sbom_occurrences encountered a SemverDialects error. So I guess what we have for now is good enough to observe the feature?

/cc @thiagocsf @hacks4oats @johncrowley

Edited by Fabien Catteau