Use unified affected ranges in Gemnasium Vulnerability DB
Summary
The YAML schema for the advisories of the Gemnasium Vulnerability Database is generic, except for affected_range, the range of affected versions: the version range syntax is specific to the package manager. For instance, the affected_range of a security advisory for a Ruby gem follows the Gem Specification.
Proposal:
- Make Maven's version range syntax the only version range syntax for affected_range. This syntax is already supported, and the implementation is in Go.
- Implement version comparison in Go, for all supported version syntaxes.
This should greatly simplify Gemnasium, as well as the tools used to maintain its Vulnerability Database.
Combined, analyzers/gemnasium and gemnasium/semver contain all the code to compare versions in the syntaxes supported by GitLab Dependency Scanning, but this code needs to be revisited and integrated.
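To make the proposal concrete, here is a minimal Go sketch of checking a version against a Maven-style range. It is not the code from analyzers/gemnasium or gemnasium/semver: the names `Range`, `ParseRange`, `Compare`, and `Contains` are hypothetical, only a single interval such as `[1.2,2.0)` is handled (not comma-separated sets of intervals), and versions are assumed to be purely numeric and dot-separated, which is much simpler than the real ordering rules of any supported package manager.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Range is one Maven-style interval, e.g. "[1.2,2.0)" or "(,1.5]".
// An empty bound means the range is unbounded on that side.
type Range struct {
	Lower, Upper                   string
	LowerInclusive, UpperInclusive bool
}

// ParseRange parses a single interval. Hypothetical helper for illustration.
func ParseRange(s string) (Range, error) {
	s = strings.TrimSpace(s)
	if len(s) < 3 {
		return Range{}, fmt.Errorf("invalid range %q", s)
	}
	first, last := s[0], s[len(s)-1]
	if (first != '[' && first != '(') || (last != ']' && last != ')') {
		return Range{}, fmt.Errorf("invalid range %q", s)
	}
	r := Range{LowerInclusive: first == '[', UpperInclusive: last == ']'}
	bounds := strings.Split(s[1:len(s)-1], ",")
	switch {
	case len(bounds) == 1 && r.LowerInclusive && r.UpperInclusive:
		// "[1.0]" pins exactly one version.
		r.Lower = strings.TrimSpace(bounds[0])
		r.Upper = r.Lower
		if r.Lower == "" {
			return Range{}, fmt.Errorf("invalid range %q", s)
		}
	case len(bounds) == 2:
		r.Lower = strings.TrimSpace(bounds[0])
		r.Upper = strings.TrimSpace(bounds[1])
	default:
		return Range{}, fmt.Errorf("invalid range %q", s)
	}
	return r, nil
}

// segment returns the i-th numeric segment, or 0 when the version is shorter.
func segment(parts []string, i int) (int, error) {
	if i >= len(parts) {
		return 0, nil
	}
	return strconv.Atoi(parts[i])
}

// Compare orders two dotted numeric versions and returns -1, 0, or 1.
// Simplifying assumption: no qualifiers such as "-beta" or "rc1".
func Compare(a, b string) (int, error) {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		x, err := segment(as, i)
		if err != nil {
			return 0, err
		}
		y, err := segment(bs, i)
		if err != nil {
			return 0, err
		}
		if x < y {
			return -1, nil
		}
		if x > y {
			return 1, nil
		}
	}
	return 0, nil
}

// Contains reports whether version v lies inside the range.
func (r Range) Contains(v string) (bool, error) {
	if r.Lower != "" {
		c, err := Compare(v, r.Lower)
		if err != nil {
			return false, err
		}
		if c < 0 || (c == 0 && !r.LowerInclusive) {
			return false, nil
		}
	}
	if r.Upper != "" {
		c, err := Compare(v, r.Upper)
		if err != nil {
			return false, err
		}
		if c > 0 || (c == 0 && !r.UpperInclusive) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	r, err := ParseRange("[1.2,2.0)")
	if err != nil {
		panic(err)
	}
	// Segments compare numerically, so 1.10.3 sorts after 1.2.
	for _, v := range []string{"1.1.9", "1.2", "1.10.3", "2.0"} {
		ok, err := r.Contains(v)
		fmt.Println(v, ok, err)
	}
}
```

In this sketch the range parser knows nothing about the package manager. Only `Compare` would need a per-ecosystem variant, which is the separation the proposal relies on.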
Improvements
Dependency Scanning analyzer (Gemnasium)
- The analyzer is made of a single binary, and is easier to package, deploy, and install.
- It has fewer dependencies, and can run directly on various distributions like Alpine, Debian, or RedHat UBI. (This holds as long as the binary format is supported, and either the binary is statically linked or the system libraries needed by the CLI are provided.)
- As a result, it's easier to support additional distributions.
- There's less work to support an additional package manager that uses a different version range syntax.
- The CI pipeline is simpler, and runs faster.
- The Docker images are smaller.
- We reduce the risk of incompatibilities between a version range script and a version of the interpreter, like PHP, or a version of the package manager it uses, like PHP Composer.
- The analyzer project is simpler, and it's easier to contribute to it.
- In particular, the internal communication between Gemnasium's scanner and its version range scripts is removed.
- It's easier to run the analyzer locally, because it has fewer dependencies.
- There are fewer vulnerability findings in the analyzer projects.
- It makes it less likely that the build breaks because a system dependency is no longer available. See #364247 (comment 970280614)
Vulnerability Database (gemnasium-db)
- The tools used to feed the database are simpler.
- There's less work to support an additional package manager that uses a different version range syntax. In particular, the YAML schema and its documentation don't change.
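One way to picture why an additional package manager would need less work: if the range syntax is shared, the only per-ecosystem piece left is the version ordering. The Go sketch below assumes a registry keyed by package type. `CompareFunc`, `Register`, `Affected`, and `dottedCompare` are hypothetical names, not Gemnasium's API, and the affected range is passed as two already-parsed bounds to keep the example short.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// CompareFunc orders two versions of one package type: -1, 0, or 1.
type CompareFunc func(a, b string) int

// comparers maps a package type to its version ordering.
// The keys used below are illustrative.
var comparers = map[string]CompareFunc{}

// Register adds or replaces the ordering for a package type.
func Register(packageType string, cmp CompareFunc) {
	comparers[packageType] = cmp
}

// dottedCompare is a toy ordering for dotted numeric versions.
// Assumption: non-numeric segments count as 0 in this sketch.
func dottedCompare(a, b string) int {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		x, y := 0, 0
		if i < len(as) {
			x, _ = strconv.Atoi(as[i])
		}
		if i < len(bs) {
			y, _ = strconv.Atoi(bs[i])
		}
		if x != y {
			if x < y {
				return -1
			}
			return 1
		}
	}
	return 0
}

// Affected reports whether version is in the half-open interval
// [introduced, fixed), using the ordering registered for packageType.
func Affected(packageType, version, introduced, fixed string) (bool, error) {
	cmp, ok := comparers[packageType]
	if !ok {
		return false, fmt.Errorf("no version comparer for package type %q", packageType)
	}
	return cmp(version, introduced) >= 0 && cmp(version, fixed) < 0, nil
}

func main() {
	Register("maven", dottedCompare)
	// Supporting another ecosystem is one more registration; here the
	// only difference is a leading "v" on version strings.
	Register("go", func(a, b string) int {
		return dottedCompare(strings.TrimPrefix(a, "v"), strings.TrimPrefix(b, "v"))
	})

	fmt.Println(Affected("go", "v1.4.2", "v1.0.0", "v1.5.0"))
	fmt.Println(Affected("gem", "1.0", "0.9", "1.1")) // not registered: error
}
```

Under these assumptions, neither the range parser nor the advisory schema is touched when a package type is added, which matches the claim that the YAML schema and its documentation don't change.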
Risks
Users might run older versions of Gemnasium that don't support the unified version range syntax proposed in this issue. This is unlikely, but possible. To mitigate this, we should:
- Introduce a new field for the new unified syntax, like unified_affected_range.
- Deprecate affected_range, and communicate on its removal. In the meantime, new YAML files have both fields.
- Drop affected_range. Older versions of Gemnasium that don't process unified_affected_range will simply fail.
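During the deprecation window, an analyzer that understands both fields has to decide which one to read. A minimal Go sketch of that choice follows, assuming the new field is called unified_affected_range as suggested above. The `Advisory` struct, `SelectRange`, and the sample range strings are hypothetical, and YAML decoding is left out because it would need a third-party library.

```go
package main

import (
	"errors"
	"fmt"
)

// Advisory holds only the two fields discussed here; real advisories
// carry more. Field names mirror the YAML keys from the proposal.
type Advisory struct {
	AffectedRange        string // legacy, package-manager-specific syntax
	UnifiedAffectedRange string // proposed unified (Maven-style) syntax
}

// Syntax tells the caller which parser the returned range needs.
type Syntax string

const (
	SyntaxUnified Syntax = "unified"
	SyntaxLegacy  Syntax = "legacy"
)

// ErrNoRange is returned when an advisory carries neither field.
var ErrNoRange = errors.New("advisory has no affected range")

// SelectRange prefers the unified field and falls back to the legacy
// one, so it works on old, transitional, and new advisory files.
func SelectRange(a Advisory) (string, Syntax, error) {
	switch {
	case a.UnifiedAffectedRange != "":
		return a.UnifiedAffectedRange, SyntaxUnified, nil
	case a.AffectedRange != "":
		return a.AffectedRange, SyntaxLegacy, nil
	default:
		return "", "", ErrNoRange
	}
}

func main() {
	// Illustrative values only; the legacy string is not a real advisory.
	transitional := Advisory{
		AffectedRange:        ">=1.0 <1.5",
		UnifiedAffectedRange: "[1.0,1.5)",
	}
	legacyOnly := Advisory{AffectedRange: ">=1.0 <1.5"}

	fmt.Println(SelectRange(transitional))
	fmt.Println(SelectRange(legacyOnly))
	fmt.Println(SelectRange(Advisory{}))
}
```

Returning an explicit error for the last case makes the failure visible if an advisory carries neither field, instead of silently treating it as "no versions affected".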
Involved components
Optional: Intended side effects
Dependency Scanning runs slightly faster, because it doesn't rely on subprocesses. This is probably not significant though.
The CI pipeline of the analyzer project runs faster, because we no longer have one CI job per version range script, to run its unit tests.