Export of advisories fails because of unsupported identifiers (Rubygem)

Summary

Some advisories imported into License DB from the GitLab Advisory Database (GLAD) have identifiers that can't be parsed by ParseIdentifierID. As a result, the license-exporter fails to export advisories.

The advisories with unsupported identifiers are related to Ruby gems. See #415078 (comment 1436197829):

OSDDB-95668 for gem/builder
OSDDB-108899 for gem/brbackup
OSDDB-115917 for gem/bundler
SRCCLR-SID-3173 for gem/rails_admin

Further details

During ingestion, the backend requires all identifiers to have a type, a name, and a value. See https://gitlab.com/gitlab-org/gitlab/-/blob/4e7691d6287497ab6a5b81d7aa3de8836473e0bc/ee/app/validators/json_schemas/pm_advisory_identifiers.json#L49-52

Identifier objects are used to generate security reports in memory. See !121607 (diffs)

Right now Gemnasium simply skips identifiers it can't parse. See https://gitlab.com/gitlab-org/security-products/analyzers/gemnasium/-/blob/c9132154b081ea86228861d85498b10776ef1a5d/convert/vulnerability_converter.go#L76-78

Possible fixes

See #415078 (comment 1436457339)

  • Support the unsupported identifiers. 🙂
  • Skip and log unsupported identifiers but make sure that the identifier is correctly parsed. The write function should be made a method so that it can reuse the logger of export.AdvisoryNdjsonExport.
  • Update the YAML where there are typos.
  • Parse everything, but do not set the type if it cannot be detected. In that case the backend needs to be changed accordingly. Feasibility to be checked. This is more like a long term solution.

Proposal

TBD

Implementation plan

  • Fix the typos in the affected advisory identifiers:
      • OSDDB-95668 for gem/builder should become OSVDB-95668
      • OSDDB-108899 for gem/brbackup should become OSVDB-108899
      • OSDDB-115917 for gem/bundler should become OSVDB-115917
  • When we parse identifiers, we do the following:
      • First, we parse the identifiers field of the GladAdvisory. If there are multiple identifiers, at least one needs to be parsed correctly. If there are multiple identifiers and only some of them fail, we print a warning and continue. If the identifiers field has only one identifier and it fails, the exporter fails.
      • Then we continue with the cwe_ids identifiers. All of them can fail, in which case we only print a warning.
      • If a non-standard identifier is parsed, we return an Identifier object that has neither a url nor a value. The requirement for this case is that the identifier contains at least one hyphen; if it does not, parsing should fail. We use the first part of the identifier (before the hyphen) as the type.
  • Deploy a new version on dev and prod
  • A warning should be printed in the exporter in case identifiers have not been parsed correctly
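
The parsing rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exporter's actual code: the Identifier struct, the knownTypes set, and the functions parseIdentifier, parseIdentifiers, and parseCWEIDs are hypothetical names. The sketch also assumes that "the first part" means everything before the first hyphen, that a standard identifier uses its full string as the value, and that an empty identifiers field is not an error.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Identifier is a hypothetical stand-in for the exporter's identifier type.
type Identifier struct {
	Type  string
	Name  string
	Value string
	URL   string
}

// knownTypes is an assumed set of prefixes treated as "standard" in this sketch.
var knownTypes = map[string]bool{"CVE": true, "GHSA": true, "OSVDB": true, "CWE": true}

// parseIdentifier parses one raw identifier. An identifier without a hyphen
// fails. A non-standard identifier with a hyphen is accepted, with the part
// before the first hyphen as its type and with no value and no URL.
func parseIdentifier(raw string) (Identifier, error) {
	prefix, rest, found := strings.Cut(raw, "-")
	if !found || prefix == "" || rest == "" {
		return Identifier{}, fmt.Errorf("unsupported identifier %q", raw)
	}
	id := Identifier{Type: prefix, Name: raw}
	if knownTypes[prefix] {
		// Assumption: standard identifiers carry the full string as value.
		// URL generation is left out of this sketch.
		id.Value = raw
	}
	return id, nil
}

// parseIdentifiers applies the rule for the identifiers field: failures
// become warnings, but the call fails if no identifier could be parsed.
func parseIdentifiers(raws []string) ([]Identifier, []string, error) {
	var ids []Identifier
	var warnings []string
	for _, raw := range raws {
		id, err := parseIdentifier(raw)
		if err != nil {
			warnings = append(warnings, err.Error())
			continue
		}
		ids = append(ids, id)
	}
	if len(raws) > 0 && len(ids) == 0 {
		return nil, warnings, errors.New("no identifier could be parsed")
	}
	return ids, warnings, nil
}

// parseCWEIDs applies the rule for cwe_ids: every entry may fail, and
// failures only produce warnings.
func parseCWEIDs(raws []string) ([]Identifier, []string) {
	var ids []Identifier
	var warnings []string
	for _, raw := range raws {
		id, err := parseIdentifier(raw)
		if err != nil {
			warnings = append(warnings, err.Error())
			continue
		}
		ids = append(ids, id)
	}
	return ids, warnings
}
```

With these rules, SRCCLR-SID-3173 would be accepted as a non-standard identifier of type SRCCLR, while an identifier without any hyphen would be rejected. The caller would write the returned warnings to the exporter's logger.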
Edited by Nick Ilieskou