Update vulnerability advisory table column constraints

Why are we doing this work

The vulnerability_advisories table was modeled to import the advisories in the advisory database, and has safeguards in place to control the size of the table e.g. an advisory description cannot be longer than 2048 characters. While importing the advisory database, we noticed that some of the advisories do not fit within these constraints. Also, some of the constraints don't match the ones defined in the JSON schema for the YAML files of the advisory database.

The status of the constraints are as follows:

DB column DB limit schema limit max length status notes
description 2048 1048576 5502 could be compressed
title 2048 255 228
component_name 2048 - ? should match sbom_components.name
solution 2048 - ? can be generated
not_impacted 2048 - ? can be generated
cvss_v2 128 (pattern) ?
cvss_v3 128 (pattern) ?
affected_range 32 - 369

Additionally, the advisory database uses characters that are not UTF-8 compliant, for example a null byte, and this causes the ingestion of the advisor to fail. We should update the constraints of the advisory table used in Advisories stored in vulnerability_advisories a... (#375302 - closed) so that it handles these edge cases.

Relevant links

For a more detailed analysis see #371046 (comment 1347512212).

Non-functional requirements

  • Documentation:
  • Feature flag:
  • Performance:
  • Testing: Verify if importing the advisory database produces no errors. The glad-to-csv can be used to produce an import script.

Implementation plan

  • Update the title limit to 255
  • Update the description limit to 8192. See #406596 (comment 1348751418).
  • Update the component_name limit to 255
  • Update the affected_range limit to 512
  • Update the links column type to jsonb to match gemnasium-db's JSON/YAML schema.
  • Remove the urls column to deduplicate data in links column.
  • Escape null bytes and other characters that do not comply with UTF-8 encoding. The null-bytes can be grepped via ripgrep like so rg '\\0'.

Verification steps

TODO

Edited by Igor Frenkel