Update vulnerability advisory table column constraints
Why are we doing this work
The vulnerability_advisories table was modeled to import the advisories in the advisory database, and has safeguards in place to control the size of the table e.g. an advisory description cannot be longer than 2048 characters. While importing the advisory database, we noticed that some of the advisories do not fit within these constraints. Also, some of the constraints don't match the ones defined in the JSON schema for the YAML files of the advisory database.
The status of the constraints are as follows:
| DB column | DB limit | schema limit | max length | status | notes |
|---|---|---|---|---|---|
| description | 2048 | 1048576 | 5502 | could be compressed | |
| title | 2048 | 255 | 228 | ||
| component_name | 2048 | - | ? | should match sbom_components.name
|
|
| solution | 2048 | - | ? | can be generated | |
| not_impacted | 2048 | - | ? | can be generated | |
| cvss_v2 | 128 | (pattern) | ? | ||
| cvss_v3 | 128 | (pattern) | ? | ||
| affected_range | 32 | - | 369 |
Additionally, the advisory database uses characters that are not UTF-8 compliant, for example a null byte, and this causes the ingestion of the advisor to fail. We should update the constraints of the advisory table used in Advisories stored in vulnerability_advisories a... (#375302 - closed) so that it handles these edge cases.
Relevant links
For a more detailed analysis see #371046 (comment 1347512212).
Non-functional requirements
-
Documentation: -
Feature flag: -
Performance: -
Testing: Verify if importing the advisory database produces no errors. The glad-to-csv can be used to produce an import script.
Implementation plan
-
Update the titlelimit to255 -
Update the descriptionlimit to8192. See #406596 (comment 1348751418). -
Update the component_namelimit to255 -
Update the affected_rangelimit to512 -
Update the linkscolumn type tojsonbto match gemnasium-db's JSON/YAML schema. -
Remove the urlscolumn to deduplicate data inlinkscolumn. -
Escape null bytes and other characters that do not comply with UTF-8 encoding. The null-bytes can be grepped via ripgrep like so rg '\\0'.
Verification steps
TODO