Add DB Schema changes for ingesting OS package advisories
Goal
Prepare the license-db to ingest OS package advisories. The source of the advisories is the trivy-db-glad.
Related links
Deliverable
- Add the required migration scripts to the schema project
- Deploy the migration scripts on dev and prod
Requirements
- Store raw data in a single
jsonbcolumn. - Extract values needed to support UPSERTs, and store them in separate DB columns. These values are NOT removed from the raw JSON document. This is for indexing only.
- Process the raw data during the export.
Pros & cons
- The ingestion is simple.
- The export can evolve and leverage more fields w/o re-ingesting anything. For instance, in the future we could export YAML fields of the GLAD advisories we haven't leveraged so far. We wouldn't need to re-ingest any YAML file.
- This makes the export complex, relative to the ingestion.
Relates to
#422869 (comment 1522730005)\+
Implementation Plan
We need to introduce the following tables
| Table | trivy_db_advisory |
|---|---|
| id | BIGSERIAL (PK) |
| cve_id | text |
| last_updated | timestamp |
| advisory | jsonb |
| Table | trivy_db_package |
|---|---|
| id | BIGSERIAL (PK) |
| distro_name | string |
| distro_version | string |
| package_name | string |
| advisory_id | BIGSERIAL (FK) |
| node_value | jsonb |
| last_updated | timestamp |
trivy_db_advisory
This table contains all the advisories defined by the vulnerability bucket in the trivy-db.
-
id: is a PK. Just a sequential big int -
cve_id: It will contain the vulnerability identifier as a string -
last_updated: a timestamp that will be updated on any modification -
advisory: a jsonb column that contains the vulnerability from the trivy-db as is. Without any modifications.
trivy_db_package
This table contains mapping of trivy-db vulnerabilities to OS packages. It also contains information about fixed versions.
-
id: A sequential big int PK -
distro_name: the name of the distro. For examplealpine -
distro_version: the version of the distro. For example3.1. -
package_name: the name of the package. For examplecurl. -
advisory_id: the id of the advisory. This field is a FK. -
node_value: This value contains a json object containing one or more fields of the trivy-db Advisory struct. Usually OS packages have theFixedVersionfield which is what we intent to support in the first iteration.
Unique constrain: the combination of distro_name, distro_version, package_name and advisory_id.
Tasks
-
Introduce a migration script that adds these tables in the schema project. -
Deploy on dev -
Deploy on prod
Edited by Nick Ilieskou