Extend advisory-feeder to ingest all trivy-db data (without a cursor)
Goal
The goal is to extend the advisory-feeder with a new source of data: trivy-db, which contains advisories about OS packages. Although trivy-db is the upstream source, in practice we get our data from trivy-db-glad.
Relevant links
- #422869 (comment 1522612583)
- Extend advisory-feeder to ingest latest changes... (#423391 - closed)
- GitLab container registry API
- Trivy-db advisory structure fields
Introduction
To avoid creating a single high-weight issue, I will split the implementation into two parts:
- This issue covers ingesting all trivy-db packages and vulnerabilities. This means that we won't be comparing against the last processed trivy.db file.
- The second issue will extend the advisory-feeder to check the cursor and find the diff between the latest trivy.db file and the last processed trivy.db file.
Requirements
- The user needs to specify the source of advisory data in the advisory-feeder command. This can be done through a flag: --source=glad or --source=trivy-db.
- For this issue, --internal-bucket, which specifies the bucket that contains the cursor, will not be used.
- The user needs to specify the topic to be used. For trivy-db as a source we will introduce two new flags: --send-topic-advisories and --send-topic-os-pkg.
- We do not need to ingest application packages (aka package manager packages), so we can ignore all trivy.db buckets that contain :: in their name.
- In this first iteration we won't ingest Red Hat packages. Consequently, we also ignore the Red Hat CPE bucket.
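The two bucket-related requirements above could be captured in a small filter. The following is only a sketch: the helper name shouldIngestBucket is hypothetical, the sample bucket names are illustrative, and it assumes that the Red Hat package data lives in a bucket named exactly "Red Hat".

```go
package main

import (
	"fmt"
	"strings"
)

// skippedBuckets lists buckets that are ignored in this first iteration.
// Assumption: Red Hat package data is stored in a bucket named "Red Hat".
var skippedBuckets = map[string]bool{
	"Red Hat":     true,
	"Red Hat CPE": true,
}

// shouldIngestBucket (hypothetical helper) reports whether a top-level
// trivy.db bucket should be ingested.
func shouldIngestBucket(name string) bool {
	// Application (package manager) buckets contain "::" in their name.
	if strings.Contains(name, "::") {
		return false
	}
	return !skippedBuckets[name]
}

func main() {
	// Illustrative bucket names only; the real trivy.db may differ.
	buckets := []string{
		"alpine 3.18",
		"debian 12",
		"npm::GitHub Security Advisory Npm",
		"Red Hat",
		"Red Hat CPE",
	}
	for _, b := range buckets {
		fmt.Printf("%q ingest=%t\n", b, shouldIngestBucket(b))
	}
}
```

An exact-match set is used for the Red Hat buckets rather than a prefix check, so that an unrelated bucket whose name happens to start with "Red Hat" is not dropped silently.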
Implementation plan
- Introduce the new --source flag.
- Introduce the required pubsub topic flags for trivy-db.
- Rename the --send-topic flag to --glad-topic.
- Modify main so that we can differentiate between the two sources.
- Introduce a new package feeders/advisory/trivy-db. This package is responsible for using the GitLab Registry API to download the latest image, unzip it, read all the data into memory, and send it to the correct topic. We can use the bbolt library to read the trivy.db file.
- Introduce unit tests wherever possible.
- Verify that we publish all the information.
- Release advisory feeder with trivy-db capabilities.
- Update run_feeder.sh according to the new CLI flags.
- Update the CI job to run the feeder.
- Update the advisory feeder binary version.
- Update scheduled advisory feeder jobs so that they introduce the env var ADV_FEEDER_SOURCE=glad, both on dev and prod.
- Update Feeder testing instructions as needed.
- Update Creating scheduled pipelines for license-feeder with the new flags that have been added.
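As a rough illustration of the first four steps of the plan, flag handling could look like the sketch below. It assumes Go's standard flag package, which may not be what the advisory-feeder actually uses; the config struct and parseFlags function are hypothetical names, and the rule that each source requires its own topic flags is my reading of the requirements rather than confirmed behaviour.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// config holds the parsed CLI options (hypothetical struct).
type config struct {
	source          string
	gladTopic       string
	advisoriesTopic string
	osPkgTopic      string
}

// parseFlags parses and validates the flags named in this issue.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("advisory-feeder", flag.ContinueOnError)
	fs.StringVar(&c.source, "source", "", "advisory data source: glad or trivy-db")
	fs.StringVar(&c.gladTopic, "glad-topic", "", "pubsub topic for glad advisories")
	fs.StringVar(&c.advisoriesTopic, "send-topic-advisories", "", "pubsub topic for trivy-db advisories")
	fs.StringVar(&c.osPkgTopic, "send-topic-os-pkg", "", "pubsub topic for trivy-db OS packages")
	if err := fs.Parse(args); err != nil {
		return c, err
	}

	// Differentiate between the two sources and check their topic flags.
	switch c.source {
	case "glad":
		if c.gladTopic == "" {
			return c, errors.New("--glad-topic is required when --source=glad")
		}
	case "trivy-db":
		if c.advisoriesTopic == "" || c.osPkgTopic == "" {
			return c, errors.New("--send-topic-advisories and --send-topic-os-pkg are required when --source=trivy-db")
		}
	default:
		return c, fmt.Errorf("unknown --source %q (want glad or trivy-db)", c.source)
	}
	return c, nil
}

func main() {
	// Fixed example arguments; a real binary would pass os.Args[1:].
	cfg, err := parseFlags([]string{
		"--source=trivy-db",
		"--send-topic-advisories=example-advisories",
		"--send-topic-os-pkg=example-os-pkg",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("source=%s advisories=%s os-pkg=%s\n", cfg.source, cfg.advisoriesTopic, cfg.osPkgTopic)
}
```

Validating the source in one switch keeps the dispatch in main simple: once parseFlags succeeds, main only needs to branch on the source value to pick the matching feeder.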
Edited by Nick Ilieskou