Continuous vulnerability scans
## Problem to Solve
Today, identifying new vulnerabilities requires users to run new jobs to rescan projects. This creates unnecessary steps and delays when users need to quickly understand whether they are impacted by a specific vulnerability. This is especially problematic when critical zero-day vulnerabilities, such as [Log4Shell](https://en.wikipedia.org/wiki/Log4Shell) (log4j), need to be addressed.
Additionally, customers want to ensure that they are addressing critical dependency-related vulnerabilities in all of their projects, whether the projects are actively being maintained or not. This currently requires customers to run scheduled scans at a predefined interval, which leads to delays in detecting new vulnerabilities and creates a pileup of issues developers and security teams will have to address.
## Overview
Give users the ability to detect new dependency-related vulnerabilities that already exist in their projects, without having to run new pipeline scans. By rearchitecting dependency scanning, we will share as many steps as possible across scanners, which will make customers aware of dependency-related vulnerabilities more quickly. This work will enable us to start performing "cheap" vulnerability scans for Container Scanning and Dependency Scanning.
At a high level, the scanning process is:
1. Create a list of installed software (that is, a dependency list, or SBOM).
2. For each item in the list above, compare the dependency name and version against the advisory database.
3. Generate vulnerability findings to be ingested in the Vulnerability Management system.
Step 1 is very specific to the target being scanned (language/package manager), and we can use a variety of scanners to perform it. Steps 2 and 3 are generic. Regardless of the mechanism used to produce the SBOM, the process to identify known vulnerabilities is the same, and boils down to little more than string comparisons. Creating findings is also similar for all supported languages.
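The generic steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Component`, `Advisory`, and `Finding` types, the package name, and the advisory identifier are all hypothetical, and an advisory lists affected versions as an explicit set, whereas a real advisory database would typically express version ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One SBOM entry, as produced by step 1 (hypothetical shape)."""
    name: str
    version: str


@dataclass(frozen=True)
class Advisory:
    """One advisory database entry (hypothetical shape)."""
    identifier: str
    package: str
    affected_versions: frozenset  # explicit versions, for simplicity


@dataclass(frozen=True)
class Finding:
    """One vulnerability finding, as produced by step 3."""
    advisory_id: str
    package: str
    version: str


def scan(sbom, advisories):
    """Step 2: compare name and version; step 3: emit findings."""
    # Index advisories by package name so each component is a dict lookup.
    by_package = {}
    for advisory in advisories:
        by_package.setdefault(advisory.package, []).append(advisory)

    findings = []
    for component in sbom:
        for advisory in by_package.get(component.name, []):
            if component.version in advisory.affected_versions:
                findings.append(
                    Finding(advisory.identifier, component.name, component.version)
                )
    return findings


# Illustrative data only: names and identifiers are made up.
sbom = [Component("example-logger", "2.14.1"), Component("example-http", "1.0.0")]
advisories = [
    Advisory("ADV-0001", "example-logger", frozenset({"2.14.0", "2.14.1"})),
]
findings = scan(sbom, advisories)
```

Because `scan` only depends on the SBOM entries and the advisories, it does not matter which scanner produced the SBOM.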
## Intended Users
* [Delaney (Development Team Lead)](https://about.gitlab.com/handbook/product/personas/#delaney-development-team-lead)
* [Sasha (Software Developer)](https://about.gitlab.com/handbook/product/personas/#sasha-software-developer)
* [Dakota (Application Development Director)](https://about.gitlab.com/handbook/product/personas/#dakota-application-development-director)
* [Amy (Application Security Engineer)](https://about.gitlab.com/handbook/product/personas/#amy-application-security-engineer)
* [Alex (Security Operations Engineer)](https://about.gitlab.com/handbook/product/personas/#alex-security-operations-engineer)
## Advantages
1. Consistent security reports across different analyzers. Because the advisory database being used is the same, a match against it will necessarily be the same regardless of what scanner produced the input SBOM.
2. Near real-time updates. Given that GitLab has a copy of the SBOM and of the advisory DB, we can trigger a "scan" as soon as either input changes. The "scan" in this case is simply the comparison process between SBOM entries and the advisory DB.
3. More efficient scans. If we keep track of the changes since the last report was produced, the comparison process needs to consider only what has changed in the SBOM and/or in the advisory DB. Instead of checking hundreds of entries every time, we should be able to check only a few.
4. Progress beyond CI jobs. None of the steps in the process necessarily needs to run as a CI job. As long as the SBOM and the advisory DB are regularly updated, the scan process (that is, the SBOM/advisory comparison) could run anywhere - for example, in a Sidekiq job.
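To illustrate the efficiency argument in point 3, here is a rough sketch of how tracking changes narrows the work. The function name and the dictionary shapes are assumptions made for this example; it only selects which (package, advisory) pairs need re-checking and leaves the version comparison itself out.

```python
def incremental_pairs(components, advisories, changed_components, changed_advisories):
    """Return the (package, advisory_id) pairs that need re-checking.

    components: dict of package name -> version (the current SBOM)
    advisories: dict of advisory id -> package name (the current advisory DB)
    changed_components: package names added or updated since the last report
    changed_advisories: advisory ids added or updated since the last report
    """
    to_check = set()

    # A new or updated SBOM entry is checked against every advisory for it.
    for advisory_id, package in advisories.items():
        if package in changed_components and package in components:
            to_check.add((package, advisory_id))

    # A new or updated advisory is checked against the matching SBOM entry, if any.
    for advisory_id in changed_advisories:
        package = advisories.get(advisory_id)
        if package in components:
            to_check.add((package, advisory_id))

    return to_check


# Illustrative data only.
components = {"a": "1.0", "b": "2.0", "c": "3.0"}
advisories = {"ADV-1": "a", "ADV-2": "b", "ADV-3": "c", "ADV-4": "z"}
pairs = incremental_pairs(
    components,
    advisories,
    changed_components={"a"},
    changed_advisories={"ADV-3", "ADV-4"},
)
```

In this example a full comparison would revisit three matching pairs, while the incremental pass selects only the two touched by a change; the unchanged pair for `b` is skipped, and `ADV-4` is ignored because its package is not in the SBOM.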
## Proposal
The work to deliver these reactive scans can be neatly divided into these parts:
* [x] [Ingest SBOM reports](https://gitlab.com/groups/gitlab-org/-/epics/8024 "Ingest SBOM reports"). We need the SBOM information for each project to be available in a database. Reading and parsing JSON, XML, or any other file-based format is slow. Our first choice is to use PostgreSQL, since it's the data store that's available to us, but we need to do capacity planning to ensure it's viable.
**CVS for Dependency Scanning**
* [x] [Ingest advisory databases](https://gitlab.com/groups/gitlab-org/-/epics/8025 "Ingest Dependency Scanning advisories"). Counterpart to the above, this ensures that we have a single source of truth (SSOT) for security advisories.
* [x] [Trigger Dependency Scanning Update on Advisory DB change](https://gitlab.com/groups/gitlab-org/-/epics/9534 "Dependency Scanning: CVS Trigger scans on Advisory DB changes"). Finally, once the SBOM and advisory information is available in a database, we are able to respond to any changes by processing them and generating new reports for the affected projects. This was released in an experimental state in 16.4 (an administrator must toggle it on from the Security Configuration page).
* [x] [Generally Available (GA) support for Continuous Vulnerability Scans](https://gitlab.com/groups/gitlab-org/-/epics/11474 "Generally Available (GA) support for Continuous Vulnerability Scans"). Once complete, we will enable CVS by default for all Ultimate customers.
* [ ] [Trigger Dependency Scanning Update on SBOM change](https://gitlab.com/groups/gitlab-org/-/epics/8026 "Dependency Scanning: CVS Trigger vulnerability scans on SBOM changes")
**CVS for Container Scanning**
* [x] [Ingest OS package advisory data](https://gitlab.com/groups/gitlab-org/-/epics/10109 "Collect OS package advisory data into the external license DB"). Counterpart to the above, this ensures that we have a single source of truth (SSOT) for security advisories.
* [ ] [Trigger Container Scanning Update on Advisory DB change](https://gitlab.com/groups/gitlab-org/-/epics/9532 "Container Scanning: CVS Trigger scans on Trivy DB changes"). Expected to be enabled by default in 16.7.
* [ ] [Trigger Container Scanning Update on SBOM change](https://gitlab.com/groups/gitlab-org/-/epics/11219 "Container Scanning: CVS Trigger Vulnerability scans on SBOM ingestion"). Expected to be enabled by default in 16.8.
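As a rough illustration of why having SBOM data in a database makes the advisory-change triggers above cheap, the sketch below uses an in-memory SQLite database as a stand-in for the real data store. The `sbom_components` table, its columns, and the `projects_affected_by` helper are hypothetical and do not reflect the actual GitLab schema; affected versions are again an explicit set rather than a version range.

```python
import sqlite3

# In-memory stand-in for the database holding ingested SBOM reports.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sbom_components (project_id INTEGER, name TEXT, version TEXT)"
)
# An index on the package name keeps the per-advisory lookup cheap.
conn.execute("CREATE INDEX idx_components_name ON sbom_components (name)")
conn.executemany(
    "INSERT INTO sbom_components VALUES (?, ?, ?)",
    [
        (1, "example-logger", "2.14.1"),
        (2, "example-logger", "2.17.0"),
        (3, "example-http", "1.0.0"),
    ],
)


def projects_affected_by(conn, package, affected_versions):
    """Return sorted ids of projects whose SBOM has an affected version of package."""
    rows = conn.execute(
        "SELECT project_id, version FROM sbom_components WHERE name = ?",
        (package,),
    ).fetchall()
    return sorted({pid for pid, version in rows if version in affected_versions})


# Simulates reacting to a newly ingested advisory for "example-logger".
affected = projects_affected_by(conn, "example-logger", {"2.14.0", "2.14.1"})
```

Only the projects returned by the lookup would need new findings generated; no pipeline has to run for the others.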
### Proposed architecture (high level concept)
[Source](https://docs.google.com/presentation/d/1s-DbpXUQto7iNg1qR86WlC0gkwM8zu1k4ClrmnMP-Kg/edit#slide=id.g12a56d0b3ac_0_0) (internal link)

_This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc._
## As part of closing the epic
1. Share the feature availability in [this forum topic](https://forum.gitlab.com/t/grype-container-scanning-wont-fill-out-gl-dependency-scanning-report-json/70980/5?u=thiagocsf).
## Engineering DRI
@tkopel
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
*This page may contain information related to upcoming products, features and functionality.
It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes.
Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.*
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->