Cluster Image Scanning (Vulnerability Scans against Running Containers)
### Release notes
### Problem to solve
Although customers are able to do container scanning as part of their pipeline jobs today, there is no guarantee that the images for the containers that are running in production have been scanned recently. Some customers have production container images that were deployed several years ago and have not been updated. Users need a way to regularly re-scan the container images that are actually running in production so they can understand their current security risk.
### Intended users
Primary Personas:
* [Devon (DevOps Engineer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#devon-devops-engineer)
* [Delaney (Development Team Lead)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#delaney-development-team-lead)
* [Cameron (Compliance Manager)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#cameron-compliance-manager)
Secondary Personas:
* [Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)
* [Sam (Security Analyst)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sam-security-analyst)
* [Alex (Security Operations Engineer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#alex-security-operations-engineer)
### User experience goal
### Proposal
1. Users will be able to schedule regular container scans of the images that were used to initialize the containers running in their production environment.
1. The container scan will identify known vulnerabilities (CVEs) in the OS and in the packages that are installed on those images.
1. The scan findings will be displayed on the Vulnerability Report under a new `Operational Vulnerabilities` tab.
1. Users will be able to filter the operational vulnerability results by which Kubernetes cluster those results came from. The "cluster" will map back to clusters connected via GitLab Kubernetes Agents.
1. The scanner will not require additional credentials beyond what is already collected when connecting a Kubernetes cluster to GitLab.
1. Users will be able to view vulnerabilities related to a specific cluster in a new `Security` tab when viewing agent-connected clusters. This tab will not be available for certificate-connected clusters.
**Note:** These requirements have been updated from the original list to remove support for Kubernetes clusters connected via the certificate method given that connection method has been deprecated.
#### Designs (please see [Design issue](https://gitlab.com/gitlab-org/gitlab/-/issues/219173) for more details)
- 🎨 [Figma file](https://www.figma.com/file/w1foPwswNuOKWgi2IvLHZm/Running-container-vulnerabilities?node-id=175%3A0)
- 📽 [Video walkthrough](https://www.loom.com/share/904a1b795833416b9e251cea68ddcab7)
- 🎟 [Design issue](https://gitlab.com/gitlab-org/gitlab/-/issues/219173)
**Vulnerability report** (design screens, see the Figma file above)
- `Development vulnerabilities` (this is the existing vulnerability report)
- `Operational vulnerabilities`
- Empty state (no policies)
- Empty state (no vulnerabilities)
**Cluster detail page** (design screens, see the Figma file above)
- Agent managed cluster (security tab)
- Agent managed cluster (access tokens tab)
### Further details
### Implementation Details
#### 1. Iteration 1 - add ability to start Cluster Image Scanning job
Requirements:
* the user can get vulnerability reports from a cluster where Starboard Operator is configured,
* the user can provide an additional token in the Kubernetes cluster settings in GitLab; this token belongs to the `vulnerability-viewer` service account (with `get`/`list` permissions on `vulnerabilityreports.aquasecurity.github.io`),
* the documentation is updated with information on how to install and use Starboard Operator with GitLab.

How can we achieve that?
1. Implement a new template (`lib/gitlab/ci/templates/Security/Cluster-Image-Scanning.gitlab-ci.yml`) that will be responsible for performing the scan.
1. Implement a new analyzer (like https://gitlab.com/gitlab-org/security-products/analyzers/cluster-image-scanning) to get the results from the cluster.
1. Update the documentation with additional information about cluster image scanning.
1. Extend the `Enums::Vulnerability::REPORT_TYPES` const with a new report type, `cluster_image_scanning`.
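
As a rough illustration of step 2, the core job of the analyzer is to turn the `VulnerabilityReport` resources produced by Starboard Operator into findings of type `cluster_image_scanning`. The Python sketch below makes several assumptions: the Starboard field names (`report.artifact`, `report.vulnerabilities`, `vulnerabilityID`, and so on) are believed to match the `v1alpha1` resource but should be checked against the installed CRD, the output is only loosely shaped like the `vulnerabilities` entries of a GitLab security report, and `convert_report` and `SEVERITY_MAP` are hypothetical names, not the actual analyzer code.

```python
import json

# Assumed mapping from Starboard severity strings to GitLab severity names.
SEVERITY_MAP = {
    "CRITICAL": "Critical",
    "HIGH": "High",
    "MEDIUM": "Medium",
    "LOW": "Low",
    "UNKNOWN": "Unknown",
}


def convert_report(vulnerability_report):
    """Convert one Starboard VulnerabilityReport (as a dict) into a list of findings."""
    report = vulnerability_report.get("report", {})
    artifact = report.get("artifact", {})
    image = "{}:{}".format(artifact.get("repository", ""), artifact.get("tag", ""))

    findings = []
    for vuln in report.get("vulnerabilities", []):
        cve = vuln.get("vulnerabilityID", "")
        package = vuln.get("resource", "")
        fixed = vuln.get("fixedVersion", "")
        findings.append({
            "category": "cluster_image_scanning",
            "name": cve,
            "message": "{} in {}".format(cve, package),
            "severity": SEVERITY_MAP.get(vuln.get("severity", "UNKNOWN"), "Unknown"),
            # Only suggest an upgrade when the report names a fixed version.
            "solution": "Upgrade {} to {}".format(package, fixed) if fixed else "",
            "location": {
                "image": image,
                "dependency": {
                    "package": {"name": package},
                    "version": vuln.get("installedVersion", ""),
                },
            },
        })
    return findings


if __name__ == "__main__":
    # Fictitious report used only to exercise the conversion.
    sample = {
        "report": {
            "artifact": {"repository": "library/nginx", "tag": "1.19"},
            "vulnerabilities": [{
                "vulnerabilityID": "CVE-2099-0001",
                "resource": "openssl",
                "installedVersion": "1.1.1d",
                "fixedVersion": "1.1.1k",
                "severity": "HIGH",
            }],
        }
    }
    print(json.dumps(convert_report(sample), indent=2))
```

Keeping the conversion a pure function of the report makes it easy to unit test without a cluster. Fetching the resources with the `vulnerability-viewer` token is left out of the sketch.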
#### 2. Iteration 2 - add ability to schedule Cluster Image Scanning job periodically
Requirements:
* the user can schedule a Cluster Image Scanning job using Scheduled Scan Execution Policies,
* the documentation is updated with information on how to configure a Scheduled Scan Execution Policy to start a Cluster Image Scanning job.

How can we achieve that?
1. Extend the service responsible for scheduling security jobs (implemented in https://gitlab.com/gitlab-org/gitlab/-/issues/325230) with the ability to schedule Cluster Image Scanning jobs.
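
To make the scheduling step more concrete, the sketch below shows how a scheduler could pick out the policies that request a periodic cluster image scan. It is written in Python for illustration only. The policy structure (`scan_execution_policy`, `rules` with `type: schedule` and a cron `cadence`, `actions` with `scan: cluster_image_scanning`) is an assumption modelled on the scan execution policy format, and `scheduled_cluster_scans` is a hypothetical helper, not the service from the linked issue.

```python
# Hypothetical policies, expressed as the dict a YAML policy file might parse into.
POLICIES = {
    "scan_execution_policy": [
        {
            "name": "Nightly cluster image scan",
            "enabled": True,
            "rules": [{"type": "schedule", "cadence": "0 2 * * *"}],
            "actions": [{"scan": "cluster_image_scanning"}],
        },
        {
            "name": "Pipeline DAST",
            "enabled": True,
            "rules": [{"type": "pipeline", "branches": ["main"]}],
            "actions": [{"scan": "dast"}],
        },
        {
            "name": "Disabled weekly scan",
            "enabled": False,
            "rules": [{"type": "schedule", "cadence": "0 3 * * 0"}],
            "actions": [{"scan": "cluster_image_scanning"}],
        },
    ]
}


def scheduled_cluster_scans(policies):
    """Return (policy name, cadence) pairs for enabled policies that schedule a cluster image scan."""
    result = []
    for policy in policies.get("scan_execution_policy", []):
        if not policy.get("enabled", False):
            continue
        scans = {action.get("scan") for action in policy.get("actions", [])}
        if "cluster_image_scanning" not in scans:
            continue
        for rule in policy.get("rules", []):
            # Only schedule rules carry a cadence; pipeline rules are ignored here.
            if rule.get("type") == "schedule":
                result.append((policy["name"], rule.get("cadence")))
    return result


if __name__ == "__main__":
    for name, cadence in scheduled_cluster_scans(POLICIES):
        print("{}: {}".format(name, cadence))
```

Evaluating the cron expression and creating the pipeline are deliberately left out; the sketch only covers selecting which policies apply.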
#### 3. Iteration 3 - use Kubernetes Agent to fetch results
**Note:** Before doing this iteration, we need to decide where to put the vulnerabilities that come from the Kubernetes Agent. The previous iterations reuse the current mechanism (a successful pipeline on the default branch creates vulnerabilities in the database from the scanner JSON report). In this iteration we cannot use that mechanism, so we need to detach vulnerabilities from the pipeline and store them differently, both on the UI and on the backend side. We are currently working on the design for that: https://gitlab.com/gitlab-org/gitlab/-/issues/219173/
Requirements:
* the user can fetch vulnerabilities from Starboard Operator through the Kubernetes Agent,
* the user sees vulnerabilities found in running containers in the Security Dashboard in the Container Tab (https://gitlab.com/gitlab-org/gitlab/-/issues/219173/).

How can we achieve that?
1. Extend `kas` to support reading vulnerabilities from Starboard Operator, similar to what we did with Cilium Alerts (https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/merge_requests/211). This requires adding a new `ClusterRole` that can `list`/`get` `vulnerabilityreports.aquasecurity.github.io`.
1. Extend `API::Internal::Kubernetes` (https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/ee/api/internal/kubernetes.rb#L12) to support creating vulnerabilities for related projects.
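
The `ClusterRole` mentioned in step 1 could look roughly like the manifest built below. This is a minimal Python sketch that emits the manifest as JSON: the API group, resource, and verbs come from the requirement above, while the role name `gitlab-agent-vulnerability-reader` and the helper `vulnerability_reader_cluster_role` are hypothetical. Binding the role to the agent's service account is not shown.

```python
import json


def vulnerability_reader_cluster_role(name="gitlab-agent-vulnerability-reader"):
    """Build a ClusterRole manifest granting read access to Starboard vulnerability reports.

    The default role name is a placeholder chosen for this example.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {
                "apiGroups": ["aquasecurity.github.io"],
                "resources": ["vulnerabilityreports"],
                "verbs": ["get", "list"],
            }
        ],
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so this output can be piped to `kubectl apply -f -`.
    print(json.dumps(vulnerability_reader_cluster_role(), indent=2))
```

Limiting the rule to `get` and `list` on a single resource keeps the agent's additional access read-only and narrowly scoped.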
### Permissions and Security
There will be no change to current permission levels for the Security Dashboard or the Vulnerability Report.
### Documentation
1. Documentation will be added to describe how to start using this feature and how to schedule a container scan against a production environment.
1. Existing documentation about the [Security Dashboard and Vulnerability Report](https://docs.gitlab.com/ee/user/application_security/security_dashboard/) will be edited to note that findings from Container Scans against production environments will be displayed.
### Availability & Testing
1. Tests will be performed to verify that this feature continues to work when users have enabled Container Network Security and Container Host Security with the default settings.
1. Tests will be performed to assess the performance impact of running a scan against a cluster.
1. Tests will be performed to verify that running a scan does not interfere with or prevent the production application from running and servicing requests for the duration of the scan.
### What does success look like, and how can we measure that?
### What is the type of buyer?
Use of the scans will be available down to ~"GitLab Core".
Viewing the scan findings in the Security Dashboard and Vulnerability Report will be limited to ~"GitLab Ultimate".
### Is this a cross-stage feature?
### Links / references
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
*This page may contain information related to upcoming products, features and functionality.
It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes.
Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.*
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->