Auto Remediation vision (#759)

## Executive summary

This document outlines GitLab’s approach to improving the user experience of triaging and remediating open-source vulnerabilities through automation. It covers market needs, problems to be solved, competitive analysis, functionality, and user experience goals.

### Auto-remediation

Auto-remediation, in the context of software composition analysis (SCA), is the automated process of identifying and resolving security vulnerabilities for a software application without human intervention.

### Problems to be solved

Composition analysis scans create a great deal of noise. Users are bombarded with a high number of CVEs that require triage and remediation. In 2024 alone there were over 40,000 CVEs disclosed through the National Vulnerability Database. This causes many problems for organizations: a lack of focus on the CVEs that actually matter, a build-up of security debt, failed audits, blocked FedRAMP ATOs, and a reduced focus on feature delivery. Whether teams are manually keeping dependencies up to date or clearing vulnerabilities, these time-consuming tasks can impede engineering productivity or put customer applications at risk.

See [User Journey Map](https://www.figma.com/board/1RLcEQL0oAMEpKSBifoH7U/CA---Auto-remediation-User-Journey?node-id=0-1&p=f&t=9IrAiYVUdTFlIjDw-0) for an overview of the workflows required to remediate open-source vulnerabilities.

**Delays in remediation time**: users must triage vulnerabilities, which involves understanding upgrade options, potential breaking changes, version conflict resolution, and actually applying the change. This could spiral if there is a complex dependency chain, with transitive dependencies also requiring upgrade. These delays in remediation time also increase security costs, through lost productivity of developer and AppSec personas who need to go through the triage process, patch creation, dependency resolution, and the MR approval process.
Auto-remediation allows developers to focus on feature delivery and AppSec to focus on other aspects of their job.

**Human-initiated remediation is reactive**: CVEs can be ignored and dependency versions can get further out of date. People have other priorities in their day-to-day jobs: they are trying to ship features and can only handle so many CVE remediations. By introducing auto-remediation we will introduce proactive security that reduces security debt and frees up developers to focus on feature delivery.

**Breaking changes are painful for users**: A simple version increment can lead to breaking changes that are difficult to mitigate and eat up a developer’s time.

### Strategy and Themes

GitLab's vision for Auto Remediation is to increase engineering productivity and turn security findings into action, helping both small and large enterprise organizations. This vision also overlaps with enabling our [Software Supply Chain Security](https://about.gitlab.com/direction/supply-chain/) strategy.

Auto Remediation offers rich workflows that aim to:

* Streamline fixes via the merge request workflow to achieve quicker remediation without hassle.
* Offer flexible rules that match a team's workflow and risk.
* Schedule fixes that factor in signal-to-noise while reducing the cognitive load for the team.

## Requirements

### Functionality

Users who opt in to using our auto-remediation feature will have a merge request generated to increment dependency versions. This workflow will trigger immediately after the SCA scan. The scan will surface vulnerabilities, at which point we will examine the auto-remediation configuration and target specific vulnerabilities for version increment. A merge request will be generated by a bot user and will be required to flow through the merge request approval process as defined by an organization. Composition Analysis will select a safe upgrade version and we will use AI to improve this version selection and resolve breaking changes.
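
As a rough illustration of the version-selection step, the sketch below picks the newest advisory-listed safe version above the current one, optionally restricted to certain bump types. The function names, the `allowed_bumps` parameter, and the plain `MAJOR.MINOR.PATCH` parsing are assumptions made for illustration. Real package ecosystems have richer version schemes, and this is not the Composition Analysis implementation.

```python
from typing import Optional

Version = tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def bump_type(current: Version, candidate: Version) -> str:
    """Classify the upgrade from current to candidate as major, minor, or patch."""
    if candidate[0] != current[0]:
        return "major"
    if candidate[1] != current[1]:
        return "minor"
    return "patch"


def select_safe_version(
    current: str,
    safe_versions: list[str],   # hypothetical: fixed versions taken from advisory data
    allowed_bumps: set[str],    # hypothetical: e.g. {"patch", "minor"}
) -> Optional[str]:
    """Return the newest safe version that is an upgrade and a permitted bump type."""
    cur = parse(current)
    candidates = [
        v
        for v in safe_versions
        if parse(v) > cur and bump_type(cur, parse(v)) in allowed_bumps
    ]
    # No candidate means the vulnerability would not qualify for auto-remediation.
    return max(candidates, key=parse) if candidates else None
```

With a current version of `1.2.3`, safe versions `1.2.5`, `1.4.0`, and `2.0.1`, and only patch and minor bumps allowed, this sketch would select `1.4.0`. When no candidate remains it returns `None`, which a caller could treat as "skip this vulnerability".
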
The merge request will include information as to how we arrived at the upgrade version that was selected. Users will be able to track auto-remediated vulnerabilities on the Vulnerability report.

#### Version Selection

We will rely on advisory data to designate the safe version range for a given vulnerability. By default we will select the most recent safe version of the dependency. If we wrap an open-source tool to achieve some of the core auto-remediation functionality, we will rely on its version module to achieve version selection.

#### AI Resolution

To improve our auto-remediation capabilities and reduce breaking changes for our users, we will use AI to identify and resolve breaking changes. If a version upgrade for a dependency is likely to introduce a breaking change, we will use AI to provide a code fix in the MR to resolve this breaking change. In cases where an AI-provided fix cannot be generated, we will move forward with the safe version increment based on the advisory data, defaulting to the most recent safe version. In cases where the organization does not subscribe to Duo, this functionality will not be applied.

#### Enabling

Auto-remediation is net-new functionality for GitLab Composition Analysis users. We should introduce this feature set through feature flags or configuration options for users. This will allow for optionality when selecting which projects should have auto-remediation enabled. The rationale here is that there are likely business-critical applications that should not leverage auto-remediation. Users should be able to turn on auto-remediation at the group level and then opt individual projects out. This prevents users from having to turn on functionality on a project-by-project basis, which would prove painful for larger customers.
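
One way to picture group-level enablement with per-project exceptions is a simple inheritance rule: an explicit project setting wins, otherwise the project inherits from its group. This is a minimal sketch under that assumption. The function names and the `None`-means-inherit convention are hypothetical and do not describe an existing GitLab setting.

```python
from typing import Optional


def auto_remediation_enabled(group_enabled: bool, project_override: Optional[bool]) -> bool:
    """Resolve the effective setting for one project.

    project_override is None when the project has no explicit setting.
    """
    if project_override is not None:
        return project_override
    return group_enabled


def enabled_projects(group_enabled: bool, overrides: dict[str, Optional[bool]]) -> list[str]:
    """Return the sorted names of projects where auto-remediation is effectively on."""
    return sorted(
        name
        for name, override in overrides.items()
        if auto_remediation_enabled(group_enabled, override)
    )
```

For example, with the group setting on and a business-critical project explicitly set to `False`, only the remaining projects would receive auto-remediation merge requests.
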
However, in the near term, if we are unable to support group-level enablement, then handling enablement on a project-by-project basis will be sufficient until a more elegant solution is rolled out.

#### Configuration options

We should allow users to configure “remediation targets” through severity levels. For example, a user could define that they only want to auto-remediate vulnerabilities that are MEDIUM severity or below, as they want humans to closely examine vulnerabilities that are HIGH or CRITICAL. This will happen through CI/CD variables.

We should also allow a user to define the version type (that is, major, minor, or patch) that is permissible for update. This gives users the flexibility to treat different projects in different ways. This will happen through CI/CD variables.

In the future there will be other remediation targets that will permit users to target vulnerabilities for remediation based on EPSS, KEV, and other package metadata that we make available. This is not a GA requirement.

#### Language support

We should have language support that is as broad as possible. Ideally we are able to reach parity with our [scanning language support](https://docs.gitlab.com/user/application_security/dependency_scanning/#supported-languages-and-package-managers). This will set us apart from some industry peers, and it will also have a larger impact on usage for our customers, ultimately solving more of their problems with vulnerability fatigue.

#### Safe version selection

The Composition Analysis team should identify a safe version to use as the upgrade path for the dependency. If there is no safe version, the dependency’s vulnerability is disqualified from auto-remediation.

The Sec-AI team will perform an experiment to attempt to optimize the version selection process with AI. The goal of their experiment will be to reduce or resolve version conflicts as well as identify and resolve breaking changes.

#### MR Creation

MR creation will happen immediately post-scan.
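
Before an MR is created, each finding would need to be checked against the configured remediation targets. The sketch below shows a severity threshold read from a CI/CD variable. The variable name `AUTO_REMEDIATION_MAX_SEVERITY`, the default of `LOW`, and the function names are all hypothetical and are not existing GitLab variables or APIs.

```python
from typing import Mapping

# Severity ranks used only for comparison in this sketch.
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

# Hypothetical variable name, not an existing GitLab CI/CD variable.
MAX_SEVERITY_VAR = "AUTO_REMEDIATION_MAX_SEVERITY"


def configured_max_severity(variables: Mapping[str, str], default: str = "LOW") -> str:
    """Read and normalize the severity threshold from a mapping of CI/CD variables."""
    value = variables.get(MAX_SEVERITY_VAR, default).strip().upper()
    if value not in SEVERITY_RANK:
        raise ValueError(f"unsupported severity threshold: {value!r}")
    return value


def is_remediation_target(severity: str, max_severity: str) -> bool:
    """Return True if a finding's severity is at or below the normalized threshold."""
    rank = SEVERITY_RANK.get(severity.strip().upper())
    if rank is None:
        # Unrecognized severities are left for human triage in this sketch.
        return False
    return rank <= SEVERITY_RANK[max_severity]
```

In a pipeline job, `os.environ` could be passed as `variables`. With the threshold set to `MEDIUM`, LOW and MEDIUM findings would qualify while HIGH and CRITICAL findings would be left for human review.
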
A user will see a merge request created by a bot user. It will be assigned to whoever initiated the pipeline run so there is accountability and merge requests do not go unreviewed. It is important that we use a bot user as the creator: we do not want to attribute these changes to a human who was only committing code and not necessarily incrementing a version to fix a vulnerability. The MR will include details about how the safe version was selected to alert the user to AI-generated upgrades.

### User experience

We need to give users the ability to easily enable, configure, and leverage auto-remediation to reduce vulnerability fatigue. To accomplish this we will focus on the following areas:

#### Enable

_Dependency: Security Platform Management_

For initial maturity level releases (experiment and beta) we will rely on a YAML file to handle enabling this feature. However, to get to GA we need to allow users to enable this feature more elegantly at the group level. This should be accomplished through the Security Inventory user interface.

#### Configure

_Dependency: Policies & Security Platform Management_

Configuration will be focused on two areas:

1. Severity levels: allows a user to create a threshold at which we do or do not auto-remediate vulnerabilities, based on the severity that has been assigned.
2. Version selection: we should allow users to define whether they want major, minor, or patch version increments. This allows for flexibility and for our features to accommodate the broad use cases across our customers.

For experiment and beta releases we will allow users to edit YAML files to set these configuration options. To mature the feature, users should be able to define a policy to achieve both of the aforementioned configurations.

#### Vulnerability report

_Dependency: Security Insights_

Immediately post-scan, vulnerabilities will be created and visible on the vulnerability report.
Users should be able to establish the set of vulnerabilities that were auto-remediated. There are two scenarios that we need to accommodate:

1. A dependency scan has run, auto-remediation has initiated and merge requests with the version increment are awaiting review / approval.
   1. In the **Activity** field we should show **Has merge request**.
2. Users will be able to identify auto-remediated vulnerabilities by filtering on the **Activity** field = **Vulnerability remediated**.

There will likely be other changes to the vulnerability report that more clearly state auto-remediation activity, though this requires cross-group input as there are likely impacts to Vuln Resolution for SAST.

#### Merge Requests

Because we are generating merge requests without human intervention we need to provide clear insight to users about what changes. Merge Requests should have the following details:

| Field | Value |
|-------|-------|
| Title | Bump \<library name\> from \<old version\> to \<new version\> |
| Description | Bumps \<library name\> from \<old version\> to \<new version\>. |
| Vulnerabilities fixed | List vulnerabilities remediated |
| Version selection | If we used AI: _We used GitLab Duo to identify a safe version to upgrade to. This version avoids breaking changes._ If we did not use AI: _This is the most recent safe version that was identified by the author of this library._ |
| Version Conflicts | Add a note at the end: _We will resolve any conflicts with this MR as long as it is not altered._ |
| Author | Security Bot |

## Maturity levels

### Experiment

* Users are able to turn on auto-remediation on a project-by-project basis
* Use Dependency Scan results to identify vulnerabilities that qualify for auto-remediation
* Focus remediation on LOW severity vulnerabilities
* Patch & Minor version remediation ONLY. No major versions.
* Establish a proposed upgrade path (non-AI solution)
* Leverage AI-experiment upgrade path to prevent breaking changes & perform version resolution
* Generate MRs with the update to manifest files
* MR is generated by a bot user
* MR will be generated post-scan and require no human input outside of MR approval policies defined by an organization
* Metric instrumentation: Auto-remediated MRs accepted/rejected - tranche by severity & package manager

#### Experiment Dependencies

* Composition Analysis
* Sec-AI Experiment team

### Beta

* Configuration variable for multiple levels of severity
* Configuration variable for version target (Major, Minor, Patch)
* Allow users to group dependency version bumps into one MR
  * This should be configurable
* UI: when a merge request has been created by auto-remediation, vulnerabilities that have been auto-remediated should have the Status field altered to **Confirmed** and the Activity field should show **Has Merge Request**
* Metric instrumentation: auto-remediated MRs accepted/rejected by severity and upgrade version

#### Beta Dependencies

* Composition Analysis
* Security Insights

### General Availability

* Complete metric instrumentation
* Incorporate feedback from Beta to make general improvements
* UI-based administration to turn auto-remediation on at the Group level and have that cascade to all projects within a Group

#### General Availability Dependencies

* Composition Analysis
* Security Platform Management

### General Availability Follow on improvements

* Container Scanning support
* UI element \[Button\] to allow a user to remediate a vulnerability with a safe version. Similar to [Vulnerability Resolution](https://docs.gitlab.com/user/application_security/vulnerabilities/#vulnerability-resolution).
* Support for additional vulnerability characteristics that would trigger an auto-remediation event
  * EPSS
  * KEV
  * End of Life
  * Likely others
* Granular configuration rules (policies)

#### Future state

The immediate focus for auto-remediation will be a fully automated solution that creates MRs on behalf of users. There is also a use case for user-initiated auto-remediation. This functionality would allow a user to click a button on the vulnerability details page to generate an MR. The rationale is that this shortcut to create a patch MR would still provide value by streamlining workflows and piggybacking off the full auto-remediation solution. Adding this functionality also aligns with the workflows certain users are accustomed to with Vulnerability Resolution for SAST vulnerabilities. This work is not part of the initial rollout because of the nature of SCA vulnerabilities: manually clicking a button for thousands of scan results does not solve users' vulnerability fatigue problem.

#### What is not planned right now

* Auto Remediation of Dockerfiles
* Auto Remediation enforcement through Security Policies
* Auto patch dependencies (non-vulnerable packages)

## Success measures

1. **Auto-remediated MRs accepted/rejected - tranche by severity & package manager**
   1. Shows that the MRs we are creating on behalf of the user are being merged, thus reducing the need for humans to perform manual remediation.
   2. We want to understand whether acceptance rates differ across severity levels or package managers. If they do, we should dig deeper to understand why.
2. **Breaking changes prevented / introduced**
   1. Identifies the number of breaking changes that we are able to prevent. Indicates the accuracy of our AI-focused work.
3. **Version conflicts resolved with AI**
   1. This ties back to usability and understanding how we are positively impacting our users.

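
The first success measure can be sketched as an acceptance rate grouped by severity and package manager. The `MergeRequestOutcome` record and its fields are hypothetical stand-ins for whatever the metric instrumentation actually emits. This is an illustration of the calculation, not the planned implementation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MergeRequestOutcome:
    """Hypothetical record of one auto-remediation MR that has been resolved."""
    severity: str
    package_manager: str
    accepted: bool  # True if merged, False if closed without merging


def acceptance_rates(outcomes: Iterable[MergeRequestOutcome]) -> dict[tuple[str, str], float]:
    """Return the share of accepted MRs for each (severity, package manager) pair."""
    counts: dict[tuple[str, str], list[int]] = defaultdict(lambda: [0, 0])
    for outcome in outcomes:
        key = (outcome.severity, outcome.package_manager)
        counts[key][1] += 1          # total MRs in this tranche
        if outcome.accepted:
            counts[key][0] += 1      # accepted MRs in this tranche
    return {key: accepted / total for key, (accepted, total) in counts.items()}
```

Comparing the resulting rates across tranches would point to the severities or package managers where acceptance lags and further investigation is warranted.
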
## First Teams

There will be multiple phases of development. Some first team pairings will include:

| Phase | First Team |
|-------|------------|
| Create baseline AR functionality | Composition Analysis |
| AI experiment + Auto-remediation | Sec-AI exp + Composition Analysis |
| Update Vulnerability report | Security Insights + Composition Analysis |
| Policy to define severities for AR | Policies + Composition Analysis |
| Allow for UI-focused configuration | Security Inventory + Composition Analysis |

### Target Audience

* [Sasha (Software Developer)](https://about.gitlab.com/handbook/product/personas/#sasha-software-developer)
* [Amy (Application Security Engineer)](https://about.gitlab.com/handbook/product/personas/#amy-application-security-engineer)
* [Alex (Security Operations Engineer)](https://about.gitlab.com/handbook/product/personas/#alex-security-operations-engineer)

### Pricing and Packaging

~"GitLab Ultimate"

### Analyst Landscape

- In the [2023 Gartner Magic Quadrant for Application Security Testing](https://drive.google.com/file/d/18qxkpTst7CXff1pJTFVJUEn6PSlvS3Pi/view?usp=drive_link), Auto Remediation was noted as a key strength of leaders in the space, including Checkmarx and Snyk.
- Customers pursuing SOC2 and FedRAMP accreditation will find Auto Remediation a way to ease their audit process.
