Auto-remediation of custom-code vulnerabilities
This issue aims to define a project for the auto-remediation of custom-code vulnerabilities.
This is an early DRAFT, to be refined
Vision
Our goal is to provide an automated process for fixing SAST findings.
Vulnerabilities discovered by SAST are not easy to fix and require significant effort from developers.
Providing automation for triaging and fixing SAST findings will lead to significant cost reductions for our customers in terms of workforce. It will also reduce friction between Security and developers, leading to wider adoption of Security products, SAST in particular.
Screenshot of generated MR
Beyond SCA remediation - Fix custom code
Remediation of custom-code vulnerabilities must not be confused with remediation of third-party vulnerabilities.
When a known vulnerability (usually a CVE) is discovered in a third-party library used by an application, the remediation usually consists of upgrading this library to a newer version that is not vulnerable.
We already have some auto-remediation for this kind of vulnerability (SCA findings).
See https://docs.gitlab.com/ee/user/application_security/vulnerabilities/#vulnerability-resolution
There are already ongoing efforts to improve this auto-remediation process and deliver it as a GitLab Workflow.
See Summit 2024 Code Challenge implementation details (#451369 - closed)
Fixing an SCA finding is basically a simple upgrade of the vulnerable library, although complications may arise when the version bump conflicts with constraints on other libraries. In most cases, it doesn't require complex manipulation of source code.
Fixing SAST findings involves another level of complexity and requires delicate manipulation of the source code.
The generated source code must satisfy several constraints:
- Be syntactically and semantically correct
- Keep the existing business logic
- Fix the vulnerability
- Obey linting rules and code standards
- Update relevant unit tests
- Update relevant integration tests
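As a minimal sketch of how the constraints above could be enforced, a generated fix might pass through a gate that runs one check per constraint and reports which ones fail. Everything here is an assumption for illustration: the names `run_gate`, `GateResult` and `sast_finding_gone` are hypothetical, only the syntax check is real, and the other checks would need to call the actual SAST analyzer, linter and test suites.

```python
import ast
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A check takes the patched source and returns True when the constraint holds.
Check = Callable[[str], bool]


def is_valid_python(source: str) -> bool:
    """Syntactic check only: the patched file must still parse."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


@dataclass
class GateResult:
    passed: bool
    failures: List[str]


def run_gate(patched_source: str, checks: List[Tuple[str, Check]]) -> GateResult:
    """Run every check and collect the names of the ones that fail."""
    failures = [name for name, check in checks if not check(patched_source)]
    return GateResult(passed=not failures, failures=failures)


# Hypothetical stand-in: a real check would re-run SAST on the patched code.
def sast_finding_gone(source: str) -> bool:
    return "yaml.load(" not in source


checks = [
    ("syntax", is_valid_python),
    ("vulnerability fixed", sast_finding_gone),
]

result = run_gate("import yaml\ndata = yaml.safe_load(text)\n", checks)
print(result)
```

Collecting all failures, rather than stopping at the first one, gives the fix generator (or a reviewer) the full list of constraints that still need work.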
Technical solutions
A technical solution might be based on a rule-based method (see https://www.pixee.ai/ and https://codemodder.io/). This is a very effective method for most basic cases. We can also think about using generative AI, either to generate part of the code fixes, or to generate rules that feed the rule-based method.
This is a use case of GitLab Workflow
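To illustrate what a rule-based fix can look like, here is a minimal sketch of a single rule that rewrites `yaml.load(...)` calls to `yaml.safe_load(...)`. It uses only Python's standard `ast` module and is not how the tools linked above are implemented; the class name `SafeYamlLoad` and the helper `apply_rule` are hypothetical.

```python
import ast


class SafeYamlLoad(ast.NodeTransformer):
    """Rewrite `yaml.load(x, ...)` calls to `yaml.safe_load(x)`."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # rewrite nested calls first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
        ):
            func.attr = "safe_load"
            # safe_load only takes the stream, so drop any Loader argument.
            node.args = node.args[:1]
            node.keywords = []
        return node


def apply_rule(source: str) -> str:
    tree = SafeYamlLoad().visit(ast.parse(source))
    # ast.unparse requires Python 3.9+ and discards comments and formatting.
    return ast.unparse(tree)


print(apply_rule("import yaml\ncfg = yaml.load(f, Loader=yaml.Loader)"))
```

Because `ast.unparse` loses comments and the original layout, a production rule engine would more likely work on a syntax tree that preserves formatting; this sketch only shows the matching-and-rewriting idea.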
Generated-fix quality
The main technical challenge here is to generate high-quality fixes.
Ideally, most of the process will be automated and will require only a short review from a developer.
The quality of the generated fixes must be good enough that the effort invested in the review is significantly lower than the effort that would have been required to implement the fix manually.
Most of the time the MR should be good enough to be approved without any changes.
LLM-based fix generation
The following information should be given as context:
- Kind of vulnerability
- Relevant code snippets
- Attack vector (source, sink and intermediate nodes from source to sink)
- High-level guidance about the expected fix (sanitize input, disable XML external entities, ...)
- Additional guidance (use the relevant sanitization method from the standard library, implement custom sanitization in a separate method, avoid making additional changes not required for fixing the vulnerability)
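The context items listed above could be assembled into a prompt along these lines. This is a sketch under assumptions: the `FixContext` class, its field names and the prompt layout are hypothetical, and the actual prompt format would depend on the model and on what the SAST analyzer reports.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FixContext:
    """Context handed to the model; every field name here is hypothetical."""
    vulnerability_kind: str
    code_snippets: List[str]
    attack_vector: List[str]  # ordered: source, intermediate nodes, sink
    fix_guidance: str
    additional_guidance: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        snippets = "\n\n".join(self.code_snippets)
        flow = " -> ".join(self.attack_vector)
        extras = "\n".join(f"- {item}" for item in self.additional_guidance)
        return (
            f"Vulnerability: {self.vulnerability_kind}\n"
            f"Data flow (source to sink): {flow}\n"
            f"Expected fix: {self.fix_guidance}\n"
            f"Additional guidance:\n{extras}\n"
            f"Code:\n{snippets}\n"
        )


context = FixContext(
    vulnerability_kind="XML external entity injection",
    code_snippets=["parser = make_parser()\nparser.parse(request_body)"],
    attack_vector=["request_body", "parser.parse"],
    fix_guidance="disable XML external entities",
    additional_guidance=["avoid changes not required for fixing the vulnerability"],
)
prompt = context.to_prompt()
print(prompt)
```

Keeping the context in a typed structure, rather than building the prompt string directly, makes it easier to check that every item is present before a fix is requested.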
Expected user experience
Fixing new vulnerabilities on MR
- SAST runs automatically on the MR
- SAST finds a vulnerability
- Auto-remediation builds a fix for the SAST finding
- An MR containing the fix is sent for review to the MR assignees
- If necessary, the reviewers interact with the AI agent (Duo Workflow) in the MR
- The fix is approved and merged into the original MR
- SAST runs again, confirming that the vulnerability has been fixed
Fixing existing vulnerabilities on main branch
- SAST runs automatically on main
- SAST finds vulnerabilities
- Auto-remediation regularly selects a few SAST findings
- An MR containing the fixes for those findings is sent for review to relevant reviewers
- If necessary, the reviewers interact with the AI agent (Duo Workflow) in the MR
- The fixes are approved and merged
- SAST runs again, confirming that the vulnerabilities have been fixed
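The step that regularly picks a few findings from main could be sketched as follows. The selection policy shown here (highest severity first, skipping findings that already have an open fix MR) is an assumption, not something this issue has decided, and the `Finding` fields and `select_batch` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Lower rank means higher priority; unknown severities sort last.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    id: int
    severity: str
    has_open_fix_mr: bool = False


def select_batch(findings: List[Finding], batch_size: int = 3) -> List[Finding]:
    """Pick a small batch so that each generated MR stays quick to review."""
    candidates = [f for f in findings if not f.has_open_fix_mr]
    candidates.sort(
        key=lambda f: (SEVERITY_RANK.get(f.severity, len(SEVERITY_RANK)), f.id)
    )
    return candidates[:batch_size]


findings = [
    Finding(1, "low"),
    Finding(2, "critical", has_open_fix_mr=True),
    Finding(3, "high"),
    Finding(4, "medium"),
    Finding(5, "critical"),
]
batch = select_batch(findings, batch_size=2)
print([f.id for f in batch])
```

A small batch size keeps the review effort per MR low, which matches the goal of requiring only a short review from a developer.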
