Auto-remediation of custom-code vulnerabilities

This issue aims to define a project for the auto-remediation of custom-code vulnerabilities.
This is an early DRAFT, to be refined.

Vision

Our goal is to provide an automated process for fixing SAST findings.
Vulnerabilities discovered by SAST are not easy to fix and require significant effort from developers.

Automating the triage and fixing of SAST findings will significantly reduce workforce costs for our customers. It will also reduce friction between Security and developers, leading to wider adoption of Security products, in particular SAST.

[Image: screenshot of a generated MR]

Beyond SCA remediation - Fix custom code

Remediation of custom-code vulnerabilities must not be confused with remediation of third-party vulnerabilities.

When a known vulnerability (usually a CVE) is discovered in a third-party library used by an application, the remediation usually consists of upgrading this library to a newer, non-vulnerable version.
We already have some auto-remediation for this kind of vulnerability (SCA findings).
See https://docs.gitlab.com/ee/user/application_security/vulnerabilities/#vulnerability-resolution

There are already ongoing efforts to improve this auto-remediation process and deliver it as a GitLab Workflow.
See Summit 2024 Code Challenge implementation details (#451369 - closed)

Fixing an SCA finding is essentially a library upgrade, although complications can arise when the version bump conflicts with constraints imposed by other libraries. In most cases, it doesn't require complex manipulation of source code.

Fixing SAST findings involves another level of complexity, because it requires delicate manipulation of the source code.
The generated source code must satisfy several constraints:

  • Be syntactically and semantically correct
  • Keep the existing business logic
  • Fix the vulnerability
  • Obey linting rules and code standards
  • Fix relevant unit tests
  • Fix relevant integration tests
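
The constraints above can be read as a series of gates that a candidate fix must pass before it reaches a reviewer. The following is a minimal sketch under stated assumptions: `GateResult`, `evaluate_fix` and the gate names are hypothetical and do not come from any existing GitLab API, and the "vulnerability_fixed" gate is a toy stand-in for a real SAST re-scan.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A check takes the patched source and returns True when the constraint holds.
Check = Callable[[str], bool]


@dataclass
class GateResult:
    passed: bool
    failed_checks: List[str]


def parses_as_python(source: str) -> bool:
    """Syntactic gate for Python sources, using the built-in compiler."""
    try:
        compile(source, "<candidate-fix>", "exec")  # compiles only, never executes
        return True
    except SyntaxError:
        return False


def evaluate_fix(source: str, checks: List[Tuple[str, Check]]) -> GateResult:
    """Run every gate and report all failures (no short-circuiting)."""
    failed = [name for name, check in checks if not check(source)]
    return GateResult(passed=not failed, failed_checks=failed)


# Hypothetical gate list. Real gates would invoke the linter, the unit and
# integration test suites, and a SAST re-scan instead of a string match.
checks = [
    ("syntax", parses_as_python),
    ("vulnerability_fixed", lambda src: "shell=True" not in src),
]

result = evaluate_fix("subprocess.run(['ls'], shell=False)\n", checks)
```

Reporting every failed gate, rather than stopping at the first one, gives the fix generator (or a human reviewer) the full picture in a single pass.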

Technical solutions

A technical solution might be based on a rule-based method (see https://www.pixee.ai/ and https://codemodder.io/). This is a very effective method for most basic cases. We can also consider using generative AI, either to generate part of the code fixes or to generate rules that feed the rule-based method.
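
To illustrate what a rule-based fix can look like, here is a minimal sketch of a single rule written with Python's standard `ast` module. It rewrites `yaml.load(...)` calls to `yaml.safe_load(...)`, a well-known hardening for PyYAML. The rule choice and the names `YamlSafeLoadRule` and `apply_rule` are assumptions made for illustration; this is not how the tools linked above are implemented.

```python
import ast


class YamlSafeLoadRule(ast.NodeTransformer):
    """Rewrite `yaml.load(...)` calls to `yaml.safe_load(...)`."""

    def __init__(self) -> None:
        self.changed = False

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # handle nested calls first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
        ):
            func.attr = "safe_load"
            # safe_load accepts only the stream, so drop any Loader argument
            node.args = node.args[:1]
            node.keywords = [kw for kw in node.keywords if kw.arg != "Loader"]
            self.changed = True
        return node


def apply_rule(source: str) -> str:
    """Return the rewritten source, or the input unchanged if no rule matched."""
    tree = ast.parse(source)
    rule = YamlSafeLoadRule()
    tree = rule.visit(tree)
    if not rule.changed:
        return source
    return ast.unparse(ast.fix_missing_locations(tree))  # requires Python 3.9+


fixed = apply_rule("import yaml\nconfig = yaml.load(data, Loader=yaml.Loader)\n")
```

Note that `ast.unparse` discards comments and original formatting, so a production tool would likely need a format-preserving parser to satisfy the linting and code-standard constraints.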

This is a use case of GitLab Workflow

Generated-fix quality

The main technical challenge here is to generate high-quality fixes.
Ideally, most of the process will be automated and will require only a short review from a developer.
The quality of the generated fixes must be high enough that reviewing them takes significantly less effort than implementing the fix manually.
Most of the time the MR should be good enough to be approved without any changes.

LLM-based fix generation

The following information should be given as context:

  • Kind of vulnerability
  • Relevant code snippets
  • Attack vector (source, sink and intermediate nodes from source to sink)
  • High-level guidance about the expected fix (sanitize input, disable XML external entities, ...)
  • Additional guidance (use relevant sanitization method from standard library, implement custom sanitization in separate method, avoid making additional changes not required for fixing the vulnerability)
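
The context items listed above could be assembled into a prompt along the following lines. This is only a sketch: `FixContext`, `build_prompt`, the field names and the prompt layout are hypothetical, and the actual prompt format would need to be tuned for the chosen model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FixContext:
    vulnerability_kind: str
    code_snippets: List[str]
    attack_vector: List[str]  # source, intermediate nodes, sink, in order
    fix_guidance: str
    additional_guidance: List[str] = field(default_factory=list)


def build_prompt(ctx: FixContext) -> str:
    """Render the context as plain-text sections separated by blank lines."""
    sections = [
        f"Vulnerability: {ctx.vulnerability_kind}",
        "Relevant code:\n" + "\n---\n".join(ctx.code_snippets),
        "Attack vector (source to sink):\n" + " -> ".join(ctx.attack_vector),
        f"Expected fix: {ctx.fix_guidance}",
    ]
    if ctx.additional_guidance:
        bullets = "\n".join(f"- {item}" for item in ctx.additional_guidance)
        sections.append("Additional guidance:\n" + bullets)
    return "\n\n".join(sections)


# Illustrative values only; they are not taken from a real finding.
prompt = build_prompt(
    FixContext(
        vulnerability_kind="XML external entity injection",
        code_snippets=["parser = make_parser()\nparser.parse(request_body)"],
        attack_vector=["request_body", "parser.parse"],
        fix_guidance="disable XML external entities",
        additional_guidance=["avoid changes not required for fixing the vulnerability"],
    )
)
```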

Expected user experience

Fixing new vulnerabilities on MR

  1. SAST runs automatically on the MR
  2. SAST finds a vulnerability
  3. Auto-remediation builds a fix for the SAST finding
  4. An MR containing the fix is sent for review to the assignees of the original MR
  5. If necessary, the reviewers interact with the AI agent (Duo Workflow) in the MR
  6. The fix is approved and merged into the original MR
  7. SAST runs again, confirming that the vulnerability has been fixed
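
The flow above can be sketched as a small orchestration function. Everything here is an assumption made for illustration: the `Finding` type, the function name and the three injected callables are hypothetical placeholders for the SAST scanner, the fix generator and the review step, not existing GitLab interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Finding:
    id: str
    description: str


def remediate_on_mr(
    scan: Callable[[], List[Finding]],
    build_fix: Callable[[Finding], Optional[str]],
    review_and_merge: Callable[[Finding, str], bool],
) -> List[str]:
    """Return the IDs of findings that the second scan confirms as fixed."""
    merged: List[str] = []
    for finding in scan():                    # steps 1-2: scan and collect findings
        patch = build_fix(finding)            # step 3: generate a candidate fix
        if patch is None:
            continue                          # no fix could be generated
        if review_and_merge(finding, patch):  # steps 4-6: review, approve, merge
            merged.append(finding.id)
    remaining = {f.id for f in scan()}        # step 7: re-scan to confirm
    return [fid for fid in merged if fid not in remaining]
```

Passing the scanner, generator and review step as callables keeps the control flow testable without a running pipeline, and makes the final re-scan the only source of truth for whether a finding counts as fixed.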

Fixing existing vulnerabilities on main branch

  1. SAST runs automatically on main
  2. SAST finds vulnerabilities
  3. Auto-remediation regularly selects a few SAST findings
  4. An MR containing the fixes for those findings is sent for review to relevant reviewers
  5. If necessary, the reviewers interact with the AI agent (Duo Workflow) in the MR
  6. The fixes are approved and merged
  7. SAST runs again, confirming that the vulnerabilities have been fixed
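
For the main-branch flow, the step that regularly picks a few findings needs a selection policy. One possible sketch, assuming each finding carries a severity label and that the most severe findings should be fixed first (both the ordering and the names `OpenFinding`, `SEVERITY_RANK` and `select_batch` are assumptions, not part of this proposal):

```python
from dataclasses import dataclass
from typing import List

# Lower rank means higher priority; unknown severities sort last.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class OpenFinding:
    id: str
    severity: str


def select_batch(findings: List[OpenFinding], batch_size: int = 3) -> List[OpenFinding]:
    """Pick the most severe findings, using the ID as a stable tie-breaker."""
    ranked = sorted(
        findings,
        key=lambda f: (SEVERITY_RANK.get(f.severity, len(SEVERITY_RANK)), f.id),
    )
    return ranked[:batch_size]


backlog = [
    OpenFinding("b", "low"),
    OpenFinding("a", "critical"),
    OpenFinding("c", "high"),
    OpenFinding("d", "medium"),
]
batch = select_batch(backlog, batch_size=2)
```

Keeping the batch small limits the size of each generated MR, which supports the goal of a short review.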
Edited by Meir Benayoun