Skip to content

Replace remediation MR creation logic for LLM diffs

Context

We have quite a few customer contacts about the Duo Vulnerability feature failing to create an MR, or creating an empty diff containing only a newline. We also have a decent amount of "Something went wrong" errors in the logs. We have also noticed these failures in our own QA testing.

After an investigation1 into the root cause, we realized that this line, which is used to gsub the llm fix into the source file, is too brittle. It sometimes ends up generating a blank git patch, rather than the remediation expected for the given llm response.

Impact

We aren't sure exactly how many reproductions of this bug there are, as the current error handling is reporting this as an AI response error.2 However, we know it affects at least 10% of all MRs attempted with Duo.

There are several issues with the current state of things, which compound into a very poor customer experience:

  • For those experiencing this issue, there is no workaround
  • The error surfaced to the customers says the issue is with the AI response, suggesting that trying again might work (when in fact it never will)
  • The unreliable experience does not instill confidence in the feature

This issue

During the bug investigation, the question of why we even construct a git patch in the first place was discussed in this thread. Using Files::MultiService allows us to modify the file directly in a branch, with no need to manually construct a valid git patch. 3

Risks

This is a pretty significant change that:

  1. touches a lot of code that was written under very tight deadlines and underwent many last minute requirement changes
  2. touches a lot of code that doesn't have a dedicated team actively iterating on it
  3. touches a lot of code that is currently manually QA'd in a very time consuming process

Risk mitigation

  1. The change includes a new suite of rspec coverage, comparing input and expected output. MR !187855 (merged)
  2. The change is behind a feature flag and with a controlled rollout.
  3. This change will go through manual and automated testing (covered in Testing below)

Implementation Plan

Testing

This is a significant code change and needs to be thoroughly tested. Both manually, and at scale using the CEF.

  1. Coordinate CEF (example)
  2. Smoke testing via UAT (example) to ensure defined quality standards are met (and surpassed).
  1. The investigation started in this issue and continued in this issue, where the fix was determined

  2. 2024-05-5 update: Since opening this issue, I have a few MRs merged or near merging that will start logging these errors properly as internal runtime errors, rather than a generic LLM error. After having that running in production a bit, we should be abler to have a better idea of the actual impact.

  3. Files::MultiService is the service backing this API

Edited by Subashis Chakraborty