Skip to content

Fix bug in #git_diff_prefix regex

Kerri Miller requested to merge kerrizor/fix-bug-in-git_diff_prefix-regex into master

What does this MR do and why?

While extracting regexes into feature-specific modules, I noticed that we had a divergence between use cases when it came to diff headers, so I isolated #git_diff_prefix and #generate_commit_message_git_diff_prefix to complete the refactor, suspecting that one or the other had a bug. Sure enough, #git_diff_prefix wasn't able to capture some git headers, while generate_commit_message_git_diff_prefix was more flexible.

This sort of bug is exactly the reason why refactoring the regexes and unifying them into a single location makes more sense than each individual class having their own declaration.

This bug in the regex was very much a P4; it wasn't causing any user-facing impact, nor was it substantially impacting the results, it was merely causing some minor inefficiency in AI API requests, as we strip the diff prefix string to save tokens/bandwidth used in the API call.


Example showing previous regex not performing as expected:

[1] pry(#<Llm::MergeRequests::SummarizeDiffService>)> diff.diff
=> "@@ -0,0 +1 @@\n+this branch is used to test Repository#merged_to_root_ref?\n"
[2] pry(#<Llm::MergeRequests::SummarizeDiffService>)> diff.diff.sub(Gitlab::Regex.git_diff_prefix, "")
=> "@@ -0,0 +1 @@\n+this branch is used to test Repository#merged_to_root_ref?\n"
[3] pry(#<Llm::MergeRequests::SummarizeDiffService>)> diff.diff.sub(Gitlab::Regex.generate_commit_message_git_diff_prefix, "")
=> "\n+this branch is used to test Repository#merged_to_root_ref?\n"
[4] pry(#<Llm::MergeRequests::SummarizeDiffService>)>
Edited by Kerri Miller

Merge request reports