Skip to content

refactor: fix diff algorithm

Tomas Vik requested to merge 342-fix-diff-algorithm into main

By stripping away all but unchanged lines we allowed edge cases when a file started or ended with changes. In this scenario, we wrongly added unchanged lines to the beginning or to the end of the file.

The bug explained

This logic is responsible for tracking line numbers between the old and the new version of the diff.

Thanks to a bug in GitLab API, when we create a new comment on an unchanged line, we need to submit both the line number on the old (oldLine) and on the new version (newLine).

We do this by looking at the diff hunks. E.g.

@@ -6,3 +6,2 @@
 unchanged line 6
-removed line 7
 unchanged line 8

If this is the only diff hunk, we know that lines before #6 are unchanged and we extrapolate them {oldLine: 5, newLine:5}, {oldLine: 4, newLine: 4} and so on till {oldLine: 1, newLine: 1}.

We also know that the removed line 7 caused the line numbers for unchanged lines to shift: {oldLine: 8, newLine: 7}.

Just from the diff, we don't know how large the file is, but when we ask: "Give me new line number for old line 20", we know that the file is at least 20 lines long and we extrapolate the diff: {oldLine: 9, newLine: 8},{oldLine: 10, newLine: 9}...{oldLine: 20, newLine: 19}.

The bug was caused by the extrapolation happening even if the file ended with a changed line

@@ -6,3 +6,2 @@
 unchanged line 6
 unchanged line 8
-removed line 9

When the diff ends with a removed line, we know that the file is only 9 lines large and when we ask "Give me new line number for old line 9" we should return undefined because old line 9 is no an unchanged line. The algorithm was stripping away the removed line and then extrapolating. In this scenario, it would return newLine 9.

Related to #342 (closed)

Edited by Tomas Vik

Merge request reports