Unify all Diff File responses from merge request APIs
As the ~"group::code review" ~frontend team moves to more and more complex code that joins Diffs with other pieces of data like discussion notes, it's more important than ever that what's returned as a Diff File matches the response of other endpoints that also contain Diff Files.
The most pressing reason for this is **the ability to match two files together with no ambiguity.**
Here are the three most common Diff Files are returned by three different API endpoints:
- `[u]` = user or group
- `[p]` = project
<table>
<thead>
<tr>
<th><code>/[u]/[p]/-/merge_requests/1/diffs_metadata.json</code></th>
<th><code>/[u]/[p]/-/merge_requests/1/diffs_batch.json</code></th>
<th><code>/[u]/[p]/-/merge_requests/1/discussions.json</code></th>
</tr>
</thead>
<tbody>
<tr>
<td>
<pre lang="json"><code>
{
...
"diff_files": [
{
"conflict_type": null,
"added_lines": 360,
"removed_lines": 0,
"new_path": "README.md",
"old_path": "README.md",
"new_file": false,
"deleted_file": false,
"submodule": false,
"file_identifier_hash": "14e688a849c8209d6d1d93c29f305097ad8df999",
"file_hash": "8ec9a00bfd09b3190ac6b22251dbb1aa95a0579d"
},
...
]
}
</code></pre>
</td>
<td>
<pre lang="json"><code>
{
"diff_files": [
{
"content_sha": "d1203ea86afb98b2bf1b97c520bf36299623de66",
"submodule": false,
"submodule_link": null,
"submodule_tree_url": null,
"submodule_compare": null,
"edit_path": "/[u]/[p]/-/edit/[truncated]",
"ide_edit_path": "/-/ide/project/[u]/[p]/merge_requests/1",
"old_path_html": "README.md",
"new_path_html": "README.md",
"blob": {
"id": "a8a1c6055c89ab8655e13159d9a021f00552ade9",
"path": "README.md",
"name": "README.md",
"mode": "100644",
"readable_text": true,
"icon": "doc-text"
},
"can_modify_blob": true,
"file_identifier_hash": "14e688a849c8209d6d1d93c29f305097ad8df999",
"file_hash": "8ec9a00bfd09b3190ac6b22251dbb1aa95a0579d",
"file_path": "README.md",
"old_path": "README.md",
"new_path": "README.md",
"new_file": false,
"renamed_file": false,
"deleted_file": false,
"diff_refs": {
"base_sha": "a479fe3783cc44699a9d903e9a1a04350b262115",
"start_sha": "a479fe3783cc44699a9d903e9a1a04350b262115",
"head_sha": "d1203ea86afb98b2bf1b97c520bf36299623de66"
},
"stored_externally": null,
"external_storage": null,
"mode_changed": false,
"a_mode": "100644",
"b_mode": "100644",
"viewer": {
"name": "text",
"error": null,
"error_message": null,
"collapsed": null
},
"alternate_viewer": null,
"old_size": 3880,
"new_size": 7119,
"conflict_type": null,
"added_lines": 360,
"removed_lines": 0,
"load_collapsed_diff_url": "/[u]/[p]/-/merge_requests/[truncated]",
"view_path": "/[u]/[p]/-/blob/[truncated]",
"replaced_view_path": null,
"context_lines_path": "/[u]/[p]/-/blob/[truncated]",
"highlighted_diff_lines": [ ... ],
"is_fully_expanded": false,
"code_navigation_path": null
},
...
],
"pagination": {
"total_pages": 2
}
}
</code></pre>
</td>
<td>
<pre lang="json"><code>
[
{
...
"diff_file": {
"content_sha": "dc40671f8db2d4fcbd39d767da02b83661d7bb95",
"submodule": false,
"submodule_link": null,
"submodule_tree_url": null,
"submodule_compare": null,
"old_path_html": "README.md",
"new_path_html": "README.md",
"blob": {
"id": "a8a1c6055c89ab8655e13159d9a021f00552ade9",
"path": "README.md",
"name": "README.md",
"mode": "100644",
"readable_text": true,
"icon": "doc-text"
},
"can_modify_blob": false,
"file_identifier_hash": "14e688a849c8209d6d1d93c29f305097ad8df999",
"file_hash": "8ec9a00bfd09b3190ac6b22251dbb1aa95a0579d",
"file_path": "README.md",
"old_path": "README.md",
"new_path": "README.md",
"new_file": false,
"renamed_file": false,
"deleted_file": false,
"diff_refs": {
"base_sha": "4e9bb33dcb6634eff208f6f8d838ff7498da04ce",
"start_sha": "4e9bb33dcb6634eff208f6f8d838ff7498da04ce",
"head_sha": "dc40671f8db2d4fcbd39d767da02b83661d7bb95"
},
"stored_externally": null,
"external_storage": null,
"mode_changed": false,
"a_mode": "100644",
"b_mode": "100644",
"viewer": {
"name": "text",
"error": null,
"error_message": null,
"collapsed": false
},
"alternate_viewer": null,
"old_size": 3880,
"new_size": 7119
}
}
]
</code></pre>
</td>
</tr>
</tbody>
</table>
### Problem
As you can see, there are noticeable variations between the files.
In the case of the metadata, this is intentional. However, there isn't enough contextual information shared across all of the file representations to unambiguously match them.
#### Matching Scenarios
To disambiguate and match files across projects, merge requests, and merge request versions while accounting for changes (or lack thereof) across MR versions, the FE needs:
- the group or username
- the project
- the merge request ID
- the git blob ID
- the file path
- the file's mode
- the diff action (added/deleted/modified)
#### Missing data
The three variations of the Diff File are missing:
<table>
<thead>
<tr>
<th><code>/[u]/[p]/-/merge_requests/1/diffs_metadata.json</code></th>
<th><code>/[u]/[p]/-/merge_requests/1/diffs_batch.json</code></th>
<th><code>/[u]/[p]/-/merge_requests/1/discussions.json</code></th>
</tr>
</thead>
<tbody>
<tr>
<td>
<pre lang="json"><code>
{
"userOrGroup": "[u]",
"project": "[p]",
"mergeRequest": 1,
"lastChangedInMergeRequestVersion": 5,
"gitId": "abcdef1234567890",
"path": "/the/full/file/path.file",
"mode": "0755" // OR "u=rwx,g=rw,o=rw"
"action": "modified"
}
</code></pre>
</td>
<td>
<pre lang="json"><code>
{
"userOrGroup": "[u]",
"project": "[p]",
"mergeRequest": 1,
"lastChangedInMergeRequestVersion": 5,
"action": "modified"
}
</code></pre>
</td>
<td>
<pre lang="json"><code>
{
"userOrGroup": "[u]",
"project": "[p]",
"mergeRequest": 1,
"lastChangedInMergeRequestVersion": 5,
"action": "modified"
}
</code></pre>
</td>
</tr>
</tbody>
</table>
### Solution
I propose a standard base shared among **all** representations of a Diff File:
- the group or username
- the project
- the merge request ID
- the git blob ID
- the file path
- the file's mode
- the diff action (added/deleted/modified)
#### Other useful data
Sometimes, it's helpful to know the last MR version in which this file was changed.
We can infer this from the git blob ID which _shouldn't_ change if the file hasn't been modified, but it would be convenient to have a definitive source of this information.
We could include:
- the MR version in which this file last changed
#### Existing Shortcuts
`file_identifier_hash` includes the file path and the diff action, which could be re-used in a pinch, but having the values available as individual properties would be preferable.
#### Example base Diff File shared among all representations
```json
{
"userOrGroup": "gitlab-org",
"project": "gitlab",
"mergeRequest": 123456,
"lastChangedInMergeRequestVersion": 5,
"gitId": "abcdef1234567890",
"path": "/the/full/file/path.file",
"mode": "0755" // OR "u=rwx,g=rw,o=rw"
"action": "modified"
}
```
### Tasks
- [ ] Add `userOrGroup` to `diffs_metadata.json` Diff Files
- [ ] Add `userOrGroup` to `diffs_batch.json` Diff Files
- [ ] Add `userOrGroup` to `discussion.json` Diff Files
- [ ] Add `project` to `diffs_metadata.json` Diff Files
- [ ] Add `project` to `diffs_batch.json` Diff Files
- [ ] Add `project` to `discussion.json` Diff Files
- [ ] Add `mergeRequest` to `diffs_metadata.json` Diff Files
- [ ] Add `mergeRequest` to `diffs_batch.json` Diff Files
- [ ] Add `mergeRequest` to `discussion.json` Diff Files
- [ ] Add `action` to `diffs_metadata.json` Diff Files
- [ ] Add `action` to `diffs_batch.json` Diff Files
- [ ] Add `action` to `discussion.json` Diff Files
- [ ] Add `gitId` to `diffs_metadata.json` Diff Files
- [ ] Add `path` to `diffs_metadata.json` Diff Files
- [ ] Add `mode` to `diffs_metadata.json` Diff Files
#### Optional Tasks
- [ ] Add `lastChangedInMergeRequestVersion` to `diffs_metadata.json` Diff Files
- [ ] Add `lastChangedInMergeRequestVersion` to `diffs_batch.json` Diff Files
- [ ] Add `lastChangedInMergeRequestVersion` to `discussion.json` Diff Files
issue