Git Single File History Should Include Name Status
Problem to solve
When viewing the history of a single file, the Git File History view does the equivalent of git log --follow --format=format:%H to return a list of commits for that file, which in turn is processed as a set of raw commits by git cat-file.
A by-product of --follow, which was added back to GitLab in this MR in order to address this issue, is that git tries hard to follow moves/copies by looking for similar files in previous trees.
When viewing the Git File History in the UI, it's not easy to tell from the displayed commits when a file was moved/copied.
Worse, the missing move/copy context causes confusion when git log --follow pulls in the history of unrelated but similar files (a diff similarity index > 50%), and that history is indistinguishable from the history of the requested file.
Intended users
TBD
Further details
Example workflow
This would result in a confusing history returned by git:
- Create a new example file in one directory, commit
- Create a second new example file in a second directory, with at least 50% similar contents to the first, commit
- Create a third new example file in a third directory, with at least 50% similar contents to the second, commit
Looking at the file history for the third file will pull in the histories of the first and second files.
Example Repository:
https://gitlab.com/jayo/git-history-test
Example in the GitLab Git File History view:
Reproduction steps with bash/git:
$ mkdir system_one
$ echo -e "Config1\n-------\nDirective1\nDirective2\nDirective3" > system_one/config.txt
$ git add system_one/
$ git commit -m "Added configuration file one"
$ mkdir system_two
$ echo -e "Config2\n-------\nDirective1\nDirective2\nDirective3\nDirective4\nDirective5" > system_two/config.txt
$ git add system_two/
$ git commit -m "Added configuration file two"
$ git log --follow --name-status system_two/config.txt
$ mkdir system_three
$ echo -e "Config3\n-------\nDirectiveA\nDirectiveB\nDirective3\nDirective4\nDirective5" > system_three/config.txt
$ git add system_three
$ git commit -m "Added configuration file three"
Example git log --follow --format=format:%H output:
$ git log --follow --format=format:%H system_three/config.txt
17af4be46e785b8df32de0a40a62cdf0e032c5fa
294ce9dd1ef6e9dfb776585b3b69874eb4b2897a
8e321156c4b30268aac9d8ff456b3983e18e9e56
Example with additional --name-status metadata, showing what git thought of as a copy and the similarity index:
$ git log --follow --format=format:%H --name-status system_three/config.txt
17af4be46e785b8df32de0a40a62cdf0e032c5fa
C057 system_two/config.txt system_three/config.txt
294ce9dd1ef6e9dfb776585b3b69874eb4b2897a
C057 system_one/config.txt system_two/config.txt
8e321156c4b30268aac9d8ff456b3983e18e9e56
A system_one/config.txt
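To illustrate how this output could be consumed, here is a minimal sketch that turns it into one tab-separated record per commit. The function name parse_name_status and the record layout are hypothetical, not anything GitLab currently does. It assumes the status lines are tab-separated, as git emits them, and that every line without a tab is a commit hash.

```shell
#!/bin/sh
# Hypothetical helper (not part of GitLab): reads the output of
#   git log --follow --format=format:%H --name-status <path>
# on stdin and prints one tab-separated record per commit:
#   <commit> <status letter> <similarity or -> <source path or -> <path>
parse_name_status() {
  awk -F '\t' -v OFS='\t' '
    NF == 0 { next }                  # blank separator line between commits
    NF == 1 { commit = $1; next }     # a line without tabs is a commit hash
    {
      status = substr($1, 1, 1)       # A, M, D, R, C, ...
      score  = substr($1, 2)          # "057" for C057, empty for A/M/D
      if (score == "") score = "-"
      if (NF >= 3) { src = $2; dst = $3 }   # rename/copy: source and destination
      else         { src = "-"; dst = $2 }  # plain add/modify/delete
      print commit, status, score, src, dst
    }
  '
}

# Usage (inside a repository):
#   git log --follow --format=format:%H --name-status system_three/config.txt | parse_name_status
```

With the three commits above, the first record would carry status C, score 057, and both paths, which is enough to tell that the commit originally touched system_two/config.txt rather than the requested file.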
Proposal
If, when we explicitly call --follow when looking at a single path, we could also include --name-status in the git log and hold onto that metadata for use by the UI when parsing the raw commits with git cat-file, it could improve the Git File History view in actual move/copy scenarios and provide additional context for unrelated histories where git believes a move/copy has occurred along the way.
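As a rough sketch of the kind of context the UI could surface, the snippet below turns the --name-status metadata into a short note per commit. The function name annotate_history and the wording of the notes are assumptions made for illustration only. It expects git's tab-separated status lines on stdin.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): reads the output of
#   git log --follow --format=format:%H --name-status <path>
# on stdin and prints "<short hash>: <note>" for each commit.
annotate_history() {
  awk -F '\t' '
    NF == 0 { next }                  # blank separator line between commits
    NF == 1 { commit = $1; next }     # a line without tabs is a commit hash
    {
      kind  = substr($1, 1, 1)        # status letter
      score = substr($1, 2) + 0       # "057" becomes 57, empty becomes 0
      if (kind == "C")      note = "copied from " $2 " (" score "% similar)"
      else if (kind == "R") note = "renamed from " $2 " (" score "% similar)"
      else if (kind == "A") note = "added as " $2
      else if (kind == "D") note = "deleted"
      else                  note = "modified"
      print substr(commit, 1, 8) ": " note
    }
  '
}

# Usage (inside a repository):
#   git log --follow --format=format:%H --name-status system_three/config.txt | annotate_history
```

A note such as "copied from system_two/config.txt (57% similar)" next to a commit would let a reader see where the history crosses into a different path.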