There are situations where a change in the source branch that is already present in the target branch is shown in the merge request diff. This is confusing because the change is not actually being introduced in to the target branch because it is already there.
For example, a fix is added directly to master, then cherry picked in to the master branch, the cherry picked change will be visible in the merge request. (jramsay-reproduce/ce-15140!1 (diffs))
GitLab should show the changes that will be merged in to the target branch.
MR diff/changes view shows commits/changes that have already been merged into target branch.
Steps to reproduce
For now I can't reproduce the issue consistently. It occurs randomly (and often, but not always) in our two main projects/repos. Haven't been able to reproduce it elsewhere yet.
The way our git process works is we have two long standing branches, nonprod and production. Production is marked as the default branch. We create new branches off/from production branch. Make the change in the new branch and submit two MRs, one to nonprod and one to production.
So clone -> production -> (create new branch) branch_example1
Then submit two MRs.
branch_example1 -> nonprod (the problem happens here)
branch_example1 -> production (the problem never happens here)
The bug is:
The diffs view (Changes) view for the nonprod MR is incorrect and differs from that of the production MR. It shows commits that aren't from the branch I created. These commits are always merge commits. The issue is that the diff view then shows changes/diffs for changes that aren't from the source branch. These "extra" changes have already been merged into the target branch. So the diff appears incorrect. The MR to the target production(default) branch is always correct/perfect. There are no extra merge commits and the diffs are
always correct, ie just the things I changed/just the difference between the source and target branch. It doesn't show these extra invalid changes that have already been merged.
Only when this bug occurs does the title for the MR for the target nonprod be set to the branch name (with the underscores stripped and the title capitalised.) This differs from the target production(default) MR, that
MR title is set to from the commit in the branch. This makes the MR overview screen confusing because one can't easily see the two MRs are the same, just to a different target.
Expected behaviour
Under the diffs/changes view on the MR, we would only see the diff/changes. (Currently it displays changes that are already present in the destination/target branch).
Secondly I'd expect the the MR title to be set consistently for both MRs with the same source branch (and same commits in the branch).
Possible fixes
No idea.
I'm not quite sure how can I debug this further.
We used this branching strategy on Stash before and never had this issue.
We also imported this repo and git history/data from Stash.
@JobV this is the issue we were chatting about in the mailing list.
Thanks for the input so far.
We keep our instance up to date and use omnibus.
I'll attempt to figure out some more info on this issue and update this issue accordingly.
I'll also try figure out a way to reproduce it, or at least in a different repo to where we currently have it.
@divansantana Thanks for the detailed report. I wonder if you have a scenario like this:
Your production branch contains a merge commit (3) from some other branch that the non-production does not have.
If you create a MR for F to production, you just get the diff between 4 and F. That's straightforward.
But if you create a MR for F to the non-production branch, I think it's possible GitLab may include the contents of 3 in the diff, even if the non-production branch had merged the changes in a subsequent commit after X. Is this what you are seeing?
@stanhu That's exactly it. Thanks for the quick response and feedback.
I think it's possible GitLab may include the contents of 3 in the diff, even if the non-production branch had merged the changes in a subsequent commit after X
I can confirm this. This is what we are seeing.
Is this is GitLab bug? Or something we can change on our side.
We never experienced this when using Stash and doing the same branching strategy.
@divansantana It's most likely a bug with GitLab. I suspect the way merge requests gather which commits to include in the diff may be the issue here. If you have a moment to create a repo that reproduces the issue, that would be a big help. Thanks.
@stanhu I managed to reproduce the bug reliably, inline with your above description and diagram.
I've created the public repo here and given you full access.
I did these exact steps:
Created the above repository.
cd /tmpgit clone https://gitlab.com/divansantana/15140_gitlab_bug.gitcd 15140_gitlab_bugecho"First commit. Creating a repo to help reproduction the bug that is reported at this [issue](https://gitlab.com/gitlab-org/gitlab-ce/issues/15140)"> README.mdgit add README.mdgit commit -a-m'first commit'git branch -m master production # to rename branch in line with above diagramecho'this is some test file. Commit 2 in production branch directly.'> testfilegit add testfilegit commit -a-m'second commit, in prod branch'git push origin production:productiongit checkout -b non-productionecho'An edit, that entered production and non-production branch in a different way. As per gitlab issue description.'> testfile2git add testfile2git commit -a-m'commit entered in a diff way to each target'git show # note the commit numbergit push origin non-production:non-productiongit checkout productiongit branch --unset-upstreamgit branch --set-upstream-to=origin/production productiongit checkout -b branch2be_merged_with_prod_onlygit cherry-pick e9ec77b9b5bb0491c1725abdae631b00a582c9a5 # so pick identical commit that was made in non-production directlygit push origin branch2be_merged_with_prod_only:branch2be_merged_with_prod_only
Submit a MR to target production only (Since the content is already in non-production branch).
Then merge it.
Take note, that at this point, as per below, non-production and production branches have identical contents (excluding git commits).
cd /tmp/15140_gitlab_buggit checkout productiongit pull origin productiongit diff non-production
Now follow the "correct" process to submit a feature MR to both target branches.
cd /tmp/15140_gitlab_buggit checkout productiongit pull origin productiongit checkout -b feature_branch_to_go_to_both_targetsecho'This file, entered the correct way/same way to both target branches.'> testfile3git add testfile3git commit -a-m'commit entered same way to both target branches'git push origin feature_branch_to_go_to_both_targets:feature_branch_to_go_to_both_targets
On GitLab web gui, submit a MR from feature_branch_to_go_to_both_targets -> production. Can be seen here
On GitLab web gui, submit a MR from feature_branch_to_go_to_both_targets -> non-production. Can be seen here
a. Take note that the MR titles for the two MRs differ, even though the data content is identical, which is confusing.
@stanhu I hope this helps. Hope it can be resolved without too much effort. It unfortunately effects our organisation quite a bit so we'd be ecstatic to see it resolved.
We use git diff from...to for all comparisons, which behaves like this per the docs:
git diff [--options] <commit>...<commit> [--] [<path>…] This form is to view the changes on the branch containing and up to the second <commit>, starting at a common ancestor of both <commit>. "git diff A...B" is equivalent to "git diff $(git-merge-base A B) B". You can omit any one of <commit>, which has the same effect as using HEAD instead.
Where git-merge-base non-production feature_branch_to_go_to_both_targets is the newest commit that both branches have in common, which goes by commit identity, not commit contents, so cherry-pick will be "ignored". This commit is divansantana/15140_gitlab_bug@f414719f, and all commits after that are part of the diff, which is exactly what we're seeing.
The reason why we use git diff from...to instead of git diff from to, is that the latter with include all differences between the heads of the two branches, which means that any changes that were made to the target branch (from) after the source branch (to) was created, will show up as deletions in the diff.
In the majority of cases, git diff from...to is exactly what we want, but since it compares using the merge base, not the actual commit content, it does not handle cherry-picks correctly.
@divansantana I can't think of a way to get the behavior you're looking for here, unfortunately.
@DouweM Thanks for the detailed feedback and reply.
My understanding of the expected behaviour (and our greater team) was to show the actual diff based on the contents of the data.
Since when we used this branching strategy with Atlassian Stash VCS we never had this issue.
I've tested and tried a few more things.
If I repeat the above steps, however instead of doing a git cherry pick, I simply recreate the same file with same contents and a different git commit message, I still get the issue.
As one can see in this other example gitlab project.
Then, I was wondering what the results would be if I can recreate this issue on GitHub and Atlassian Bitbucket (not doing the cherry-pick method).
Testing here on GitHub gets the exact same results as GitLab. :-( (unfortunately for us).
Testing here on bitbucketdoesn't have the issue. The diff is as we expect. But the title of the MRs is the same confusing and different one.
Interesting that Atlassians two VCS systems, Stash and Bitbucket behave differently to GitLab and GitHub.
Clearly our branching strategy is unusual. Perhaps I'll look into other common branching strategies for puppet, git and infrastructure as code.
I'm not sure if this is a bug or not. Based on chats with many people here (that aren't git experts) they expected to see similar behaviour to what I was expecting.
@DouweM , so based on above, are your thoughts, this is something that can't get fixed?
Not easily fixed?
I'm not quite sure how to change our branching model so this isn't a problem for us, for some reason it is.
We typically do this branching mode for infrastructure as code. As described here.
Otherwise, I'll see what else I could do. Would be great if this could somehow be improved in GitLab.
@divansantana The pull request diff algorithm used by Bitbucket and Stash is really good, and described in detail here: https://developer.atlassian.com/blog/2015/01/a-better-pull-request/, including comparisons to the GitHub and GitLab triple-dot method. The merge request diff algorithm in GitLab is something we would like to improve, and we will likely take inspiration from the way Bitbucket does it, but it is very much non-trivial, and not on our roadmap for the near future.
The branching model described by @divansantana is exactly the same as the one we are using. I'll admit that it doesn't seem to be widely used (as yet), but it solves a lot of process issues for us. Still, the merge request should show the actual code diffs, not the commit diffs for the sake of improving the readability of MRs. We've learned that it really is just a readability issue; the nonprod history still shows as expected, and we know which changes we can ignore since we know that nothing in production is not in non-prod. It's a bit of annoyance that we look forward to seeing fixed, but I guess we can live without for now.
This affects some of our workflows too - I see this also on the 'branch compare' screen. I have a question - at least just for comparing branches, could there not be an either-or button in the UI for the comparison strategy? Getting the real differences in code displayed for you would take away the need for using meld... if there are people working on different things on different branches, it's sometimes useful to see - OK, what is the real difference between the two branches now? (even if the histories are disjoint - ultimately what you care about more is the real state of the code more than it's history, especially when you're hunting down a problem)
Since many people may still want to see 'changes since last common commit' as it is displayed currently, an option to somehow get the behaviour @divansantana, @cubikca and I expected would be nice. There is a button bar in the UI already for various diff-related options (see attached image), so perhaps there could be a button for that too?
I think allowing that for compare views makes a lot of sense. Having the button change between ... and .. would be accurate, and elegant, but maybe a little too cryptic 🙂
I don't think we'd want to do this for MRs - at least not initially - because it would become very expensive very quickly (as I outlined here: https://gitlab.com/gitlab-org/gitlab-ce/issues/1048#note_21306958). I'm also wary of having MRs show different things to the commits view, because some people like to review commit-by-commit, but then this would already allow MRs to show different results to the commits view 🤷
@collen for the customer's case in the ticket, they could cherry-pick the commit onto their master branch instead. It might work for them, or it might not, but this is supported in the UI 🙂
@DouweM yeah, that's basically that. .. and ... would be pretty elegant indeed ;)
But if we're looking for a name, the two options should probably be named "absolute" (two dots) and "since common ancestor" (three dots) or something similar.
The reason why we use git diff from...to instead of git diff from to, is that the latter with include all differences between the heads of the two branches, which means that any changes that were made to the target branch (from) after the source branch (to) was created, will show up as deletions in the diff. #15140 (comment 11709832)
Funnily enough, this is the exact reason why changes introduced by another already merged branch are yet again shown in green as new changes or newly created files. This is counterintuitive as it gives absolutely unexpected diffs, because, what's worse, other tools you normally use on desktop show from to diffs.
•|\| • - common commit 1• || • - common commit 2• |\| | •| • | `feature a`|/ /• || • `feature b` <-- from (should not include changes from common commits)|• `main` <-- to
Same happens when both features are instead squashed and rebased onto main. Now the two resulting separate commits both implement same changes which again shows up in both merge requests, even after first has already been merged.
Every tool I know shows all changes between two HEADs. SourceTree, Git Kraken, Git Graph, GitLens, GitHub...
We moved to GitLab from Bitbucket because of the great CI, but this is hurting us daily and keeps us from moving over fully. It's hurts reviews, showing changes that are not actually part of the diff.
This problem is especially annoying, if you "extract" a commit c from the feature branch F and merge it separately into production (for instance because it's a "good" or "urgent" fix).
Lateron, when merging F into production, the diff of c is still shown and will take time to review, although it is not a change that will actually be made.
Would it be an option to show a hint like "merging target_branch into your feature_branch would reduce your diff by xxx lines", if such a situation occurs, so that the user is aware of it?
My team is experiencing this problem too. This has been pretty frustrating for us.
UPDATE: it turns out that for us the issue was due to our branching strategy, after changing it we stopped experiencing this issue.
what we had was this... a develop, release and master branch. develop would be pushed to release and "frozen" then eventually pushed to master. any changes made to master would then be propagated back to release first and then to develop. essentially fully forward and then fully back with the release branch being a middle ground long-lived branch.
I think @mikepurvis suggestion is a good one. We run into this issue using the maven-release-plugin. We rewrite the poms during release but do not push the rewrite into master but only push the tag with the changed version.
I've also hit this recently and found it a bit confusing at first. What I wanted to see was more "If I merged this into the target branch what would have changed". I'd split part of the feature into a second MR and wanted those changes to be excluded from the diff, but our approach of using the ... merge-base means they were still treated as new additions.
Perhaps we could have a "Preview merge" button/tab to create a dummy merge commit and show the diff between that and the target branch? This might be quite similar to the resolve conflicts button in that conflicts would potentially be included.
The problem I have with this is if I have, say, two branches b20180124_399 and b20180224_401 and I've picked changes into both of them, and I do a simple (for example)
It shows that the _399 branch contains a bunch of things prior to the cherry pick (e.g. things that wouldn't be in the branch should you just clone it). OTOH, if you do a
It shows no differences at all, which is what I'd expect, since the needed changes were cherry picked into both branches, and are in fact contained in both. The problem is that if I use the web interface to diff the branches, the dbill lines show up as different (when in fact they're not), and if I create a merge request for 399->401), that merge request contains updates to the dbill lines that aren't needed, since the _399 branch already contains the needed changes.
This makes me distrust the web merge request interface entirely. I have users using the web merge request interface daily, and if it's doing merge requests incorrectly, how can I ensure that my codebase isn't messed up without a hand review of all reviewed code? The "..." comparison seems broken, IMO.
ID/diffs shows all changes on the branch. To be precise it shows what changed (past tense) between the source branch and base.
"Changes" tab completely fails to show what will change after it's merged (future tense).
Additionally, if git merge master was performed on that branch, the "Changes" display is practically useless because it shows all those merged changes. Performing code review on such a branch is a no-go and it's causing trouble us (EEP customer: DreamHost) trouble.
Proposal
Show what /will/ change after it's merged.
Technically speaking, show a diff AS IF the source branch (e.g. feature branch) was rebased to target branch (e.g. master). That means the diff would be dynamic in nature - possibly changing whenever the target branch changes.
Implementation suggestion 1
Add target branch (head) (e.g. master (head) to the list here:
Whenever a user changes a view between target branch (head) and target branch (abcdef), remember the preference in user settings.
Make it the default if no preference was set. (Otherwise few developers will barely ever use this far superior diff view!)
Implementation suggestion 2
Rename "Changes" to "Changes on branch" where things will stay as normal.
Add a new tab named "Changes on merge" where the proposed diff will live.
Superiority to existing diff
Showing a diff in relation to target branch's HEAD is the most reasonable approach in the context of a Merge Request.
It's far more important to know what the target branch will look like once someone hits Merge.
EE candidate
This feature could be a good candidate for EES or maybe even EEP. In fact, we use MRs in conjunction with MR approvals, which is an EES feature. CC @bikebilly
Having the actual changes to be merged is a brilliant feature; IMHO it's about correctness rather than a special Enterprise feature. Currently what is shown in the standard view is actually misleading.
We are experiencing this behaviour very often, almost on every merge between branches. We are currently evaluating EE edition, but this is a HUGE obstacle (and i mean really HUGE) for us. Changing our process flow is not an option. Until fixed we stay at CE edition (no one would agree to pay for broken/misleading view of merge requests, going around requires immense amount of work and is prone to human mistakes).
Please fix this. I would expect that seeing actual changes beeing applied is the correct way.
We are seeing this too. Happened when merge from our default branch (development) into another branch (master). We do this at the end of each milestone to do a thorough review of the changes made to the branch during each milestone. This is very frustrating because we cannot use the changes presented in the merge request to perform the review. Instead we have to get copies of the two branches locally and use Beyond Compare to do the diffs. This also means that you cannot use the inline discussion threads that you can normally use from the merge request since the changes we are interested in are not present in the merge request diff.
We are preparing to merge from our development branch to our master branch in preparation for release. When I saw so many changes on the merge request, I thought we had messed up our master branch somehow. After panicking and wasting a bunch of time to figure out what happened, it appears that we can't trust the Changes tab on the merge request. That is...suboptimal.
We are changing our process to additionally perform a manual comparison of the merge request files. Please fix this bug / add this feature - it is causing us additional work outside of GitLab.
We are also looking into purchasing gitlab EE for our company. However, during our tests with with the community version, we have also run into the problem where the file on branch A and branch B could be exactly the same, however, the merge request from branch B to branch A would report a difference.
This problem is quite an issue because it would mean we cannot trust the merge request diff.
We are frequently running into this issue. It happens in two cases, 1. when resolving merge conflicts, and 2. when squashing all the commits in a branch to get them ready to merge to master. The workflow is as follows for each case.
Merge B into A. Later merge B into A again but now there are merge conflicts. Check out a new branch off of A, lets call it C. Merge B into C and fix the conflicts. Make a merge request for C into A. I would now expect this merge request to show only show actual changes, and in our case likely no changes. Instead we end up seeing changes from the past that were not involved in the conflicts. We frequently cannot trust the merge request diff because of this.
Squash all the commits from a branch that is ready to go to production into one commit. Make a merge request of that now squashed branch into a lower environment branch. I would expect this merge request to show me no changes since all of the changes are already in that lower environment branch.
+1. We have a MR active that somehow contains changes that are in the target branch. If I do the merge commit locally what's displayed for that is just a fraction of what the GitLab MR shows me. I'd personally expect the MR diff to show me what the merge commit would look like.
Having several long lasting feature branches and back propagating master into them is a normal workflow for shops who are just switching to a CI system and don't have an automated deployment/rollback procedure or the feature flags required to create MRs directly to master.
Here's my 2 cents after loosing few hours figuring out why the UI seemed to be out of sync with my repo:
The changes tab should obviously show the eventual changes resulted in the target branch by an eventual merge commit. When there are no conflicts.
If there are conflicts, it should show them as well, perhaps in a separate section.
If the merge request is already merged, it should show the actual changes brought into the target branch by the merge commit.
Our team switched from Bitbucket to GitLab EE a few years ago, but are now evaluating moving back due to the extremely long-standing inaction on this issue and overall lack of transparency surrounding it.
I also have to say that I don't really understand how come this is labeled as a ~"feature proposal" when it seems to me that it is a ~bug.
In my understanding, the changes tab is where discussions related with code review start. Code review only belongs to code which is actually going to be added to the target branch, rather than code which is already on the target branch. I don't really see a scenario where I would need to review code resulted from git diff from...to or git diff from..to.
@andrei-e-random Agreed. I'm really struggling to find any use case for a diff that shows changes that are already on the target branch and would in fact not happen if the MR were to be merged.
Thanks @Nowaker, et al. for the continued feedback – accurate, useful diffs are a critical to ~"code review" and we can definitely improve how we calculate diffs to make it clearer what changes will be merged, rather than what changes are in my branch.
@Nowaker your first proposal sounds like a good first iteration if someone would like to contribute it – it might be easier to create a new issue to track this first iteration and discuss scope and implementation details.
The solution that I'd like for this issue is a smarter default behavior like that mentioned by @DouweMhttps://gitlab.com/gitlab-org/gitlab-ce/issues/15140#note_11790711 rather than adding extra options. A user shouldn't need to know the difference between ... and .., nor what a merge-base is to be able to see the diff they need to do their job in GitLab.
@DouweM if we implemented something like https://gitlab.com/gitlab-org/gitlab-ce/issues/47110, I think this would allow us to show the actual merge before the merge actually happens, which I think is what is actually being requested in this feature proposal. Is this correct?
@jramsay Having the merged ref available for checkout would also hugely benefit CI, you could run builds/tests on the merge result ahead of time, and for instance auto-block the MR if the pipeline fails.
This is getting ridiculous. I have this every time we merge. Thinking about just switching somewhere else just to make this go away. Does anyone know if any other services like GitHub has this issue?
@opslock-bryan I can personally confirm that GitHub doesn't have this problem, and I heard good things about BitBucket's merge comparisons. GitHub also has a ref that you can use to build your merged branch before it's actually merged, and block the merge if it fails.
The interface currently presents the default comparison as being between the Latest version and master (or whatever the target branch is), but when opening the drop down it further shows master (base), which is contrary to the reasonable assumption that it is actually master (HEAD).
Default
Default expanded
@DouweM@nick.thomas what do you think of changing the default behavior to compare Latest version and master(HEAD), while leaving master (base) as an option in the drop down?
@jramsay Are you suggesting making the default view of the Changes branch effectively the result of git diff target source rather than git diff target...source, as it is today? That would mean that once target moves forward from the point where source was branched off, the diff will start displaying all of the new changes in target as deletions relative to source, as if these changes would actually be deleted if the MR is merged.
I think that would be at least as unexpected and confusing as the issue being discussed here, but we would end up seeing it on pretty much every single MR that wasn't created just moments ago. 😬
I agree that the issue being discussed here is one we should fix, but I don't think that change would actually improve the situation.
@jramsay@DouweM The only correct behaviour here is to do exactly what Bitbucket do— they even lay out exactly what the rationale is and described their implementation four years ago, in 2015. GitLab need to stop making excuses and just copy what's known to work:
Doing a .. diff is wrong because it includes irrelevant changes from the target branch. It also has to be update every time either the source or the target branch updates.
Doing a ... diff (the status quo) is better (changes less often so less compute burn) but does not actually compute what the user expects
Bitbucket's solution is what the user expects but computationally expensive -- it has to be recomputed very often.
I don't think there is a Git trick we can use here to make things easier. The hard part is to control the number of diffs we have to recompute. Also keep in mind syntax highlighting which may be as expensive, or more expensive, than git diff itself.
have a large .com customer affected by this as well.
Similar to some of the others, they're using feature branches which they merge into a review-like and release branches, which when compared may show differences even when the content is the same.
i must agree with @mikepurvis this is the solution i would be expecting, even if optional - those who wanted this type of merge request could have at least an option to switch this behaviour, even if computiationaly expensive (our gitlab on-premise instalation will be fine with lots of resources available)
@DouweM,
I am always quick to rebase my branch once the target changes. How would you otherwise know that your MR does not break stuff? Only being mergeable is even worse than saying my code compiled.
We do even only allow fast forward merges, most of the time we squash commits as well.
@jramsay Ah, that's interesting, I didn't make the connection with the merge ref.
It should be possible to display the merge contents in the Changes tab, but there may not always be a merge ref (when there are merge conflicts, for example, or when a fast-forward merge is required but not possible), and it will not always be up to date with the target branch because we won't be automatically updating it each time the target branch is updated, at least for now.
but there may not always be a merge ref (when there are merge conflicts, for example, or when a fast-forward merge is required but not possible)
FWIW In the Bitbucket approach, you include the merge conflicts in the diff. From a user perspective that makes some sense because if there is a conflict it's maybe nice to know what it is.
Regarding recomputing diffs, I don't know where all the time is spent (gitaly vs syntax highlighting / parsing etc.). From a Git point of view, we should be able to say "even though master advanced by several commits, none of those commits update files that are in this diff so we don't have to re-do the syntax highlighting". It's hard to say if that is an optimization worth making without knowing the relative costs of the git diff and the rest of the work to generate the MR diff. I can say that recomputing git diff --raw, which is usually fast, should be enough to say whether the diff that's shown to the user has changed.
I'm adding this to the increasingly long list of reasons to switch back to GitHub. It is pretty awful that a UI that is advertised as comparing branches shows differences between identical files.
I fully understand why the differences are displayed - it's just completely wrong in the context of comparing code and utterly confusing to anyone trying to understand what is actually changing in their codebase. You can make all the arguments you want against it, the truth (from the many, many comments on this issue) is that people expect to see a current diff between file contents, not a "technical diff" between commits.
GitHub, BitBucket et al have long used content diffs for branch-comparison and merge-requests. After almost 3 years of this issue, perhaps GitLab will finally catch up.
This issue speaks plenty of merge diffs but there is a related problem which I'm unable to find a reference to, despite skimming this long thread. Getting arbitrary diffs between any two commits outside of a merge request context. In this scenario I don't care at all for how the current state of each commit came to be, which previous commits are present etc. I just want to see the current actual difference between them.
I think what I mean is simply git diff <commit> <commit>. This is interesting in some cases with long-lived branches where some changes are merged back and forth ("alternative release branch").
Also, the ability to do something like the mentioned bitbucket algorithm for an arbitrary comparison without creating a merge request is useful.