Improve `git replay` so it can replay merges
When `git replay` was upstreamed, it was based on earlier work from Elijah Newren. A number of features in Elijah's early work were left out because they weren't needed right away. One of these left-out features was replaying merge commits.
Customers sometimes ask us about replaying merges, so it would be useful to have this feature.
At the same time it might be a good idea to also upstream other features that were left out if they don't require too much work and seem interesting.
Elijah's early work is at:
https://github.com/newren/git/commits/replay
In particular, it contains the following:
- some design notes,
- a document looking at how the various "rebasing merges" proposals work in 6 simple cases, so it's possible to interpolate how they'd behave in more complicated cases,
- a TODO looking at future work.
The design notes contain the following in the "Intro via examples" section:
* Scaling up -- replaying merges
Unlike rebase, if the range contains merges, the merges are not
dropped -- they are instead replayed. This is not a simple re-merge like
rebase's --rebase-merges, because it carries over any "fixups" made in
the original merge relative to an automatic merge. See below under
"Preserving topology, replaying merges".
The design notes also have a section called "Preserving topology, replaying merges", which contains the following:
======================================================================
Preserving topology, replaying merges
======================================================================
`git replay` will preserve relative topology by replaying merges.
Further, much as regular single-parent commits' changes are replayed,
we also want to replay the manual changes users include in merges.
Essentially, this means that after merging the rebased parents, we
need to amend that merge by applying the diff from `git show
--remerge-diff $oldmerge`. Or, equivalently, doing a three way merge
between:
* R: automatic remerge of $oldmerge accepting all conflicts
* O: $oldmerge
* N: (new) merge of rebased parents
A couple things to note about this three-way merge:
* `git diff R O` roughly equals `git show --remerge-diff $oldmerge`
* N is what current `git rebase --rebase-merges` uses, so we have a
superset of the information available to current `git rebase`.
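
To make the roles of R, O, and N concrete, here is a toy model that is not part of the design notes. It is a minimal sketch, assuming each tree is a plain dict from path to file content and that merging happens at whole-file granularity only (a real merge also merges within files). The function names `three_way` and `replay_merge` are made up for illustration. R acts as the merge base, so the changes between R and O (the user's manual fixups) are applied on top of N:

```python
def three_way(base, ours, theirs):
    """Whole-file three-way merge. Returns (content, conflicted)."""
    if ours == theirs:
        return ours, False      # both sides agree
    if ours == base:
        return theirs, False    # only "theirs" changed
    if theirs == base:
        return ours, False      # only "ours" changed
    return None, True           # both sides changed in different ways


def replay_merge(R, O, N):
    """Merge trees O and N using R as the base.

    R: automatic remerge of the old merge
    O: the old merge as the user committed it
    N: automatic merge of the rebased parents
    Each tree is a dict {path: content}; a missing path means "absent".
    """
    result, conflicts = {}, []
    for path in sorted(set(R) | set(O) | set(N)):
        merged, conflicted = three_way(R.get(path), O.get(path), N.get(path))
        if conflicted:
            conflicts.append(path)
        elif merged is not None:  # None without a conflict means "deleted"
            result[path] = merged
    return result, conflicts


R = {"a.txt": "auto\n", "b.txt": "x\n"}
O = {"a.txt": "auto plus manual fixup\n", "b.txt": "x\n"}
N = {"a.txt": "auto\n", "b.txt": "x rebased\n"}
print(replay_merge(R, O, N))
# ({'a.txt': 'auto plus manual fixup\n', 'b.txt': 'x rebased\n'}, [])
```

In this example the manual fixup to `a.txt` is carried over from O, while `b.txt` picks up the change that only exists in the rebased parents.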
This was discussed previously on list at [4], using the names pre-M,
M, and N instead of R, O, and N. After digging further, I think we
can do better on conflict resolution and avoiding nested conflict
markers...
Handling conflicts:
* When conflict markers are appropriate
  * When creating R, we should "lie" about the hashes & commit summary
    so that the conflict markers exactly match those that would be
    used for N. Doing so allows us to detect when N has the
    same textual conflicts as R.
  * Consider using XDL_MERGE_FAVOR_BASE[5] to avoid nested conflicts from
    recursive merges.
  * We need a special xdiff merging mode for three-way merging R, O, and N:
    * Note that O does not have conflict hunks; it was the user created
      merge, not an "automatic" merge. (Okay, user may be stupid and
      committed with conflict markers, but I don't think we need to pay
      attention to that, and users get what they deserve if they did that.)
    * This special merging mode should never split a conflict hunk from
      either R or N; it must operate on the entire hunk.
    * If neither of R or N have conflict markers, then merging proceeds
      as normal.
    * If R & N have identical conflict hunks, then we can take the
      version of text from O and the result is clean.
    * If R has conflict hunks, but N does not:
      * if merge.conflictStyle="merge", who cares, just two-way merge O & N
      * if merge.conflictStyle="diff3", extend the conflict marker length by 1
        for R, then three-way merge R, O, & N. You get a nested conflict.
    * If N has conflict hunks that do not match R (R may or may not have
      conflict hunks), then:
      * We ignore both R & O and use the version from N as the resolution
      * We do not mark it as resolved, though; we consider it to still be
        conflicted.
      * We make sure when the replay stops that the user is recommended to
        run `git show --remerge-diff $oldmerge` for potential hints at
        resolving the conflict. (Helpful since that command shows the diff
        of R & O, and we threw away info from R & O here.)
* When conflict markers are not appropriate (binary files, mode
  changes, modify/delete, etc., etc.):
  * If both R & N have conflicts for a given path, and the three modes
    & hashes from R match the three from N, then we can take the version of
    that path from O as the resolution.
  * If the three modes & hashes do not match between R & N:
    * Use N as the resolution
    * Do not mark the file as resolved, even if N had no conflicts
    * Make sure the user is recommended to run `git show --remerge-diff
      $oldmerge` for potential hints at resolving the conflict.
[4] https://lore.kernel.org/git/CABPp-BHp+d62dCyAaJfh1cZ8xVpGyb97mZryd02aCOX=Qn=Ltw@mail.gmail.com/
[5] https://lore.kernel.org/git/CABPp-BF2KnktDTtTfp=hRS36HN-xYC8=P1eYcqaBhJvAJcTCAw@mail.gmail.com/
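
The conflict-handling rules quoted above can be read as a small decision table. The following is a minimal sketch of that reading, not code from Elijah's branch: the `Action` names, the function names, and the way conflicts are represented (a hunk's text, or a tuple of three `(mode, hash)` pairs, with `None` meaning "no conflict") are all assumptions made for illustration. Only the `merge` and `diff3` conflict styles are handled, since those are the only ones the notes mention.

```python
from enum import Enum


class Action(Enum):
    NORMAL_MERGE = "three-way merge R, O, N as usual"
    TAKE_O = "take the text/path from O; result is clean"
    TWO_WAY_O_N = "two-way merge O and N"
    NESTED_CONFLICT = "lengthen R's markers by 1, three-way merge; nested conflict"
    KEEP_N_CONFLICTED = "use N, leave unresolved, suggest --remerge-diff"


def resolve_text_hunk(r_conflict, n_conflict, conflict_style="merge"):
    """Pick an action for one hunk where conflict markers are appropriate.

    r_conflict / n_conflict: the conflict hunk text in R / N, or None if
    that side has no conflict at this location.
    """
    if r_conflict is None and n_conflict is None:
        return Action.NORMAL_MERGE
    if n_conflict is not None:
        # N conflicts: clean only if R has the identical conflict hunk.
        if n_conflict == r_conflict:
            return Action.TAKE_O
        return Action.KEEP_N_CONFLICTED
    # Remaining case: R has a conflict hunk, N does not.
    if conflict_style == "merge":
        return Action.TWO_WAY_O_N
    if conflict_style == "diff3":
        return Action.NESTED_CONFLICT
    raise ValueError(f"conflict style not covered by the notes: {conflict_style}")


def resolve_path_conflict(r_stages, n_stages):
    """Pick an action for a path where conflict markers are not appropriate.

    r_stages / n_stages: tuple of three (mode, hash) pairs describing the
    conflict in R / N, or None if that side has no conflict for the path.
    """
    if r_stages is None and n_stages is None:
        raise ValueError("only meaningful when R or N has a conflict")
    if r_stages is not None and r_stages == n_stages:
        return Action.TAKE_O
    # Modes & hashes differ: N wins but stays unresolved, even if N was clean.
    return Action.KEEP_N_CONFLICTED
```

For example, `resolve_text_hunk("<<<< x", "<<<< x")` yields `Action.TAKE_O`, which is the case the "lie about the hashes & commit summary" trick is meant to enable: identical markers make identical conflicts detectable by plain comparison.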
The "Current status" section is also very interesting. It contains:
* replaying merges works IFF
  * an automatic remerge of the original merge has no conflicts AND
  * auto-merging the rebased parents has no conflicts AND
  * three-way merging those two merges with the original merge has no
    conflicts
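
The first two of these conditions can be checked without touching the working tree. Below is a minimal sketch, assuming a Git version that provides `git merge-tree --write-tree` (which exits with status 0 for a clean merge and 1 for a conflicted one) and assuming the original merge has exactly two parents. The helper names and the injectable `run` parameter are made up for illustration, and the third condition is deliberately not covered.

```python
import subprocess


def merge_is_clean(side1, side2, run=subprocess.run):
    """Return True if merging the two commits in memory has no conflicts."""
    proc = run(
        ["git", "merge-tree", "--write-tree", side1, side2],
        capture_output=True,
        text=True,
    )
    if proc.returncode not in (0, 1):
        # Anything other than 0 (clean) or 1 (conflicts) is an error.
        raise RuntimeError(proc.stderr.strip())
    return proc.returncode == 0


def first_two_conditions_hold(oldmerge, new_parent1, new_parent2,
                              run=subprocess.run):
    # Condition 1: automatic remerge of the original merge's two parents.
    remerge_clean = merge_is_clean(f"{oldmerge}^1", f"{oldmerge}^2", run=run)
    # Condition 2: automatic merge of the rebased parents.
    newmerge_clean = merge_is_clean(new_parent1, new_parent2, run=run)
    return remerge_clean and newmerge_clean
```

Passing `run` as a parameter is only there so the logic can be exercised with a fake runner; in normal use the default `subprocess.run` is called from inside the repository.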