Skip to content
  • Jeff King's avatar
    patch-ids: refuse to compute patch-id for merge commit · 7c810407
    Jeff King authored and Junio C Hamano's avatar Junio C Hamano committed
    
    
    The patch-id code which powers "log --cherry-pick" doesn't
    look at whether each commit is a merge or not. It just feeds
    the commit's first parent to the diff, and ignores any
    additional parents.
    
    In theory, this might be useful if you wanted to find
    equivalence between, say, a merge commit and a squash-merge
    that does the same thing.  But it also promotes a false
    equivalence between distinct merges. For example, every
    "merge -s ours" would look identical to an empty commit
    (which is true in a sense, but presumably there was a value
    in merging in the discarded history). Since patch-ids are
    meant for throwing away duplicates, we should err on the
    side of _not_ matching such merges.
    
    Moreover, we may spend a lot of extra time computing these
    merge diffs. In the case that inspired this patch, a "git
    format-patch --cherry-pick" dropped from over 3 minutes to
    less than 3 seconds.
    
    This seems pretty drastic, but is easily explained. The
    command was invoked by a "git rebase" of an older topic
    branch; there had been tens of thousands of commits on the
    upstream branch in the meantime. In addition, this project
    used a topic-branch workflow with occasional "back-merges"
    from "master" to each topic (to resolve conflicts on the
    topics rather than in the merge commits). So there were not
    only extra merges, but the diffs for these back-merges were
    generally quite large (because they represented _everything_
    that had been merged to master since the topic branched).
    
    This patch treats a merge fed to commit_patch_id() or
    add_commit_patch_id() as an error, and a lookup for such a
    merge via has_commit_patch_id() will always return NULL.
    An earlier version of the patch tried to distinguish between
    "error" and "patch id for merges not defined", but that
    becomes unnecessarily complicated. The only callers are:
    
      1. revision traversals which want to do --cherry-pick;
         they call add_commit_patch_id(), but do not care if it
         fails. They only want to add what we can, look it up
         later with has_commit_patch_id(), and err on the side
         of not-matching.
    
      2. format-patch --base, which calls commit_patch_id().
         This _does_ notice errors, but should never feed a
         merge in the first place (and if it were to do so
         accidentally, then this patch is a strict improvement;
         we notice the bug rather than generating a bogus
         patch-id).
    
    So in both cases, this does the right thing.
    
    Helped-by: default avatarJohannes Schindelin <Johannes.Schindelin@gmx.de>
    Signed-off-by: default avatarJeff King <peff@peff.net>
    Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    7c810407