  • rebase: avoid computing unnecessary patch IDs · b3dfeebb
    Kevin Willford authored and Junio C Hamano committed
    
    
    The `rebase` family of Git commands avoid applying patches that were
    already integrated upstream. They do that by using the revision walking
    option that computes the patch IDs of the two sides of the rebase
    (local-only patches vs upstream-only ones) and skipping those local
    patches whose patch ID matches one of the upstream ones.
    
    In many cases, this causes unnecessary churn, because the set of paths
    touched by a given commit already suffices to determine that an
    upstream patch has no local equivalent.
    
    This hurts performance in particular when there are a lot of upstream
    patches, and/or large ones.
    
    Therefore, let's introduce the concept of a "diff-header-only" patch ID,
    compare those first, and only evaluate the "full" patch ID lazily.
    
    Please note that in contrast to the "full" patch IDs, those
    "diff-header-only" patch IDs are prone to collide with one another, as
    adjacent commits frequently touch the very same files. Hence we now
    have to be careful to allow multiple hash entries with the same hash.
    We accomplish that by using the hashmap_add() function that does not even
    test for hash collisions.  This also allows us to evaluate the full patch ID
    lazily, i.e. only when we found commits with matching diff-header-only
    patch IDs.
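
    The two-stage comparison described above can be illustrated with a
    small Python sketch. This is not Git's C implementation: the `Patch`
    class, `build_index()` and `already_upstream()` are hypothetical names
    invented for this illustration, and hashing the sorted path list is
    only an assumed stand-in for the real diff-header-only patch ID. The
    sketch shows a cheap key that may collide, buckets that keep every
    colliding entry, and a full ID that is computed only when a bucket
    has candidates.

```python
import hashlib
from collections import defaultdict


class Patch:
    """Toy stand-in for a commit's diff (not Git's real data structure)."""

    def __init__(self, name, paths, body):
        self.name = name
        self.paths = sorted(paths)      # files the commit touches
        self.body = body                # full diff text
        self._full_id = None            # cache for the expensive ID
        self.full_id_computations = 0   # counts how often the full ID was computed

    def header_only_id(self):
        # Cheap key: hashes only the touched paths, so collisions are expected.
        return hashlib.sha1("\n".join(self.paths).encode()).hexdigest()

    def full_id(self):
        # Stands in for the expensive computation; done at most once, on demand.
        if self._full_id is None:
            self.full_id_computations += 1
            data = "\n".join(self.paths) + "\n" + self.body
            self._full_id = hashlib.sha1(data.encode()).hexdigest()
        return self._full_id


def build_index(upstream):
    # Entries with the same cheap key share a bucket; none is dropped or replaced.
    index = defaultdict(list)
    for patch in upstream:
        index[patch.header_only_id()].append(patch)
    return index


def already_upstream(local, index):
    # With an empty bucket, any() returns False without computing a full ID.
    candidates = index.get(local.header_only_id(), [])
    return any(local.full_id() == c.full_id() for c in candidates)


upstream = [
    Patch("u1", ["a.c"], "+int x;"),
    Patch("u2", ["a.c"], "+int y;"),   # same paths as u1, different content
    Patch("u3", ["b.c"], "+int z;"),
]
index = build_index(upstream)

duplicate = Patch("l1", ["a.c"], "+int y;")   # same change as u2
unrelated = Patch("l2", ["c.c"], "+int w;")   # touches a path unknown upstream

duplicate_found = already_upstream(duplicate, index)
unrelated_found = already_upstream(unrelated, index)
```

    In this toy setup, the local patch touching `c.c` is rejected on the
    cheap key alone, and the upstream patch touching `b.c` never has its
    full ID computed, which is the saving this change aims for.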
    
    We add a performance test that demonstrates ~1-6% improvement.  In
    practice this will depend on various factors, such as how many upstream
    changes there are, how big those changes are, and whether file system
    caches are cold or warm.  As Git's test suite has no way of catching
    performance regressions, we also add a regression test that verifies
    that the full patch ID computation is skipped when the diff-header-only
    computation suffices.
    
    Signed-off-by: Kevin Willford <kcwillford@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>