    prune: use bitmaps for reachability traversal · fde67d68
    Jeff King authored
    Pruning generally has to traverse the whole commit graph in order to
    see which objects are reachable. This is the exact problem that
    reachability bitmaps were meant to solve, so let's use them (if they're
    available, of course).
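    
    To make the idea concrete, here is a minimal toy sketch (not the code
    in reachable.c). It compares a full graph walk with OR-ing together
    precomputed per-object bitmaps. All names (edges, walk, stored_bitmap,
    and so on) are hypothetical, and the toy assumes a bitmap is stored
    for every object, which is a simplification made only for
    illustration.

```c
#include <stdint.h>

#define NR_OBJECTS 6
#define MAX_EDGES 2

/* Toy object graph: edges[i] lists what object i points to (-1 = none). */
static const int edges[NR_OBJECTS][MAX_EDGES] = {
    { -1, -1 },  /* 0: root */
    {  0, -1 },  /* 1 -> 0 */
    {  1, -1 },  /* 2 -> 1 */
    {  1, -1 },  /* 3 -> 1 */
    {  2,  3 },  /* 4 -> 2, 3 (a merge) */
    { -1, -1 },  /* 5: not referenced by anything */
};

/* Recursively set the bit of every object reachable from obj. */
static void walk(int obj, uint32_t *seen)
{
    if (*seen & (UINT32_C(1) << obj))
        return;
    *seen |= UINT32_C(1) << obj;
    for (int i = 0; i < MAX_EDGES && edges[obj][i] >= 0; i++)
        walk(edges[obj][i], seen);
}

/* Slow path: traverse the graph from every tip. */
static uint32_t reachable_by_walk(const int *tips, int nr_tips)
{
    uint32_t seen = 0;
    for (int i = 0; i < nr_tips; i++)
        walk(tips[i], &seen);
    return seen;
}

/* Hypothetical precomputed bitmaps, standing in for data written at repack time. */
static uint32_t stored_bitmap[NR_OBJECTS];

static void build_bitmaps(void)
{
    for (int obj = 0; obj < NR_OBJECTS; obj++) {
        stored_bitmap[obj] = 0;
        walk(obj, &stored_bitmap[obj]);
    }
}

/* Fast path: OR together the stored bitmap of each tip, no traversal. */
static uint32_t reachable_by_bitmap(const int *tips, int nr_tips)
{
    uint32_t result = 0;
    for (int i = 0; i < nr_tips; i++)
        result |= stored_bitmap[tips[i]];
    return result;
}
```

    With object 4 as the only tip, both paths set bits 0 through 4 and
    leave bit 5 clear, so object 5 is the one a prune would consider
    unreachable.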
    
    Here are timings on git.git:
    
      Test                            HEAD^             HEAD
      ------------------------------------------------------------------------
      5304.6: prune with bitmaps      3.65(3.56+0.09)   1.01(0.92+0.08) -72.3%
    
    And on linux.git:
    
      Test                            HEAD^               HEAD
      --------------------------------------------------------------------------
      5304.6: prune with bitmaps      35.05(34.79+0.23)   3.00(2.78+0.21) -91.4%
    
    The tests show a near-optimal case, since we'll have just repacked and
    our bitmaps should have good coverage of all refs. But that's actually
    realistic: normally prune is run via "gc" right after repacking.
    
    A few notes on the implementation:
    
      - The change is actually in reachable.c, so it would also improve
        the reachability traversal done by "reflog expire --stale-fix".
        Those aren't performed regularly, though (a normal "git gc" doesn't
        use --stale-fix), so they're not really worth measuring. There's a
        low chance of regressing that caller, since the use of bitmaps is
        totally transparent from the caller's perspective.
    
      - The bitmap case could actually get away without creating a "struct
        object", and instead the caller could just look up each object id in
        the bitmap result. However, this would be a marginal improvement in
        runtime, and it would make the callers much more complicated. They'd
        have to handle both the bitmap and non-bitmap cases separately, and
        in the case of git-prune, we'd also have to tweak prune_shallow(),
        which relies on our SEEN flags.
    
      - Because we do create real object structs, we go through a few
        contortions to create ones of the right type. This isn't strictly
        necessary (lookup_unknown_object() would suffice), but it's more
        memory efficient to use the correct types, since we already know
        them.
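    
    As a rough illustration of the last two points, the sketch below
    turns a bitmap result into flagged, correctly typed object structs, so
    that callers can keep checking a SEEN flag regardless of how it was
    set. This is a toy model under stated assumptions, not git's actual
    code: struct toy_object, mark_from_bitmap() and is_prunable() are
    hypothetical names, and the bitmap is assumed to fit in 64 bits.

```c
#include <stddef.h>
#include <stdint.h>

#define SEEN (1u << 0)  /* stand-in for the SEEN flag bit */

enum toy_type { TOY_COMMIT, TOY_TREE, TOY_BLOB };

/* Hypothetical, simplified stand-in for "struct object". */
struct toy_object {
    enum toy_type type;
    unsigned flags;
};

/*
 * Bit i of result corresponds to objects[i]. For every set bit, record
 * the already-known type and mark the object SEEN. Assumes nr <= 64.
 */
static void mark_from_bitmap(struct toy_object *objects,
                             const enum toy_type *types,
                             size_t nr, uint64_t result)
{
    for (size_t i = 0; i < nr; i++) {
        if (!(result & (UINT64_C(1) << i)))
            continue;
        objects[i].type = types[i];
        objects[i].flags |= SEEN;
    }
}

/* The caller-side check is identical in the bitmap and non-bitmap cases. */
static int is_prunable(const struct toy_object *obj)
{
    return !(obj->flags & SEEN);
}
```

    The design trade-off matches the notes above: allocating structs
    costs a little time and memory, but code that relies on the flags
    needs no bitmap-specific path.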
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
p5304-prune.sh 715 Bytes