Skip to content
  • Vicent Marti's avatar
    rev-list: add bitmap mode to speed up object lists · aa32939f
    Vicent Marti authored and Junio C Hamano's avatar Junio C Hamano committed
    
    
    The bitmap reachability index used to speed up the counting objects
    phase during `pack-objects` can also be used to optimize a normal
    rev-list if the only thing required are the SHA1s of the objects during
    the list (i.e., not the path names at which trees and blobs were found).
    
    Calling `git rev-list --objects --use-bitmap-index [committish]` will
    perform an object iteration based on a bitmap result instead of actually
    walking the object graph.
    
    These are some example timings for `torvalds/linux` (warm cache,
    best-of-five):
    
        $ time git rev-list --objects master > /dev/null
    
        real    0m34.191s
        user    0m33.904s
        sys     0m0.268s
    
        $ time git rev-list --objects --use-bitmap-index master > /dev/null
    
        real    0m1.041s
        user    0m0.976s
        sys     0m0.064s
    
    Likewise, using `git rev-list --count --use-bitmap-index` will speed up
    the counting operation by building the resulting bitmap and performing a
    fast popcount (number of bits set on the bitmap) on the result.
    
    Here are some sample timings of different ways to count commits in
    `torvalds/linux`:
    
        $ time git rev-list master | wc -l
            399882
    
            real    0m6.524s
            user    0m6.060s
            sys     0m3.284s
    
        $ time git rev-list --count master
            399882
    
            real    0m4.318s
            user    0m4.236s
            sys     0m0.076s
    
        $ time git rev-list --use-bitmap-index --count master
            399882
    
            real    0m0.217s
            user    0m0.176s
            sys     0m0.040s
    
    This also respects negative refs, so you can use it to count
    a slice of history:
    
            $ time git rev-list --count v3.0..master
            144843
    
            real    0m1.971s
            user    0m1.932s
            sys     0m0.036s
    
            $ time git rev-list --use-bitmap-index --count v3.0..master
            real    0m0.280s
            user    0m0.220s
            sys     0m0.056s
    
    Though note that the closer the endpoints, the less it helps. In the
    traversal case, we have fewer commits to cross, so we take less time.
    But the bitmap time is dominated by generating the pack revindex, which
    is constant with respect to the refs given.
    
    Note that you cannot yet get a fast --left-right count of a symmetric
    difference (e.g., "--count --left-right master...topic"). The slow part
    of that walk actually happens during the merge-base determination when
    we parse "master...topic". Even though a count does not actually need to
    know the real merge base (it only needs to take the symmetric difference
    of the bitmaps), the revision code would require some refactoring to
    handle this case.
    
    Additionally, a `--test-bitmap` flag has been added that will perform
    the same rev-list manually (i.e. using a normal revwalk) and using
    bitmaps, and verify that the results are the same. This can be used to
    exercise the bitmap code, and also to verify that the contents of the
    .bitmap file are sane.
    
    Signed-off-by: default avatarVicent Marti <tanoku@gmail.com>
    Signed-off-by: default avatarJeff King <peff@peff.net>
    Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    aa32939f