    ewah: compressed bitmap implementation · e1273106
    Vicent Marti authored and Junio C Hamano committed
    EWAH is a word-aligned compressed variant of a bitset (i.e. a data
    structure that acts as a 0-indexed boolean array for many entries).
    
    It uses a 64-bit run-length encoding (RLE) compression scheme,
    trading some compression for better processing speed.
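
    As a rough illustration of the word-aligned idea (a simplified
    sketch, not the code added by this patch), the following counts how
    many 64-bit words such an encoding would need. The helper names are
    hypothetical, and the sketch assumes a marker word can describe a
    run and a literal batch of any length, ignoring the fixed-width
    fields that bound both in a real format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a "clean" word is all zeros or all ones. */
static int is_clean_word(uint64_t w)
{
	return w == 0 || w == ~(uint64_t)0;
}

/*
 * Hypothetical helper: number of 64-bit words a simplified
 * word-aligned RLE needs for `n` uncompressed words. Each marker
 * word describes one run of identical clean words, followed by a
 * batch of literal words that are stored verbatim.
 */
static size_t rle_compressed_words(const uint64_t *words, size_t n)
{
	size_t i = 0, out = 0;

	assert(words != NULL || n == 0);

	while (i < n) {
		out++; /* one marker word per (run, literals) group */

		/* Absorb a run of identical clean words into the marker. */
		if (is_clean_word(words[i])) {
			uint64_t run = words[i];
			while (i < n && words[i] == run)
				i++;
		}

		/* Mixed words cannot be run-length encoded; copy them. */
		while (i < n && !is_clean_word(words[i])) {
			out++;
			i++;
		}
	}
	return out;
}
```

    Because runs and literals are whole 64-bit words, a reader can skip
    or combine them one word at a time without unpacking individual
    bits, which is where the processing-speed benefit comes from.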
    
    The goal of this word-aligned implementation is not to achieve
    the best compression, but rather to improve query processing time.
    As it stands right now, this EWAH implementation will always be more
    efficient storage-wise than its uncompressed alternative.
    
    EWAH arrays will be used as the on-disk format to store reachability
    bitmaps for all objects in a repository while keeping reasonable sizes,
    in the same way that JGit does.
    
    This EWAH implementation is a mostly straightforward port of the
    original `javaewah` library that JGit currently uses. The library is
    self-contained and has been embedded whole (4 files) inside the `ewah`
    folder to ease redistribution.
    
    The library is re-licensed under the GPLv2 with the permission of Daniel
    Lemire, the original author. The source code for the C version can
    be found on GitHub:
    
    	https://github.com/vmg/libewok
    
    The original Java implementation can also be found on GitHub:
    
    	https://github.com/lemire/javaewah
    
    
    
    [jc: stripped debug-only code per Peff's $gmane/239768]
    
    Signed-off-by: Vicent Marti <tanoku@gmail.com>
    Signed-off-by: Jeff King <peff@peff.net>
    Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>