1. 04 Oct, 2018 2 commits
    • René Scharfe's avatar
      oidset: uninline oidset_init() · 8c84ae65
      René Scharfe authored
      There is no need to inline oidset_init(), as it's typically only called
      twice in the lifetime of an oidset (once at the beginning and at the end
      by oidset_clear()) and kh_resize_* is quite big, so move its definition
      to oidset.c.  Document it while we're at it.
      Signed-off-by: default avatarRene Scharfe <l.s.r@web.de>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      8c84ae65
    • René Scharfe's avatar
      oidset: use khash · 8b2f8cbc
      René Scharfe authored
      Reimplement oidset using khash.h in order to reduce its memory footprint
      and make it faster.
      
      Performance of a command that mainly checks for duplicate objects using
      an oidset, with master and Clang 6.0.1:
      
        $ cmd="./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'"
      
        $ /usr/bin/time $cmd >/dev/null
        0.22user 0.03system 0:00.25elapsed 99%CPU (0avgtext+0avgdata 48484maxresident)k
        0inputs+0outputs (0major+11204minor)pagefaults 0swaps
      
        $ hyperfine "$cmd"
        Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'
      
          Time (mean ± σ):     250.0 ms ±   6.0 ms    [User: 225.9 ms, System: 23.6 ms]
      
          Range (min … max):   242.0 ms … 261.1 ms
      
      And with this patch:
      
        $ /usr/bin/time $cmd >/dev/null
        0.14user 0.00system 0:00.15elapsed 100%CPU (0avgtext+0avgdata 41396maxresident)k
        0inputs+0outputs (0major+8318minor)pagefaults 0swaps
      
        $ hyperfine "$cmd"
        Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'
      
          Time (mean ± σ):     151.9 ms ±   4.9 ms    [User: 130.5 ms, System: 21.2 ms]
      
          Range (min … max):   148.2 ms … 170.4 ms
      Initial-patch-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarRene Scharfe <l.s.r@web.de>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      8b2f8cbc
  2. 22 Nov, 2017 1 commit
  3. 01 Oct, 2017 1 commit
    • Jonathan Tan's avatar
      oidmap: map with OID as key · 9e6fabde
      Jonathan Tan authored
      This is similar to using the hashmap in hashmap.c, but with an
      easier-to-use API. In particular, custom entry comparisons no longer
      need to be written, and lookups can be done without constructing a
      temporary entry structure.
      
      This is implemented as a thin wrapper over the hashmap API. In
      particular, this means that there is an additional 4-byte overhead due
      to the fact that the first 4 bytes of the hash is redundantly stored.
      For now, I'm taking the simpler approach, but if need be, we can
      reimplement oidmap without affecting the callers significantly.
      
      oidset has been updated to use oidmap.
      Signed-off-by: default avatarJonathan Tan <jonathantanmy@google.com>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      9e6fabde
  4. 30 Jun, 2017 1 commit
    • Stefan Beller's avatar
      hashmap.h: compare function has access to a data field · 7663cdc8
      Stefan Beller authored
      When using the hashmap a common need is to have access to caller provided
      data in the compare function. A couple of times we abuse the keydata field
      to pass in the data needed. This happens for example in patch-ids.c.
      
      This patch changes the function signature of the compare function
      to have one more void pointer available. The pointer given for each
      invocation of the compare function must be defined in the init function
      of the hashmap and is just passed through.
      
      Documentation of this new feature is deferred to a later patch.
      This is a rather mechanical conversion, just adding the new pass-through
      parameter.  However while at it improve the naming of the fields of all
      compare functions used by hashmaps by ensuring unused parameters are
      prefixed with 'unused_' and naming the parameters what they are (instead
      of 'unused' make it 'unused_keydata').
      Signed-off-by: Stefan Beller's avatarStefan Beller <sbeller@google.com>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      7663cdc8
  5. 08 Feb, 2017 1 commit
    • Jeff King's avatar
      add oidset API · 29c2bd5f
      Jeff King authored
      This is similar to many of our uses of sha1-array, but it
      overcomes one limitation of a sha1-array: when you are
      de-duplicating a large input with relatively few unique
      entries, sha1-array uses 20 bytes per non-unique entry.
      Whereas this set will use memory linear in the number of
      unique entries (albeit a few more than 20 bytes due to
      hashmap overhead).
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      29c2bd5f