This project is mirrored from https://github.com/git/git.
  1. 18 Sep, 2014 1 commit
  2. 07 Jul, 2014 1 commit
  3. 02 Jun, 2014 1 commit
    • pack-objects: use free()+xcalloc() instead of xrealloc()+memset() · fb799474
      René Scharfe authored
      Whenever the hash table becomes too small, its size is increased,
      the original part (and the added space) is zeroed out using memset(),
      and the table is rebuilt from scratch.
      
      Simplify this process by returning the old memory using free() and
      allocating the new buffer using xcalloc(), which already clears the
      buffer for us.  That way we avoid copying the old hash table contents
      needlessly inside xrealloc().
      
      While at it, use the first array member with sizeof instead of a
      specific type.  The old code used uint32_t and int, while index is
      actually an array of int32_t.  Their sizes are the same basically
      everywhere, so it's not actually a problem, but the new code is
      cleaner and doesn't have to be touched should the type be changed.
      Signed-off-by: Rene Scharfe <l.s.r@web.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
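
The idea in fb799474 can be sketched as follows. This is a toy model, not the code from the patch: `toy_table`, `toy_xcalloc()`, `toy_rehash()` and `toy_find()` are hypothetical names, and linear probing is assumed only to keep the example short. The point it illustrates is that the old index is thrown away with free() and a zeroed one is obtained in one step, so nothing is copied before the rebuild, and the element size is taken from `index[0]` rather than from a named type.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for an allocator that dies instead of returning NULL. */
static void *toy_xcalloc(size_t nmemb, size_t size)
{
	void *p = calloc(nmemb, size);
	if (!p) {
		fprintf(stderr, "out of memory\n");
		exit(1);
	}
	return p;
}

/* Hypothetical table: the entries live in keys[], the index only refers to them. */
struct toy_table {
	int32_t *index;      /* slot -> entry number + 1; 0 means "empty" */
	uint32_t index_size; /* assumed to be a power of two */
	uint32_t *keys;
	uint32_t nr;
};

static void toy_rehash(struct toy_table *t, uint32_t new_size)
{
	uint32_t i;

	assert(new_size && !(new_size & (new_size - 1)));
	assert(new_size > t->nr);

	/* Drop the old index and get a zeroed one; no memset(), no copy. */
	free(t->index);
	t->index = toy_xcalloc(new_size, sizeof(t->index[0]));
	t->index_size = new_size;

	/* Rebuild from scratch out of the entries themselves. */
	for (i = 0; i < t->nr; i++) {
		uint32_t slot = t->keys[i] & (new_size - 1);
		while (t->index[slot])
			slot = (slot + 1) & (new_size - 1);
		t->index[slot] = (int32_t)(i + 1);
	}
}

/* Returns the entry number for key, or -1 if it is not in the table. */
static int toy_find(const struct toy_table *t, uint32_t key)
{
	uint32_t slot = key & (t->index_size - 1);

	while (t->index[slot]) {
		uint32_t i = (uint32_t)t->index[slot] - 1;
		if (t->keys[i] == key)
			return (int)i;
		slot = (slot + 1) & (t->index_size - 1);
	}
	return -1;
}
```

Because the rebuild reads only from the entry array, the old index contents are never needed, which is why copying them inside a realloc-style call would be wasted work.
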
  4. 24 Oct, 2013 1 commit
    • pack-objects: refactor the packing list · 2834bc27
      Vicent Marti authored
      The hash table that stores the packing list for a given `pack-objects`
      run was tightly coupled to the pack-objects code.
      
      In this commit, we refactor the hash table and the underlying storage
      array into a `packing_data` struct. The functionality for accessing and
      adding entries to the packing list is hence accessible from other parts
      of Git besides the `pack-objects` builtin.
      
      This refactoring is a requirement for further patches in this series
      that will require accessing the commit packing list from outside of
      `pack-objects`.
      
      The hash table implementation has been minimally altered: we now
      use table sizes which are always a power of two, to ensure a uniform
      index distribution in the array.
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
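
A small sketch of what "table sizes which are always a power of two" buys, under stated assumptions: `toy_closest_pow2()` and `toy_slot()` are hypothetical helpers, and the rounding uses a common bit-smearing trick rather than anything quoted from the commit. With a power-of-two size, a slot can be picked with a mask instead of a modulo.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (v itself if it already is one). */
static uint32_t toy_closest_pow2(uint32_t v)
{
	assert(v > 0 && v <= 0x80000000u);
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/* For a power-of-two table size, masking is equivalent to hash % table_size. */
static uint32_t toy_slot(uint32_t hash, uint32_t table_size)
{
	assert(table_size && !(table_size & (table_size - 1)));
	return hash & (table_size - 1);
}
```
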
  5. 04 Aug, 2006 1 commit
  6. 25 Jul, 2006 1 commit
  7. 24 Jul, 2006 1 commit
  8. 10 Jul, 2006 1 commit
  9. 01 Jul, 2006 1 commit
  10. 30 Jun, 2006 1 commit
  11. 29 Jun, 2006 1 commit
  12. 21 Jun, 2006 1 commit
  13. 20 Jun, 2006 1 commit
  14. 06 Jun, 2006 1 commit
    • pack-objects: improve path grouping heuristics. · ce0bd642
      Linus Torvalds authored
      This trivial patch not only simplifies the name hashing, it actually
      improves packing for both git and the kernel.
      
      The git archive pack shrinks from 6824090->6622627 bytes (a 3%
      improvement), and the kernel pack shrinks from 108756213 to 108219021 (a
      mere 0.5% improvement, but still, it's an improvement from making the
      hashing much simpler!)
      
      We just create a 32-bit hash, where we "age" previous characters by two
      bits, so the last characters in a filename count most. So when we then
      compare the hashes in the sort routine, filenames that end the same way
      sort the same way.
      
      It takes the subdirectory into account (unless the filename is > 16
      characters), but files with the same name within the same subdirectory
      will obviously sort closer than files in different subdirectories.
      
      And, incidentally (which is why I tried the hash change in the first
      place, of course) builtin-rev-list.c will sort fairly close to rev-list.c.
      
      And no, it's not a "good hash" in the sense of being secure or unique, but
      that's not what we're looking for. The whole "hash" thing is misnamed
      here. It's not so much a hash as a "sorting number".
      
      [jc: rolled in a simplification of the sorting number
       computation for thin pack base objects]
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
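
The "aging by two bits" description above can be sketched like this. It is a simplified reading of the commit message, not the patch itself: `toy_name_hash()` is a hypothetical name, and any special handling of particular characters is left out. Each new character lands in the top byte while everything seen earlier is shifted right by two bits, so after roughly 16 characters an old character has been shifted out of the 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * "Sorting number" for a path: later characters weigh more.
 * For ASCII input the sum cannot overflow; for other bytes the
 * unsigned arithmetic simply wraps.
 */
static uint32_t toy_name_hash(const char *name)
{
	uint32_t hash = 0;
	unsigned char c;

	assert(name);
	while ((c = (unsigned char)*name++) != 0)
		hash = (hash >> 2) + ((uint32_t)c << 24);
	return hash;
}
```

Two names that differ only in their last character end up far apart, while names that share a long suffix (such as rev-list.c and builtin-rev-list.c) differ only in the low bits and so sort next to each other.
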
  15. 31 May, 2006 1 commit
    • tree_entry(): new tree-walking helper function · 4c068a98
      Linus Torvalds authored
      This adds a "tree_entry()" function that combines the common operation of
      doing a "tree_entry_extract()" + "update_tree_entry()".
      
      It also has a simplified calling convention, designed for simple loops
      that traverse over a whole tree: the arguments are pointers to the tree
      descriptor and a name_entry structure to fill in, and it returns a boolean
      "true" if there was an entry left to be gotten in the tree.
      
      This allows tree traversal with
      
      	struct tree_desc desc;
      	struct name_entry entry;
      
      	desc.buf = tree->buffer;
      	desc.size = tree->size;
      	while (tree_entry(&desc, &entry)) {
      		... use "entry.{path, sha1, mode, pathlen}" ...
      	}
      
      which is not only shorter than writing it out in full, it's hopefully less
      error prone too.
      
      [ It's actually a tad faster too - we don't need to recalculate the entry
        pathlength in both extract and update, but need to do it only once.
        Also, some callers can avoid doing a "strlen()" on the result, since
        it's returned as part of the name_entry structure.
      
        However, by now we're talking just 1% speedup on "git-rev-list --objects
        --all", and we're definitely at the point where tree walking is no
        longer the issue any more. ]
      
      NOTE! Not everybody wants to use this new helper function, since some of
      the tree walkers very much on purpose do the descriptor update separately
      from the entry extraction. So the "extract + update" sequence still
      remains as the core sequence, this is just a simplified interface.
      
      We should probably add a silly two-line inline helper function for
      initializing the descriptor from the "struct tree" too, just to cut down
      on the noise from that common "desc" initializer.
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
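
To make the "extract + update in one call" convention concrete, here is a self-contained toy model. It assumes the on-disk tree entry layout of an octal mode, a space, the path, a NUL byte and a 20-byte SHA-1; the names `toy_desc`, `toy_entry` and `toy_tree_entry()` are hypothetical and the error handling is deliberately minimal. The path length is computed once and handed back to the caller, which is the saving the commit message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_desc {
	const unsigned char *buf;
	size_t size;
};

struct toy_entry {
	const char *path;
	const unsigned char *sha1;
	unsigned int mode;
	size_t pathlen;
};

/* Fill in *entry and advance *desc; return 0 when nothing is left. */
static int toy_tree_entry(struct toy_desc *desc, struct toy_entry *entry)
{
	const unsigned char *p = desc->buf;
	const unsigned char *end = p + desc->size;
	const unsigned char *nul;
	unsigned int mode = 0;
	size_t used;

	if (!desc->size)
		return 0;

	/* "extract": parse the octal mode, then locate the path and SHA-1 */
	while (p < end && *p >= '0' && *p <= '7')
		mode = (mode << 3) + (unsigned int)(*p++ - '0');
	if (p == end || *p != ' ')
		return 0; /* malformed entry: stop the walk */
	p++;
	nul = memchr(p, '\0', (size_t)(end - p));
	if (!nul || (size_t)(end - nul) < 1 + 20)
		return 0;

	entry->mode = mode;
	entry->path = (const char *)p;
	entry->pathlen = (size_t)(nul - p);
	entry->sha1 = nul + 1;

	/* "update": step the descriptor past this entry */
	used = (size_t)(nul + 1 + 20 - desc->buf);
	desc->buf += used;
	desc->size -= used;
	return 1;
}
```
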
  16. 16 May, 2006 1 commit
  17. 15 May, 2006 3 commits
    • Fix pack-index issue on 64-bit platforms a bit more portably. · 1b9bc5a7
      Junio C Hamano authored
      Apparently <stdint.h> is not enough for uint32_t on OpenBSD; use
      "unsigned int" -- hopefully that would stay 32-bit on every
      platform we care about, at least until we update the pack-index
      file format.
      
      Our sha1 routines optimized for specific architectures use uint32_t and
      expect '#include <stdint.h>' to be enough, so OpenBSD on arm or
      ppc might have similar issues down the road, I dunno.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • pack-object: slightly more efficient · ff45715c
      Nicolas Pitre authored
      Avoid creating a delta index for objects with maximum depth since they
      are not going to be used as delta base anyway.  This also reduces peak
      memory usage slightly as the current object's delta index is not useful
      until the next object in the loop is considered for deltification. This
      saves a bit more than 1% on CPU usage.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • simple heuristic for further free packing improvements · 4e8da195
      Nicolas Pitre authored
      Given that the early eviction of objects with maximum delta depth
      may exhibit bad packing on its own, why not consider a bias against
      deep base objects in try_delta() to mitigate that bad behavior?
      
      This patch adjusts the maximum size allowed for a delta based on the
      depth of the base object, and also enables the early eviction of max
      depth objects from the object window.  When used separately, those two
      things produce slightly better and much worse results respectively.  But
      their combined effect is a surprisingly significant packing improvement.
      
      With this really simple patch the GIT repo gets nearly 15% smaller, and
      the Linux kernel repo about 5% smaller, with no significantly measurable
      CPU usage difference.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
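
The two ingredients described in 4e8da195 can be illustrated with a short sketch. The exact formula is an assumption made for illustration, not a quotation from the patch: `toy_max_delta_size()` and `toy_can_be_base()` are hypothetical names, and the "half the target size minus a small constant, divided by depth + 1" budget is simply one plausible way to bias against deep bases.

```c
#include <assert.h>

/*
 * Largest delta we would accept for a target of trg_size bytes when the
 * candidate base already sits base_depth levels deep in a delta chain.
 * Deeper bases get a smaller budget (assumed formula, see lead-in).
 */
static unsigned long toy_max_delta_size(unsigned long trg_size,
					unsigned int base_depth)
{
	unsigned long budget;

	if (trg_size < 50)
		return 0; /* too small to be worth deltifying */
	budget = trg_size / 2 - 20;
	return budget / (base_depth + 1UL);
}

/* Early eviction: an object at maximum depth cannot serve as a base. */
static int toy_can_be_base(unsigned int depth, unsigned int max_depth)
{
	assert(depth <= max_depth);
	return depth < max_depth;
}
```

With these numbers a 1000-byte target may use a 480-byte delta against a depth-0 base but only a 120-byte delta against a depth-3 base, so shallow bases win unless a deep base compresses much better.
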
  18. 14 May, 2006 1 commit
  19. 13 May, 2006 1 commit
    • Fix git-pack-objects for 64-bit platforms · 66561f5a
      Dennis Stosberg authored
      The offset of an object in the pack is recorded as a 4-byte integer
      in the index file.  When reading the offset from the mmap'ed index
      in prepare_pack_revindex(), the address is dereferenced as a long*.
      This works fine as long as the long type is four bytes wide.  On
      NetBSD/sparc64, however, a long is 8 bytes wide and so dereferencing
      the offset produces garbage.
      
      [jc: taking suggestion by Linus to use uint32_t]
      Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
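
One way to sidestep the width problem entirely is to assemble the value from individual bytes instead of dereferencing a wider integer pointer. The sketch below assumes the offset is stored as four bytes in network (big-endian) byte order at the start of a 24-byte record, followed by a 20-byte object name; `toy_read_be32()`, `toy_entry_offset()` and `TOY_RECORD_SIZE` are hypothetical names, and this is not the fix the commit applied (which switched the pointer type).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value; works for any alignment and any sizeof(long). */
static uint32_t toy_read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) |
	       (uint32_t)p[3];
}

/* Assumed record layout: 4-byte offset, then a 20-byte object name. */
#define TOY_RECORD_SIZE 24

static uint32_t toy_entry_offset(const unsigned char *records, size_t n)
{
	assert(records);
	return toy_read_be32(records + n * TOY_RECORD_SIZE);
}
```
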
  20. 05 May, 2006 1 commit
    • pack-object: squelch eye-candy on non-tty · 86118bcb
      Junio C Hamano authored
      One of my post-update scripts runs a git-fetch into a separate
      repository and sends the results back to me (2>&1); I end up
      getting this in the mail:
      
          Generating pack...
          Done counting 180 objects.
          Result has 131 objects.
          Deltifying 131 objects.
             0% (0/131) done^M   1% (2/131) done^M...
      
      By default, this skips the progress report when not on a tty.
      
      You could give --progress to force the progress report, but
      let's not bother documenting it or mentioning it in the
      usage string.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
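
The behaviour described in 86118bcb amounts to a small decision, sketched here with the POSIX isatty() call. `toy_want_progress()` and `toy_report()` are hypothetical names, and which file descriptor the real code checks is not stated in the message above, so the descriptor is passed in as a parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Show progress if it was forced, or if the output goes to a terminal. */
static int toy_want_progress(int force_progress, int fd)
{
	return force_progress || isatty(fd);
}

/* Carriage-return style progress line, only emitted when enabled. */
static void toy_report(int progress, unsigned int done, unsigned int total)
{
	if (!progress || !total)
		return;
	fprintf(stderr, "%4u%% (%u/%u) done\r", done * 100 / total, done, total);
}
```

When the output is a pipe or a file, as in the mailed post-update log quoted above, isatty() returns 0 and the `^M`-separated percentages never reach the recipient unless progress is forced.
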
  21. 28 Apr, 2006 1 commit
  22. 27 Apr, 2006 1 commit
  23. 21 Apr, 2006 2 commits
  24. 17 Apr, 2006 1 commit
  25. 07 Apr, 2006 1 commit
    • Thin pack generation: optimization. · 5379a5c5
      Junio C Hamano authored
      Jens Axboe noticed that recent "git push" has become very slow
      since we made --thin transfer the default.
      
      Thin pack generation to push a handful of revisions that touch a
      relatively small number of paths out of a huge tree was stupid; it
      registered _everything_ from the excluded revisions.  As a
      result, the "Counting objects" phase was unnecessarily expensive.
      
      This changes the logic to register the blobs and trees from
      excluded revisions only for paths we are actually going to send
      to the other end.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  26. 04 Apr, 2006 2 commits
  27. 02 Apr, 2006 2 commits
  28. 30 Mar, 2006 1 commit
    • tree/diff header cleanup. · 1b0c7174
      Junio C Hamano authored
      Introduce tree-walk.[ch] and move "struct tree_desc" and
      associated functions from various places.
      
      Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
      move it to cache.h.  This macro returns the canonicalized
      st_mode value in the host byte order for files, symlinks and
      directories -- to be compared with a tree_desc entry.
      create_ce_mode(mode) in cache.h is similar but is intended to be
      used for index entries (so it does not work for directories) and
      returns the value in the network byte order.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  29. 06 Mar, 2006 1 commit
    • pack-objects: simplify "thin" pack. · 70ca1a3f
      Junio C Hamano authored
      There was misguided logic that overly preferred using objects that
      we are not going to pack as the base object.  This was
      unnecessary.  It does not matter to the unpacking side where the
      base object is -- it matters more to make the resulting delta
      smaller.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  30. 02 Mar, 2006 2 commits
    • diff-delta: allow reusing of the reference buffer index · 38fd0721
      Nicolas Pitre authored
      When a reference buffer is used multiple times then its index can be
      computed only once and reused multiple times.  This patch adds an extra
      pointer to a pointer argument (from_index) to diff_delta() for this.
      
      If from_index is NULL then everything is like before.
      
      If from_index is non NULL and *from_index is NULL then the index is
      created and its location stored to *from_index.  In this case the caller
      has the responsibility to free the memory pointed to by *from_index.
      
      If from_index and *from_index are non NULL then the index is reused as
      is.
      
      This currently saves about 10% of CPU time to repack the git archive.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
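
The three from_index cases listed in 38fd0721 form a small ownership protocol, which the following toy model walks through. Nothing here is the real diff_delta() signature: `toy_diff()`, `toy_build_index()` and `struct toy_index` are hypothetical, and the "index" is just a byte sum so that the example stays short. A build counter makes the reuse visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a delta index: here just a cached checksum. */
struct toy_index {
	unsigned long sum;
};

static int toy_index_builds; /* counts how often an index was computed */

static struct toy_index *toy_build_index(const unsigned char *ref, size_t len)
{
	struct toy_index *idx = malloc(sizeof(*idx));
	size_t i;

	if (!idx)
		abort();
	idx->sum = 0;
	for (i = 0; i < len; i++)
		idx->sum += ref[i];
	toy_index_builds++;
	return idx;
}

static unsigned long toy_diff(const unsigned char *ref, size_t len,
			      unsigned char target, void **from_index)
{
	struct toy_index *idx;
	unsigned long result;

	assert(ref || !len);
	if (from_index && *from_index) {
		/* Case 3: both non-NULL, reuse the index as is. */
		idx = *from_index;
	} else {
		idx = toy_build_index(ref, len);
		/* Case 2: hand the new index to the caller, who must free it. */
		if (from_index)
			*from_index = idx;
	}
	result = idx->sum + target;
	/* Case 1: no cache slot given, so behave as before and clean up. */
	if (!from_index)
		free(idx);
	return result;
}
```

A caller that compares many targets against the same reference buffer passes the address of a pointer initialised to NULL, and frees that pointer once it moves on to another reference.
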
    • Re-fix compilation warnings. · 2b74cffa
      Luck, Tony authored
      Commit 8fcf1ad9 has a combination of double cast and Andreas'
      switch to using unsigned long ... just the latter is sufficient
      (and a lot less ugly than using the double cast).
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  31. 26 Feb, 2006 1 commit
  32. 25 Feb, 2006 1 commit
    • fix warning from pack-objects.c · 8fcf1ad9
      Luck, Tony authored
      When compiling on ia64 I get this warning (from gcc 3.4.3):
      
      gcc -o pack-objects.o -c -g -O2 -Wall -DSHA1_HEADER='<openssl/sha.h>'  pack-objects.c
      pack-objects.c: In function `pack_revindex_ix':
      pack-objects.c:94: warning: cast from pointer to integer of different size
      
      A double cast (first to long, then to int) shuts gcc up, but is there
      a better way?
      
      [jc: Andreas Ericsson suggests using ulong instead. ]
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
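
The warning arises when a pointer is cast straight to a narrower integer type. A sketch of the single-cast approach mentioned in the note above, assuming the pointer is only being hashed into a bucket number: `toy_ptr_bucket()` is a hypothetical name and the bit mixing is illustrative. On platforms where `unsigned long` is narrower than a pointer, C99's `uintptr_t` from `<stdint.h>` would be the safer choice.

```c
#include <assert.h>

/* Map a pointer to a bucket in [0, hashsz) without a pointer-to-int cast. */
static unsigned int toy_ptr_bucket(const void *p, unsigned int hashsz)
{
	unsigned long ui = (unsigned long)p; /* same width as a pointer on LP64 */

	assert(hashsz > 0);
	ui ^= ui >> 16; /* fold some higher bits into the low ones */
	return (unsigned int)(ui % hashsz);
}
```
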
  33. 24 Feb, 2006 2 commits
    • pack-objects: hash basename and dirname a bit differently. · eeef7135
      Junio C Hamano authored
      ...so that "Makefile"s from different revs are sorted together,
      separate from "t/Makefile"s, but close enough.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • pack-objects: allow "thin" packs to exceed depth limits · b76f6b62
      Junio C Hamano authored
      When creating a new pack to be used in .git/objects/pack/
      directory, we carefully count the depth of deltified objects to
      be reused, so that the generated pack does not exceed the
      specified depth limit for runtime efficiency.  However, when we
      are generating a thin pack that does not contain base objects,
      such a pack can only be used during network transfer that is
      expanded on the other end upon reception, so being careful and
      artificially cutting the delta chain does not buy us anything
      except increased bandwidth requirement.  This patch disables the
      delta chain depth limit check when reusing an existing delta.
      Signed-off-by: Junio C Hamano <junkio@cox.net>