1. 12 Nov, 2018 1 commit
  2. 29 Aug, 2018 1 commit
  3. 26 Apr, 2018 1 commit
  4. 26 Mar, 2018 1 commit
  5. 14 Mar, 2018 1 commit
  6. 14 Feb, 2018 1 commit
  7. 02 Feb, 2018 1 commit
  8. 23 Aug, 2017 1 commit
  9. 08 May, 2017 1 commit
    • Convert the verify_pack callback to struct object_id · 9fd75046
      brian m. carlson authored
      Make the verify_pack_callback take a pointer to struct object_id.
      Change the pack checksum to use GIT_MAX_RAWSZ, even though it is not
      strictly an object ID.  Doing so ensures resilience against future hash
      size changes, and allows us to remove hard-coded assumptions about how
      big the buffer needs to be.
      Also, use a union to convert the pointer from nth_packed_object_sha1 to
      a pointer to struct object_id.  This behavior is compatible with GCC
      and clang and explicitly sanctioned by C11.  The alternatives are to
      just perform a cast, which would run afoul of strict aliasing rules but
      should just work, or to change the pointer into an instance of struct
      object_id and copy the value.  The latter operation could seriously
      bloat memory usage on fsck, which already uses a lot of memory on some
      repositories.
      Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  10. 29 Sep, 2016 1 commit
  11. 22 Sep, 2016 1 commit
    • verify_packfile: check pack validity before accessing data · a9445d85
      Jeff King authored
      The verify_packfile() function does not explicitly open the packfile;
      instead, it starts with a sha1 checksum over the whole pack,
      and relies on use_pack() to open the packfile as a side effect.
      If the pack cannot be opened for whatever reason (either
      because its header information is corrupted, or perhaps
      because a simultaneous repack deleted it), then use_pack()
      will die(), as it has no way to return an error. This is not
      ideal, as verify_packfile() otherwise tries to gently return
      an error (this lets programs like git-fsck go on to check
      other packs).
      Instead, let's check is_pack_valid() up front, and return an
      error if it fails. This will open the pack as a side effect,
      and then use_pack() will later rely on our cached
      descriptor, and avoid calling die().
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  12. 13 Jul, 2016 1 commit
    • fsck: use streaming interface for large blobs in pack · ec9d2249
      Duy Nguyen authored
      For blobs, we want to make sure the on-disk data is not corrupted
      (i.e. can be inflated and produce the expected SHA-1). Blob content is
      opaque, there's nothing else inside to check for.
      For really large blobs, we may want to avoid unpacking the entire blob
      in memory, just to check whether it produces the same SHA-1. On 32-bit
      systems, we may not have enough virtual address space for such memory
      allocation. And even on 64-bit where it's not a problem, allocating a
      lot more memory could result in kicking other parts of systems to swap
      file, generating lots of I/O and slowing everything down.
      For this particular operation, not unpacking the blob and letting
      check_sha1_signature(), which supports the streaming interface, do the
      job is sufficient.  check_sha1_signature() is not shown in the diff,
      unfortunately, but it will be called when "data_valid && !data" is true.
      We will call the callback function "fn" with NULL as "data". The only
      callback of this function is fsck_obj_buffer(), which does not touch
      "data" at all if it's a blob.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  13. 22 Feb, 2016 1 commit
  14. 01 Dec, 2015 1 commit
    • verify_pack: do not ignore return value of verification function · 8c24d832
      David Turner authored
      In verify_pack, a caller-supplied verification function is called.
      The function returns an int.  If that return value is non-zero,
      verify_pack should fail.
      The only caller of verify_pack is in builtin/fsck.c, whose verify_fn
      returns a meaningful error code (which was then ignored).  Now, fsck
      might return a different error code (with more detail).  This would
      happen in the unlikely event that a commit or tree that is a valid git
      object but not a valid instance of its type gets into a pack.
      Signed-off-by: David Turner <dturner@twopensource.com>
      Signed-off-by: Jeff King <peff@peff.net>
  15. 07 Nov, 2011 3 commits
    • fsck: print progress · 1e49f22f
      Duy Nguyen authored
      fsck is usually a long process and it would be nice if it printed
      progress from time to time.
      The progress meter is not printed when --verbose is given: --verbose
      already prints a lot, so there is no need for an "alive" indicator.
      A progress meter could provide "% complete" information, but that
      would be lost anyway in the flood of text.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • fsck: avoid reading every object twice · c9486eb0
      Duy Nguyen authored
      During verify_pack() all objects are read for the SHA-1 check. Then
      fsck_sha1() is called on every object, which reads the object again
      (fsck_sha1 -> parse_object -> read_sha1_file).
      Avoid reading each object twice by doing fsck_sha1 while we still have
      the object's uncompressed data in verify_pack.
      On git.git, with this patch I got:
      $ /usr/bin/time ./git fsck >/dev/null
      98.97user 0.90system 1:40.01elapsed 99%CPU (0avgtext+0avgdata 616624maxresident)k
      0inputs+0outputs (0major+194186minor)pagefaults 0swaps
      Without it:
      $ /usr/bin/time ./git fsck >/dev/null
      231.23user 2.35system 3:53.82elapsed 99%CPU (0avgtext+0avgdata 636688maxresident)k
      0inputs+0outputs (0major+461629minor)pagefaults 0swaps
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • verify_packfile(): check as many object as possible in a pack · 47393518
      Duy Nguyen authored
      verify_packfile() checks for whole-pack integrity first, then each
      object individually.  Once we get past the whole-pack check, we can
      identify all objects in the pack.  If there's an error with one object,
      we should continue to check the next objects to salvage as many
      objects as possible instead of stopping the process.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  16. 10 Jun, 2011 1 commit
    • zlib: zlib can only process 4GB at a time · ef49a7a0
      Junio C Hamano authored
      The size of objects we read from the repository and data we try to put
      into the repository are represented in "unsigned long", so that on larger
      architectures we can handle objects that weigh more than 4GB.
      But the interface defined in zlib.h to communicate with inflate/deflate
      limits avail_in (how many bytes of input are we calling zlib with) and
      avail_out (how many bytes of output from zlib are we ready to accept)
      fields effectively to 4GB by defining their type to be uInt.
      In many places in our code, we allocate a large buffer (e.g. mmap'ing a
      large loose object file) and tell zlib its size by assigning the size to
      avail_in field of the stream, but that will truncate the high octets of
      the real size. The worst part of this story is that we often pass around
      z_stream (the state object used by zlib) to keep track of the number of
      used bytes in input/output buffer by inspecting these two fields, which
      practically limits our callchain to the same 4GB limit.
      Wrap z_stream in another structure git_zstream that can express avail_in
      and avail_out in unsigned long. For now, just die() when the caller gives
      a size that cannot be given to a single zlib call. In later patches in the
      series, we would make git_inflate() and git_deflate() internally loop to
      give callers an illusion that our "improved" version of zlib interface can
      operate on a buffer larger than 4GB in one go.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  17. 03 Apr, 2011 1 commit
  18. 16 Mar, 2011 1 commit
    • standardize brace placement in struct definitions · 9cba13ca
      Jonathan Nieder authored
      In struct definitions, unlike functions, the prevailing style is for
      the opening brace to go on the same line as the struct name, like so:
       struct foo {
      	int bar;
      	char *baz;
       };
      Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
      matches as 'struct [a-z_]*$'.
      Linus sayeth:
       Heretic people all over the world have claimed that this inconsistency
       is ...  well ...  inconsistent, but all right-thinking people know that
       (a) K&R are _right_ and (b) K&R are right.
      Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  19. 22 Aug, 2010 1 commit
  20. 20 Apr, 2010 1 commit
  21. 06 Jun, 2009 1 commit
  22. 03 Oct, 2008 1 commit
    • fix openssl headers conflicting with custom SHA1 implementations · 9126f009
      Nicolas Pitre authored
      On ARM I have the following compilation errors:
          CC fast-import.o
      In file included from cache.h:8,
                       from builtin.h:6,
                       from fast-import.c:142:
      arm/sha1.h:14: error: conflicting types for 'SHA_CTX'
      /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here
      arm/sha1.h:16: error: conflicting types for 'SHA1_Init'
      /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here
      arm/sha1.h:17: error: conflicting types for 'SHA1_Update'
      /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here
      arm/sha1.h:18: error: conflicting types for 'SHA1_Final'
      /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here
      make: *** [fast-import.o] Error 1
      This is because openssl header files are always included in
      git-compat-util.h since commit 684ec6c6 whenever NO_OPENSSL is not
      set, which somehow brings in <openssl/sha1.h> clashing with the custom
      ARM version.  Compilation of git is probably broken on PPC too for the
      same reason.
      Turns out that the only file requiring openssl/ssl.h and openssl/err.h
      is imap-send.c.  But only moving those problematic includes there
      doesn't solve the issue as it also includes cache.h which brings in the
      conflicting local SHA1 header file.
      As suggested by Jeff King, the best solution is to rename our references
      to SHA1 functions and structure to something git specific, and define those
      according to the implementation used.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
  23. 25 Jun, 2008 3 commits
  24. 24 Jun, 2008 1 commit
  25. 02 Jun, 2008 1 commit
    • make verify-pack a bit more useful with bad packs · 62413604
      Nicolas Pitre authored
      When a pack gets corrupted, its SHA1 checksum will fail.  However, it
      is more useful to let the test go on and find the actual problem
      location than to only complain about the SHA1 mismatch and bail out.
      Also, it is more useful to compare the stored pack SHA1 with the one in
      the index file instead of the computed SHA1 since the computed SHA1
      from a corrupted pack won't match the one stored in the index either.
      Finally a few code and message cleanups were thrown in as a bonus.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  26. 01 Mar, 2008 2 commits
  27. 06 Jun, 2007 1 commit
  28. 27 May, 2007 1 commit
    • Lazily open pack index files on demand · d079837e
      Shawn O. Pearce authored
      In some repository configurations the user may have many packfiles,
      but all of the recent commits/trees/tags/blobs are likely to
      be in the most recent packfile (the one with the newest mtime).
      It is therefore common to be able to complete an entire operation
      by accessing only one packfile, even if there are 25 packfiles
      available to the repository.
      Rather than opening and mmaping the corresponding .idx file for
      every pack found, we now only open and map the .idx when we suspect
      there might be an object of interest in there.
      Of course we cannot know in advance which packfile contains an
      object, so we still need to scan the entire packed_git list to
      locate anything.  But odds are users want to access objects in the
      most recently created packfiles first, and that may be all they
      ever need for the current operation.
      Junio observed in b867092f that placing recent packfiles before
      older ones can slightly improve access times for recent objects,
      without degrading it for historical object access.
      This change improves upon Junio's observations by trying even harder
      to avoid the .idx files that we won't need.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  29. 26 May, 2007 1 commit
  30. 10 Apr, 2007 1 commit
    • get rid of num_packed_objects() · 57059091
      Nicolas Pitre authored
      The coming index format change doesn't allow for the number of objects
      to be determined from the size of the index file directly.  Instead, let's
      initialize a field in the packed_git structure with the object count when
      the index is validated, since the count is always known at that point.
      While at it, let's reorder some struct packed_git fields to avoid padding
      caused by the 64-bit alignment some of them need.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  31. 05 Apr, 2007 1 commit
  32. 17 Mar, 2007 1 commit
    • [PATCH] clean up pack index handling a bit · 42873078
      Nicolas Pitre authored
      Especially with the new index format to come, it is more appropriate
      to encapsulate more into check_packed_git_idx() and assume less of the
      index format in struct packed_git.
      To that effect, the index_base is renamed to index_data with void * type
      so it is not used directly but through other pointers initialized from
      it.  This allows for a couple of pointer cast removals, as well as
      providing a better generic name to grep for when adding support for new
      index versions or formats.  And index_data is declared const too, while
      at it.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  33. 07 Mar, 2007 2 commits
    • Use off_t when we really mean a file offset. · c4001d92
      Shawn O. Pearce authored
      Not all platforms have declared 'unsigned long' to be a 64 bit value,
      but we want to support a 64 bit packfile (or close enough anyway)
      in the near future as some projects are getting large enough that
      their packed size exceeds 4 GiB.
      By using off_t, the POSIX type that is declared to mean an offset
      within a file, we support whatever maximum file size the underlying
      operating system will handle.  For most modern systems this is up
      around 2^60 or higher.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • Use uint32_t for all packed object counts. · 326bf396
      Shawn O. Pearce authored
      As we permit up to 2^32-1 objects in a single packfile we cannot
      use a signed int to represent the object offset within a packfile,
      after 2^31-1 objects we will start seeing negative indexes and
      error out or compute bad addresses within the mmap'd index.
      This is a minor cleanup that does not introduce any significant
      logic changes.  It is roach free.
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  34. 27 Feb, 2007 1 commit
    • convert object type handling from a string to a number · 21666f1a
      Nicolas Pitre authored
      We currently have two parallel notations for dealing with object types
      in the code: a string and a numerical value.  One of them is obviously
      redundant, and the most used one requires more stack space and a bunch
      of strcmp() calls all over the place.
      This is an initial step for the removal of the version using a char
      array found in object reading code paths.  The patch is unfortunately
      large, but there is no sane way to split it into smaller parts without
      breaking the whole.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>