1. 14 Feb, 2019 1 commit
    • Jeff King
      pack-objects: drop unused parameter from oe_map_new_pack() · c409d108
      Jeff King authored
      Since 43fa44fa (pack-objects: move in_pack out of struct object_entry,
      2018-04-14), we store the source pack for each object as a small index
      rather than as a pointer. When we see a new pack that has no allocated
      index, we fall back to generating an array of pointers by calling
      oe_map_new_pack().
      
      Perhaps counter-intuitively, that function does not need to actually see
      our new index-less pack. It only allocates and populates the array with
      the existing packs, after which oe_set_in_pack() actually adds the new
      pack to the array.
      
      Let's drop the unused "struct packed_git" argument to oe_map_new_pack()
      to avoid confusion.
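
      A minimal sketch of the resulting helper (paraphrasing the shape of
      the code in pack-objects.{c,h}; details may differ):

        void oe_map_new_pack(struct packing_data *pack)
        {
                uint32_t i;

                /* build the per-object pack array from the packs we already know */
                ALLOC_ARRAY(pack->in_pack, pack->nr_alloc);
                for (i = 0; i < pack->nr_objects; i++)
                        pack->in_pack[i] = oe_in_pack(pack, pack->objects + i);

                /* the small-index mapping is not used from this point on */
                FREE_AND_NULL(pack->in_pack_by_idx);
        }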
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      c409d108
  2. 28 Jan, 2019 2 commits
  3. 21 Nov, 2018 2 commits
    • Jeff King
      pack-objects: zero-initialize tree_depth/layer arrays · e159b810
      Jeff King authored
      Commit 108f5303 (pack-objects: move tree_depth into 'struct
      packing_data', 2018-08-16) started maintaining a tree_depth array that
      matches the "objects" array. We extend the array when:
      
        1. The objects array is extended, in which case we use realloc to
           extend the tree_depth array.
      
        2. A caller asks to store a tree_depth for object N, and this is the
           first such request; we create the array from scratch and store the
           value for N.
      
      In the latter case, though, we use regular xmalloc(), and the depth
      values for any object besides N are undefined. This happens to not
      trigger a bug with the current code, but the reasons are quite subtle:
      
       - we never ask about the depth for any object with index i < N. This is
         because we store the depth immediately for all trees and blobs. So
         any such "i" must be a non-tree, and therefore we will never need to
         care about its depth (in fact, we really only care about the depth of
         trees).
      
       - there are no objects at this point with index i > N, because we
         always fill in the depth for a tree immediately after its object
         entry is created (we may still allocate uninitialized depth entries,
         but they'll be initialized by packlist_alloc() when it initializes
         the entry in the "objects" array).
      
      So it works, but only by chance. To be defensive, let's zero the array,
      which matches the "unset" values which would be handed out by
      oe_tree_depth() already.
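
      In code terms the defensive change is tiny; roughly (using xcalloc as
      the zeroing allocator):

        -  pack->tree_depth = xmalloc(pack->nr_alloc * sizeof(*pack->tree_depth));
        +  pack->tree_depth = xcalloc(pack->nr_alloc, sizeof(*pack->tree_depth));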
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      e159b810
    • Jeff King
      pack-objects: fix tree_depth and layer invariants · bc35ac1a
      Jeff King authored
      Commit 108f5303 (pack-objects: move tree_depth into 'struct
      packing_data', 2018-08-16) dynamically manages a tree_depth array in
      packing_data that maintains one of these invariants:
      
        1. tree_depth is NULL (i.e., the requested options don't require us to
           track tree depths)
      
        2. tree_depth is non-NULL and has as many entries as the "objects"
           array
      
      We maintain (2) by:
      
        a. When the objects array grows, grow tree_depth to the same size
           (unless it's NULL, in which case we can leave it).
      
        b. When a caller asks to set a depth via oe_set_tree_depth(), if
           tree_depth is NULL we allocate it.
      
      But in (b), we use the number of stored objects, _not_ the allocated
      size of the objects array. So we can run into a situation like this:
      
        1. packlist_alloc() needs to store the Nth object, so it grows the
           objects array to M, where M > N.
      
        2. oe_set_tree_depth() wants to store a depth, so it allocates an
           array of length N. Now we've violated our invariant.
      
        3. packlist_alloc() needs to store the N+1th object. But it _doesn't_
           grow the objects array, since N <= M still holds. We try to assign
           to tree_depth[N+1], which is out of bounds.
      
      That doesn't happen in our test scripts, because the repositories they
      use are so small, but it's easy to trigger by running:
      
        echo HEAD | git pack-objects --revs --delta-islands --stdout >/dev/null
      
      in any reasonably-sized repo (like git.git).
      
      We can fix it by always growing the array to match pack->nr_alloc, not
      pack->nr_objects. Likewise for the "layer" array from fe0ac2fb
      (pack-objects: move 'layer' into 'struct packing_data', 2018-08-16),
      which has the same bug.
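
      A sketch of the corrected setter, sized by the allocated length rather
      than the number of stored objects (treat this as an illustration, not
      the verbatim patch):

        static inline void oe_set_tree_depth(struct packing_data *pack,
                                             struct object_entry *e,
                                             unsigned int tree_depth)
        {
                if (!pack->tree_depth)
                        /* match the allocated size of the "objects" array */
                        ALLOC_ARRAY(pack->tree_depth, pack->nr_alloc);
                pack->tree_depth[e - pack->objects] = tree_depth;
        }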
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      bc35ac1a
  4. 12 Nov, 2018 1 commit
  5. 05 Nov, 2018 1 commit
  6. 19 Oct, 2018 1 commit
  7. 21 Aug, 2018 1 commit
    • Jeff King
      pack-objects: reuse on-disk deltas for thin "have" objects · 6a1e32d5
      Jeff King authored
      When we serve a fetch, we pass the "wants" and "haves" from
      the fetch negotiation to pack-objects. That tells us not
      only which objects we need to send, but we also use the
      boundary commits as "preferred bases": their trees and blobs
      are candidates for delta bases, both for reusing on-disk
      deltas and for finding new ones.
      
      However, this misses some opportunities. Modulo some special
      cases like shallow or partial clones, we know that every
      object reachable from the "haves" could be a preferred base.
      We don't use all of them for two reasons:
      
        1. It's expensive to traverse the whole history and
           enumerate all of the objects the other side has.
      
        2. The delta search is expensive, so we want to keep the
           number of candidate bases sane. The boundary commits
           are the most likely to work.
      
      When we have reachability bitmaps, though, reason 1 no
      longer applies. We can efficiently compute the set of
      reachable objects on the other side (and in fact already did
      so as part of the bitmap set-difference to get the list of
      interesting objects). And using this set conveniently
      covers the shallow and partial cases, since we have to
      disable the use of bitmaps for those anyway.
      
      The second reason argues against using these bases in the
      search for new deltas. But there's one case where we can use
      this information for free: when we have an existing on-disk
      delta that we're considering reusing, we can do so if we
      know the other side has the base object. This in fact saves
      time during the delta search, because it's one less delta we
      have to compute.
      
      And that's exactly what this patch does: when we're
      considering whether to reuse an on-disk delta, if bitmaps
      tell us the other side has the object (and we're making a
      thin-pack), then we reuse it.
      
      Here are the results on p5311 using linux.git, which
      simulates a client fetching after `N` days since their last
      fetch:
      
       Test                         origin              HEAD
       --------------------------------------------------------------------------
       5311.3: server   (1 days)    0.27(0.27+0.04)     0.12(0.09+0.03) -55.6%
       5311.4: size     (1 days)               0.9M              237.0K -73.7%
       5311.5: client   (1 days)    0.04(0.05+0.00)     0.10(0.10+0.00) +150.0%
       5311.7: server   (2 days)    0.34(0.42+0.04)     0.13(0.10+0.03) -61.8%
       5311.8: size     (2 days)               1.5M              347.7K -76.5%
       5311.9: client   (2 days)    0.07(0.08+0.00)     0.16(0.15+0.01) +128.6%
       5311.11: server   (4 days)   0.56(0.77+0.08)     0.13(0.10+0.02) -76.8%
       5311.12: size     (4 days)              2.8M              566.6K -79.8%
       5311.13: client   (4 days)   0.13(0.15+0.00)     0.34(0.31+0.02) +161.5%
       5311.15: server   (8 days)   0.97(1.39+0.11)     0.30(0.25+0.05) -69.1%
       5311.16: size     (8 days)              4.3M                1.0M -76.0%
       5311.17: client   (8 days)   0.20(0.22+0.01)     0.53(0.52+0.01) +165.0%
       5311.19: server  (16 days)   1.52(2.51+0.12)     0.30(0.26+0.03) -80.3%
       5311.20: size    (16 days)              8.0M                2.0M -74.5%
       5311.21: client  (16 days)   0.40(0.47+0.03)     1.01(0.98+0.04) +152.5%
       5311.23: server  (32 days)   2.40(4.44+0.20)     0.31(0.26+0.04) -87.1%
       5311.24: size    (32 days)             14.1M                4.1M -70.9%
       5311.25: client  (32 days)   0.70(0.90+0.03)     1.81(1.75+0.06) +158.6%
       5311.27: server  (64 days)   11.76(26.57+0.29)   0.55(0.50+0.08) -95.3%
       5311.28: size    (64 days)             89.4M               47.4M -47.0%
       5311.29: client  (64 days)   5.71(9.31+0.27)     15.20(15.20+0.32) +166.2%
       5311.31: server (128 days)   16.15(36.87+0.40)   0.91(0.82+0.14) -94.4%
       5311.32: size   (128 days)            134.8M              100.4M -25.5%
       5311.33: client (128 days)   9.42(16.86+0.49)    25.34(25.80+0.46) +169.0%
      
      In all cases we save CPU time on the server (sometimes
      significant) and the resulting pack is smaller. We do spend
      more CPU time on the client side, because it has to
      reconstruct more deltas. But that's the right tradeoff to
      make, since clients tend to outnumber servers. It just means
      the thin pack mechanism is doing its job.
      
      From the user's perspective, the end-to-end time of the
      operation will generally be faster. E.g., in the 128-day
      case, we saved 15s on the server at a cost of 16s on the
      client. Since the resulting pack is 34MB smaller, this is a
      net win if the network speed is less than 270Mbit/s. And
      that's actually the worst case. The 64-day case saves just
      over 11s at a cost of just under 11s. So it's a slight win
      at any network speed, and the 40MB saved is pure bonus. That
      trend continues for the smaller fetches.
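
      Spelling out that arithmetic for the 128-day row: the server drops
      from 16.15s to 0.91s (about 15s saved) while the client grows from
      9.42s to 25.34s (about 16s extra), a net cost of roughly one second
      of CPU. The pack shrinks by 134.8M - 100.4M, about 34MB; pushing
      34MB through the network in that one second requires about
      34 * 8 = 272Mbit, hence the ~270Mbit/s break-even figure above.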
      
      The implementation itself is mostly straightforward, with
      the new logic going into check_object(). But there are two
      tricky bits.
      
      The first is that check_object() needs access to the
      relevant information (the thin flag and bitmap result). We
      can do this by pushing these into program-lifetime globals.
      
      The second is that the rest of the code assumes that any
      reused delta will point to another "struct object_entry" as
      its base. But of course the case we are interested in here
      is the one where we don't have such an entry!
      
      I looked at a number of options that didn't quite work:
      
       - we could use a flag to signal a reused delta, but it's
         not a single bit. We have to actually store the oid of
         the base, which is normally done by pointing to the
         existing object_entry. And we'd have to modify all the
         code which looks at deltas.
      
       - we could add the reused bases to the end of the existing
         object_entry array. While this does create some extra
         work as later stages consider the extra entries, it's
         actually not too bad (we're not sending them, so they
         don't cost much in the delta search, and at most we'd
         have 2*N of them).
      
         But there's a more subtle problem. Adding to the existing
         array means we might need to grow it with realloc, which
         could move the earlier entries around. While many of the
         references to other entries are done by integer index,
         some (including ones on the stack) use pointers, which
         would become invalidated.
      
         This isn't insurmountable, but it would require quite a
         bit of refactoring (and it's hard to know that you've got
         it all, since it may work _most_ of the time and then
         fail subtly based on memory allocation patterns).
      
       - we could allocate a new one-off entry for the base. In
         fact, this is what an earlier version of this patch did.
         However, since the refactoring brought in by ad635e82
         (Merge branch 'nd/pack-objects-pack-struct', 2018-05-23),
         the delta_idx code requires that both entries be in the
         main packing list.
      
      So taking all of those options into account, what I ended up
      with is a separate list of "external bases" that are not
      part of the main packing list. Each delta entry that points
      to an external base has a single-bit flag to do so; we have a
      little breathing room in the bitfield section of
      object_entry.
      
      This lets us limit the change primarily to the oe_delta()
      and oe_set_delta_ext() functions. And as a bonus, most of
      the rest of the code does not consider these dummy entries
      at all, saving both runtime CPU and code complexity.
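
      A rough sketch of the resulting accessors (the real declarations live
      in pack-objects.h; field names here follow that code, but treat the
      details as illustrative):

        static inline struct object_entry *oe_delta(const struct packing_data *pack,
                                                    const struct object_entry *e)
        {
                if (!e->delta_idx)
                        return NULL;
                if (e->ext_base)
                        return &pack->ext_bases[e->delta_idx - 1];
                return &pack->objects[e->delta_idx - 1];
        }

        static inline void oe_set_delta_ext(struct packing_data *pack,
                                            struct object_entry *e,
                                            const unsigned char *sha1)
        {
                struct object_entry *base;

                ALLOC_GROW(pack->ext_bases, pack->nr_ext + 1, pack->alloc_ext);
                base = &pack->ext_bases[pack->nr_ext++];
                memset(base, 0, sizeof(*base));
                hashcpy(base->idx.oid.hash, sha1);

                /* marked so later stages skip it when writing the pack */
                base->preferred_base = 1;
                base->filled = 1;

                e->ext_base = 1;
                e->delta_idx = base - pack->ext_bases + 1;
        }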
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      6a1e32d5
  8. 16 Aug, 2018 3 commits
    • Christian Couder
      pack-objects: move 'layer' into 'struct packing_data' · fe0ac2fb
      Christian Couder authored
      This reduces the size of 'struct object_entry' from 88 bytes
      to 80 and therefore makes packing objects more efficient.
      
      For example, on a Linux repo with 12M objects,
      `git pack-objects --all` needs an extra 96MB of memory even if the
      layer feature is not used.
      Helped-by: Jeff King <peff@peff.net>
      Helped-by: Duy Nguyen <pclouds@gmail.com>
      Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      fe0ac2fb
    • Christian Couder
      pack-objects: move tree_depth into 'struct packing_data' · 108f5303
      Christian Couder authored
      This reduces the size of 'struct object_entry' and therefore
      makes packing objects more efficient.
      
      This also renames cmp_tree_depth() to tree_depth_compare(),
      as it is more modern to have the names of compare functions
      end with "compare".
      Helped-by: Jeff King <peff@peff.net>
      Helped-by: Duy Nguyen <pclouds@gmail.com>
      Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      108f5303
    • Jeff King
      Add delta-islands.{c,h} · c8d521fa
      Jeff King authored
      Hosting providers that allow users to "fork" existing
      repos want those forks to share as much disk space as
      possible.
      
      Alternates are an existing solution to keep all the
      objects from all the forks in a single central repo,
      but this can have some drawbacks. Especially when
      packing the central repo, deltas will be created
      between objects from different forks.
      
      This can make cloning or fetching a fork much slower
      and much more CPU intensive as Git might have to
      compute new deltas for many objects to avoid sending
      objects from a different fork.
      
      Because the inefficiency primarily arises when an
      object is deltified against another object that does
      not exist in the same fork, we partition objects into
      sets that appear in the same fork, and define
      "delta islands". When finding delta base, we do not
      allow an object outside the same island to be
      considered as its base.
      
      So "delta islands" is a way to store objects from
      different forks in the same repo and packfile without
      having deltas between objects from different forks.
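
      A hypothetical sketch of that rule (not the actual delta-islands API;
      island_bits() stands in for whatever bookkeeping records which forks
      can reach an object): a candidate base is acceptable only if it is
      present in every island that needs the target.

        static int acceptable_base(struct object_entry *trg, struct object_entry *src)
        {
                unsigned trg_islands = island_bits(trg);  /* hypothetical helper */
                unsigned src_islands = island_bits(src);

                /* src must be reachable from every fork that needs trg */
                return (trg_islands & src_islands) == trg_islands;
        }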
      
      This patch implements the delta islands mechanism in
      "delta-islands.{c,h}", but does not yet make use of it.
      
      A few new fields are added in 'struct object_entry'
      in "pack-objects.h" though.
      
      The documentation will follow in a patch that actually
      uses delta islands in "builtin/pack-objects.c".
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      c8d521fa
  9. 15 Aug, 2018 1 commit
  10. 23 Jul, 2018 1 commit
    • Duy Nguyen
      pack-objects: fix performance issues on packing large deltas · 9ac3f0e5
      Duy Nguyen authored
      Let's start with some background about oe_delta_size() and
      oe_set_delta_size(). If you already know, skip the next paragraph.
      
      These two were added in 0aca34e8 (pack-objects: shrink delta_size
      field in struct object_entry - 2018-04-14) to help reduce 'struct
      object_entry' size. The delta size field in this struct was reduced to
      hold at most 1MB. So if any new delta is produced that is larger than
      1MB, it's dropped, because we can't really save such a large size
      anywhere. A fallback is provided in case existing packfiles already have
      large deltas; we can then retrieve the size from the pack.
      
      While this should help small machines repack large repos that have no
      large deltas (i.e. less memory pressure), dropping large deltas during
      the delta selection process could result in worse pack files. And if
      existing packfiles already have >1MB deltas and pack-objects is
      instructed to not reuse deltas, all of them will be dropped on the
      floor, and the resulting pack will definitely be bigger.
      
      There is also a CPU/IO regression if we have large on-disk deltas,
      because the fallback code needs to parse the pack every time the
      delta size is needed, and merely touching the mmap'd pack data is
      enough to cause extra page faults when memory is under pressure.
      
      Both of these issues were reported on the mailing list. Here's some
      numbers for comparison.
      
          Version  Pack (MB)  MaxRSS(kB)  Time (s)
          -------  ---------  ----------  --------
           2.17.0     5498     43513628    2494.85
           2.18.0    10531     40449596    4168.94
      
      This patch provides a better fallback that is
      
      - cheaper in terms of cpu and io because we won't have to read
        existing pack files as much
      
      - better in terms of pack size because the packing heuristics are back
        to their 2.17.0 behavior: we do not drop large deltas at all
      
      If we encounter any delta (on-disk or created during the try_delta
      phase) that is larger than the 1MB limit, we stop using the delta_size_
      field for it, because it can't hold such a size anyway. A new array of
      delta sizes is dynamically allocated and can hold all the deltas that
      2.17.0 can. This array only contains delta sizes that delta_size_ can't
      contain.
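
      The reading side then looks roughly like this (a sketch of the
      pack-objects.h accessor; the real setter also takes a lock, as
      described below):

        static inline unsigned long oe_delta_size(struct packing_data *pack,
                                                  const struct object_entry *e)
        {
                if (e->delta_size_valid)
                        return e->delta_size_;  /* fits in the small bitfield */
                /*
                 * Too big for the bitfield: the size was stashed in the
                 * side array, indexed by the entry's position.
                 */
                return pack->delta_size[e - pack->objects];
        }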
      
      With this, we do not have to drop deltas in try_delta() anymore. Of
      course the downside is we use slightly more memory, even compared to
      2.17.0. But since this is considered an uncommon case, a bit more
      memory consumption should not be a problem.
      
      The delta size limit is also raised from 1MB to 16MB to better cover
      the common case and avoid that extra memory consumption (99.999% of
      deltas in the reported repo are under 12MB; Jeff noted binary artifacts
      topped out at about 3MB in some other private repos). Other fields are
      shuffled around to keep this struct packed tight. We don't use more
      memory in the common case even with this limit update.
      
      A note about thread synchronization. Since this code can be run in
      parallel during delta searching phase, we need a mutex. The realloc
      part in packlist_alloc() is not protected because it only happens
      during the object counting phase, which is always single-threaded.
      
      Access to e->delta_size_ (and by extension
      pack->delta_size[e - pack->objects]) is unprotected as before; the
      thread scheduler in pack-objects must make sure "e" is never updated
      by two different threads.
      
      The area under the new lock is as small as possible, avoiding locking
      at all in the common case, since lock contention with a high thread
      count could be expensive (most blobs are small enough that delta compute
      time is short and we end up taking the lock very often). A previous
      attempt to always hold a lock in oe_delta_size() and
      oe_set_delta_size() increased execution time by 33% when repacking
      linux.git with 40 threads.
      Reported-by: Elijah Newren <newren@gmail.com>
      Helped-by: Elijah Newren <newren@gmail.com>
      Helped-by: Jeff King <peff@peff.net>
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      9ac3f0e5
  11. 16 Apr, 2018 13 commits
    • Duy Nguyen
      gc --auto: exclude base pack if not enough mem to "repack -ad" · 9806f5a7
      Duy Nguyen authored
      pack-objects can be a big memory hog, especially on large repos;
      everybody knows that. The suggestion to stick a .keep file on the
      giant base pack to avoid this problem has also been known for a long
      time.
      
      Recent patches add an option to do just this, but it has to be either
      configured or activated manually. This patch lets `git gc --auto`
      activate this mode automatically when it thinks `repack -ad` will use
      a lot of memory and start affecting the system due to swapping or
      flushing OS cache.
      
      gc --auto decides to do this based on an estimation of pack-objects
      memory usage, which is quite accurate at least for the heap part, and
      whether that fits in half of system memory (the assumption here is a
      desktop environment where many other applications are running).
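
      As a rough sketch of the decision (variable names are illustrative,
      not the exact builtin/gc.c code):

        if (!big_pack_threshold) {      /* user did not configure a threshold */
                unsigned long mem_want = estimate_repack_memory(found_packs);
                unsigned long mem_have = total_ram();

                if (mem_have && mem_want >= mem_have / 2)
                        exclude_base_pack = 1;  /* repack everything but the big one */
        }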
      
      This mechanism only kicks in if gc.bigBasePackThreshold is not configured.
      If it is, it is assumed that the user already knows what they want.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      9806f5a7
    • Duy Nguyen
      pack-objects: reorder members to shrink struct object_entry · 3b13a5f2
      Duy Nguyen authored
      Previous patches leave lots of holes and padding in this struct. This
      patch reorders the members and shrinks the struct down to 80 bytes
      (from 136 bytes on 64-bit systems, before any field shrinking is done)
      with 16 bits to spare (and a couple more in in_pack_header_size when
      we really run out of bits).
      
      This is the last in a series of memory reduction patches (see
      "pack-objects: a bit of document about struct object_entry" for the
      first one).
      
      Overall they've reduced repack memory size on linux-2.6.git from
      3.747G to 3.424G, or by around 320M, a decrease of 8.5%. The runtime
      of repack has stayed the same throughout this series. Ævar's testing
      on a big monorepo he has access to (bigger than linux-2.6.git) has
      shown a 7.9% reduction, so the overall expected improvement should be
      somewhere around 8%.
      
      See 87po42cwql.fsf@evledraar.gmail.com on-list
      (https://public-inbox.org/git/87po42cwql.fsf@evledraar.gmail.com/) for
      more detailed numbers and a test script used to produce the numbers
      cited above.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      3b13a5f2
    • Duy Nguyen
      pack-objects: shrink delta_size field in struct object_entry · 0aca34e8
      Duy Nguyen authored
      Allowing a delta size of 64 bits is crazy. Shrink this field down to
      20 bits with one overflow bit.
      
      If we find an existing delta larger than 1MB, we do not cache
      delta_size at all and will get the value from oe_size(), potentially
      from disk if it's larger than 4GB.
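
      In struct object_entry this looks roughly like (a sketch; the
      constant and field names follow pack-objects.h of that era):

        #define OE_DELTA_SIZE_BITS 20

        struct object_entry {
                /* ... */
                unsigned delta_size_:OE_DELTA_SIZE_BITS; /* cached size, up to 1MB */
                unsigned delta_size_valid:1;             /* 0 = too big, re-read from pack */
                /* ... */
        };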
      
      Note, since DELTA_SIZE() is used in try_delta() code, it must be
      thread-safe. Luckily oe_size() does guarantee this, so it is
      thread-safe.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      0aca34e8
    • Duy Nguyen
      pack-objects: shrink size field in struct object_entry · ac77d0c3
      Duy Nguyen authored
      It's very very rare that an uncompressed object is larger than 4GB
      (partly because Git does not handle those large files very well to
      begin with). Let's optimize it for the common case where object size
      is smaller than this limit.
      
      Shrink size field down to 31 bits and one overflow bit. If the size is
      too large, we read it back from disk. As noted in the previous patch,
      we need to return the delta size instead of canonical size when the
      to-be-reused object entry type is a delta instead of a canonical one.
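
      Accessors hide the overflow handling; roughly:

        static inline unsigned long oe_size(const struct object_entry *e)
        {
                if (e->size_valid)
                        return e->size_;        /* common case: fits in 31 bits */
                return oe_get_size_slow(e);     /* rare: re-read from the object store */
        }

        /* one of the compare helpers mentioned above, sketched: */
        static inline int oe_size_greater_than(const struct object_entry *e,
                                               unsigned long limit)
        {
                if (e->size_valid)
                        return e->size_ > limit;
                if (limit <= 0x7fffffff)
                        return 1;               /* overflowed 31 bits, so certainly bigger */
                return oe_get_size_slow(e) > limit;
        }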
      
      Add two compare helpers that can take advantage of the overflow
      bit (e.g. if the file is 4GB+, chances are it's already larger than
      core.bigFileThreshold and there's no point in comparing the actual
      value).
      
      Another note about oe_get_size_slow(). This function MUST be thread
      safe because SIZE() macro is used inside try_delta() which may run in
      parallel. Outside parallel code, no-contention locking should be dirt
      cheap (or insignificant compared to i/o access anyway). To exercise
      this code, it's best to run the test suite with something like
      
          make test GIT_TEST_OE_SIZE=4
      
      which forces this code on all objects larger than 3 bytes.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      ac77d0c3
    • Duy Nguyen
      pack-objects: clarify the use of object_entry::size · 27a7d067
      Duy Nguyen authored
      While this field most of the time contains the canonical object size,
      there is one case it does not: when we have found that the base object
      of the delta in question is also to be packed, we will very happily
      reuse the delta by copying it over instead of regenerating the new
      delta.
      
      "size" in this case will record the delta size, not canonical object
      size. Later on in write_reuse_object(), we reconstruct the delta
      header and "size" is used for this purpose. When this happens, the
      "type" field contains a delta type instead of a canonical type.
      Highlight this in the code since it could be tricky to see.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      27a7d067
    • Duy Nguyen
      pack-objects: shrink z_delta_size field in struct object_entry · 0cb3c142
      Duy Nguyen authored
      We only cache a delta when it is smaller than a certain limit. This limit
      defaults to 1000, but we save its compressed length in a 64-bit field.
      Shrink that field down to 20 bits, so we can only cache deltas up to 1MB.
      Larger deltas must be recomputed when the pack is written out.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      0cb3c142
    • Duy Nguyen
      pack-objects: refer to delta objects by index instead of pointer · 898eba5e
      Duy Nguyen authored
      These delta pointers always point to elements of the objects[] array
      in the packing_data struct. We can only hold a maximum of 4G of those
      objects because the array size in nr_objects is uint32_t. We could use
      uint32_t indexes to address these elements instead of pointers. On a
      64-bit architecture (8 bytes per pointer) this would save 4 bytes per
      pointer.
      
      Convert these delta pointers to indexes. Since we need to handle NULL
      pointers as well, the index is shifted by one [1].
      
      [1] This means we can only index 2^32-2 objects even though nr_objects
          could contain 2^32-1 objects. It should not be a problem in
          practice because when we grow objects[], nr_alloc would probably
          blow up long before nr_objects hits the wall.
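
      The accessor pair then looks roughly like this (sketched from
      pack-objects.h):

        static inline struct object_entry *oe_delta(const struct packing_data *pack,
                                                    const struct object_entry *e)
        {
                if (!e->delta_idx)
                        return NULL;                     /* 0 means "no delta" */
                return &pack->objects[e->delta_idx - 1]; /* index shifted by one */
        }

        static inline void oe_set_delta(struct packing_data *pack,
                                        struct object_entry *e,
                                        struct object_entry *delta)
        {
                if (delta)
                        e->delta_idx = (delta - pack->objects) + 1;
                else
                        e->delta_idx = 0;
        }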
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      898eba5e
    • Duy Nguyen
      pack-objects: move in_pack out of struct object_entry · 43fa44fa
      Duy Nguyen authored
      Instead of using 8 bytes (on a 64-bit arch) to store a pointer to a
      pack, use an index, since the number of packs should be relatively
      small.
      
      This limits the number of packs we can handle to 1k. Since we can't be
      sure people will never run into a situation where they have more than
      1k pack files, provide a fallback route for it.
      
      If we find out they have too many packs, the new in_pack_by_idx[]
      array (which has at most 1k elements) will not be used. Instead we
      allocate in_pack[] array that holds nr_objects elements. This is
      similar to how the optional in_pack_pos field is handled.
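
      The lookup side, roughly:

        static inline struct packed_git *oe_in_pack(const struct packing_data *pack,
                                                    const struct object_entry *e)
        {
                if (pack->in_pack_by_idx)
                        return pack->in_pack_by_idx[e->in_pack_idx]; /* <= 1k packs */
                else
                        return pack->in_pack[e - pack->objects];     /* fallback array */
        }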
      
      The new simple test is just to make sure the too-many-packs code path
      is at least executed. The true test is running
      
          make test GIT_TEST_FULL_IN_PACK_ARRAY=1
      
      to take advantage of other special case tests.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      43fa44fa
    • Duy Nguyen
      pack-objects: move in_pack_pos out of struct object_entry · 06af3bba
      Duy Nguyen authored
      This field is only needed for pack-bitmap, which is an optional
      feature. Move it to a separate array that is only allocated when
      pack-bitmap is used (like objects[], it is not freed, since we need it
      until the end of the process).
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      06af3bba
    • Duy Nguyen
      pack-objects: use bitfield for object_entry::depth · b5c0cbd8
      Duy Nguyen authored
      Because of struct packing, from now on we can only handle a max depth
      of 4095 (or even lower when new booleans are added to this struct). This
      should be OK, since a long delta chain will cause significant slowdown
      anyway.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      b5c0cbd8
    • Duy Nguyen
      pack-objects: turn type and in_pack_type to bitfields · fd9b1bae
      Duy Nguyen authored
      An extra field type_valid is added to carry the equivalent of OBJ_BAD
      in the original "type" field. in_pack_type always contains a valid
      type so we only need 3 bits for it.
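
      The accessors wrap the valid bit; roughly:

        static inline enum object_type oe_type(const struct object_entry *e)
        {
                return e->type_valid ? e->type_ : OBJ_BAD;
        }

        static inline void oe_set_type(struct object_entry *e,
                                       enum object_type type)
        {
                if (type >= OBJ_ANY)
                        BUG("OBJ_ANY cannot be set in pack-objects code");
                e->type_valid = type >= OBJ_NONE;
                e->type_ = (unsigned)type;
        }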
      
      A note about accepting OBJ_NONE as "valid" type. The function
      read_object_list_from_stdin() can pass this value [1] and it
      eventually calls create_object_entry(), where the current code skips setting the
      "type" field if the incoming type is zero. This does not have any bad
      side effects because "type" field should be memset()'d anyway.
      
      But since we also need to set type_valid now, skipping oe_set_type()
      leaves type_valid zero/false, which will make oe_type() return
      OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in
      prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger
      
          fatal: unable to get type of object ...
      
      Accepting OBJ_NONE [2] does sound wrong, but this is how it has
      been for a very long time, and I haven't had time to dig in further.
      
      [1] See 5c49c116 (pack-objects: better check_object() performances -
          2007-04-16)
      
      [2] 21666f1a (convert object type handling from a string to a number
          - 2007-02-26)
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      fd9b1bae
    • Duy Nguyen
      pack-objects: a bit of document about struct object_entry · 8d6ccce1
      Duy Nguyen authored
      The role of this comment block becomes more important after we shuffle
      fields around to shrink this struct. It will be much harder to see what
      field is related to what.
      Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      8d6ccce1
  12. 28 Jan, 2017 1 commit
    • Jeff King
      pack-objects: enforce --depth limit in reused deltas · 7dbabbbe
      Jeff King authored
      Since 898b14ce (pack-objects: rework check_delta_limit usage,
      2007-04-16), we check the delta depth limit only when
      figuring out whether we should make a new delta. We don't
      consider it at all when reusing deltas, which means that
      packing once with --depth=250, and then again with
      --depth=50, the second pack may still contain chains larger
      than 50.
      
      This is generally considered a feature, as the results of
      earlier high-depth repacks are carried forward, used for
      serving fetches, etc. However, since we started using
      cross-pack deltas in c9af708b (pack-objects: use mru list
      when iterating over packs, 2016-08-11), we are no longer
      bounded by the length of an existing delta chain in a single
      pack.
      
      Here's one particular pathological case: a sequence of N
      packs, each with 2 objects, the base of which is stored as a
      delta in a previous pack. If we chain all the deltas
      together, we have a cycle of length N. We break the cycle,
      but the tip delta is still at depth N-1.
      
      This is less unlikely than it might sound. See the included
      test for a reconstruction based on real-world actions.  I
      ran into such a case in the wild, where a client was rapidly
      sending packs, and we had accumulated 10,000 before doing a
      server-side repack.  The pack that "git repack" tried to
      generate had a very deep chain, which caused pack-objects to
      run out of stack space in the recursive write_one().
      
      This patch bounds the length of delta chains in the output
      pack based on --depth, regardless of whether they are caused
      by cross-pack deltas or existed in the input packs. This
      fixes the problem, but does have two possible downsides:
      
        1. High-depth aggressive repacks followed by "normal"
           repacks will throw away the high-depth chains.
      
           In the long run this is probably OK; investigation
           showed that high-depth repacks aren't actually
           beneficial, and we dropped the aggressive depth default
           to match the normal case in 07e7dbf0 (gc: default
           aggressive depth to 50, 2016-08-11).
      
        2. If you really do want to store high-depth deltas on
           disk, they may be discarded and new delta computed when
           serving a fetch, unless you set pack.depth to match
           your high-depth size.
      
      The implementation uses the existing search for delta
      cycles.  That lets us compute the depth of any node based on
      the depth of its base, because we know the base is DFS_DONE
      by the time we look at it (modulo any cycles in the graph,
      but we know there cannot be any because we break them as we
      see them).
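
      Concretely, while walking a chain the depth rule is roughly
      (illustrative, not the verbatim patch):

        if (!cur->delta)
                cur->depth = 0;                     /* a full (base) object */
        else if (cur->delta->depth + 1 > max_depth) {
                drop_reused_delta(cur);             /* too deep: find a new delta
                                                       later, or store it in full */
                cur->depth = 0;
        } else
                cur->depth = cur->delta->depth + 1; /* base is DFS_DONE, depth known */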
      
      There is some subtlety worth mentioning, though. We record
      the depth of each object as we compute it. It might seem
      like we could save the per-object storage space by just
      keeping track of the depth of our traversal (i.e., have
      break_delta_chains() report how deep it went). But we may
      visit an object through multiple delta paths, and on
      subsequent paths we want to know its depth immediately,
      without having to walk back down to its final base (doing so
      would make our graph walk quadratic rather than linear).
      
      Likewise, one could try to record the depth not from the
      base, but from our starting point (i.e., start
      recursion_depth at 0, and pass "recursion_depth + 1" to each
      invocation of break_delta_chains()). And then when
      recursion_depth gets too big, we know that we must cut the
      delta chain.  But that technique is wrong if we do not visit
      the nodes in topological order. In a chain A->B->C,
      if we visit "C", then "B", then "A", we will never recurse
      deeper than 1 link (because we see at each node that we have
      already visited it).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      7dbabbbe
  13. 11 Aug, 2016 1 commit
    • Jeff King
      pack-objects: break delta cycles before delta-search phase · 4cf2143e
      Jeff King authored
      We do not allow cycles in the delta graph of a pack (i.e., A
      is a delta of B which is a delta of A) for the obvious
      reason that you cannot actually access any of the objects in
      such a case.
      
      There's a last-ditch attempt to notice cycles during the
      write phase, during which we issue a warning to the user and
      write one of the objects out in full. However, this is
      "last-ditch" for two reasons:
      
        1. By this time, it's too late to find another delta for
           the object, so the resulting pack is larger than it
           otherwise could be.
      
        2. The warning is there because this is something that
           _shouldn't_ ever happen. If it does, then either:
      
             a. a pack we are reusing deltas from had its own
                cycle
      
             b. we are reusing deltas from multiple packs, and
                we found a cycle among them (i.e., A is a delta of
                B in one pack, but B is a delta of A in another,
                and we choose to use both deltas).
      
             c. there is a bug in the delta-search code
      
           So this code serves as a final check that none of these
           things has happened, warns the user, and prevents us
           from writing a bogus pack.
      
      Right now, (2b) should never happen because of the static
      ordering of packs in want_object_in_pack(). If two objects
      have a delta relationship, then they must be in the same
      pack, and therefore we will find them from that same pack.
      
      However, a future patch would like to change that static
      ordering, which will make (2b) a common occurrence. In
      preparation, we should be able to handle those kinds of
      cycles better. This patch does so by introducing a
      cycle-breaking step during the get_object_details() phase,
      when we are deciding which deltas can be reused. That gives
      us the chance to feed the objects into the delta search as
      if the cycle did not exist.
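
      In outline, a simplified recursive sketch of that step:

        static void break_delta_chains(struct object_entry *entry)
        {
                if (entry->dfs_state == DFS_DONE)
                        return;                 /* settled on an earlier walk */

                if (entry->dfs_state == DFS_ACTIVE) {
                        /* we walked back onto an in-progress entry: a cycle */
                        drop_reused_delta(entry);
                        entry->dfs_state = DFS_DONE;
                        return;
                }

                entry->dfs_state = DFS_ACTIVE;
                if (entry->delta)
                        break_delta_chains(entry->delta);
                entry->dfs_state = DFS_DONE;
        }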
      
      We'll leave the detection and warning in the write_object()
      phase in place, as it still serves as a check for case (2c).
      
      This does mean we will stop warning for (2a). That case is
      caused by bogus input packs, and we ideally would warn the
      user about it.  However, since those cycles show up after
      picking reusable deltas, they look the same as (2b) to us;
      our new code will break the cycles early and the last-ditch
      check will never see them.
      
      We could do analysis on any cycles that we find to
      distinguish the two cases (i.e., it is a bogus pack if and
      only if every delta in the cycle is in the same pack), but
      we don't need to. If there is a cycle inside a pack, we'll
      run into problems not only reusing the delta, but accessing
      the object data at all. So when we try to dig up the actual
      size of the object, we'll hit that same cycle and kick in
      our usual complain-and-try-another-source code.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      4cf2143e
  14. 30 Dec, 2013 1 commit
    • Vicent Marti
      pack-objects: implement bitmap writing · 7cc8f971
      Vicent Marti authored
      This commit further extends the functionality of `pack-objects` by allowing
      it to write out a `.bitmap` index next to any written packs, together
      with the `.idx` index that currently gets written.
      
      If bitmap writing is enabled for a given repository (either by calling
      `pack-objects` with the `--write-bitmap-index` flag or by having
      `pack.writebitmaps` set to `true` in the config) and pack-objects is
      writing a packfile that would normally be indexed (i.e. not piping to
      stdout), we will attempt to write the corresponding bitmap index for the
      packfile.
      
      Bitmap index writing happens after the packfile and its index have been
      successfully written to disk (`finish_tmp_packfile`). The process is
      performed in several steps:
      
          1. `bitmap_writer_set_checksum`: this call stores the partial
             checksum for the packfile being written; the checksum will be
             written in the resulting bitmap index to verify its integrity
      
          2. `bitmap_writer_build_type_index`: this call uses the array of
             `struct object_entry` that has just been sorted when writing out
             the actual packfile index to disk to generate 4 type-index bitmaps
             (one for each object type).
      
             These bitmaps have their nth bit set if the given object is of
             the bitmap's type. E.g. the nth bit of the Commits bitmap will be
             1 if the nth object in the packfile index is a commit.
      
             This is a very cheap operation because the bitmap writing code has
             access to the metadata stored in the `struct object_entry` array,
             and hence the real type for each object in the packfile.
      
           3. `bitmap_writer_reuse_bitmaps`: if there is an existing bitmap
              index for one of the packfiles we're trying to repack, this call
             will efficiently rebuild the existing bitmaps so they can be
             reused on the new index. All the existing bitmaps will be stored
             in a `reuse` hash table, and the commit selection phase will
             prioritize these when selecting, as they can be written directly
             to the new index without having to perform a revision walk to
             fill the bitmap. This can greatly speed up the repack of a
             repository that already has bitmaps.
      
          4. `bitmap_writer_select_commits`: if bitmap writing is enabled for
             a given `pack-objects` run, the sequence of commits generated
             during the Counting Objects phase will be stored in an array.
      
             We then use that array to build up the list of selected commits.
             Writing a bitmap in the index for each object in the repository
             would be cost-prohibitive, so we use a simple heuristic to pick
             the commits that will be indexed with bitmaps.
      
             The current heuristics are a simplified version of JGit's
             original implementation. We select a higher density of commits
             depending on their age: the 100 most recent commits are always
             selected, after that we pick 1 commit of each 100, and the gap
             increases as the commits grow older. On top of that, we make sure
             that every single branch that has not been merged (all the tips
              that would be required from a clone) gets its own bitmap, and
             when selecting commits between a gap, we tend to prioritize the
             commit with the most parents.
      
             Do note that there is no right/wrong way to perform commit
             selection; different selection algorithms will result in
             different commits being selected, but there's no such thing as
             "missing a commit". The bitmap walker algorithm implemented in
             `prepare_bitmap_walk` is able to adapt to missing bitmaps by
             performing manual walks that complete the bitmap: the ideal
             selection algorithm, however, would select the commits that are
             more likely to be used as roots for a walk in the future (e.g.
             the tips of each branch, and so on) to ensure a bitmap for them
             is always available.
      
          5. `bitmap_writer_build`: this is the computationally expensive part
             of bitmap generation. Based on the list of commits that were
             selected in the previous step, we perform several incremental
             walks to generate the bitmap for each commit.
      
             The walks begin from the oldest commit, and are built up
             incrementally for each branch. E.g. consider this dag where A, B,
             C, D, E, F are the selected commits, and a, b, c, e are a chunk
             of simplified history that will not receive bitmaps.
      
                  A---a---B--b--C--c--D
                           \
                            E--e--F
      
             We start by building the bitmap for A, using A as the root for a
             revision walk and marking all the objects that are reachable
             until the walk is over. Once this bitmap is stored, we reuse the
             bitmap walker to perform the walk for B, assuming that once we
             reach A again, the walk will be terminated because A has already
             been SEEN on the previous walk.
      
             This process is repeated for C, and D, but when we try to
             generate the bitmaps for E, we can reuse neither the current walk
             nor the bitmap we have generated so far.
      
              What we do now is reset the walk, clear the bitmap, and
              perform the walk from scratch using E as the origin. This new
              walk, however, does not need to be completed. Once we hit B,
              we can look up the bitmap we have already stored for that
              commit and OR it with the existing bitmap we've composed so
              far, allowing us to limit the walk early.
      
             After all the bitmaps have been generated, another iteration
             through the list of commits is performed to find the best XOR
             offsets for compression before writing them to disk. Because of
             the incremental nature of these bitmaps, XORing one of them with
              its predecessor results in a minimal "bitmap delta" most of the
              time. We can write this delta to the on-disk bitmap index, and
              then re-compose the original bitmaps by XORing them again when
              loaded (the XOR trick is spelled out in a short sketch after
              this list).
      
              This is a phase very similar to pack-objects' `find_delta` (using
              bitmaps instead of objects, of course), except the heuristics
              have been greatly simplified: we only check the 10 bitmaps before
              any given one to find the best-compressing one. This gives good
             results in practice, because there is locality in the ordering of
             the objects (and therefore bitmaps) in the packfile.
      
           6. `bitmap_writer_finish`: the last step in the process is
              serializing to disk all the bitmap data that has been generated
              in the two previous steps.

              The bitmap is written to a tmp file and then moved atomically to
              its final destination, using the same process as
              `pack-write.c:write_idx_file`.
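
      The XOR trick from step 5, spelled out on plain word arrays
      (illustration only; the real code works on the compressed EWAH
      representation):

        /* store bitmap i as a delta against an earlier bitmap j */
        for (w = 0; w < nr_words; w++)
                delta[w] = bits[i][w] ^ bits[j][w];   /* mostly zero words */

        /* loading recovers the original with the same operation */
        for (w = 0; w < nr_words; w++)
                bits[i][w] = delta[w] ^ bits[j][w];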
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      7cc8f971
  15. 24 Oct, 2013 2 commits
    • Vicent Marti
      pack-objects: factor out name_hash · 68fb36eb
      Vicent Marti authored
      As the pack-objects system grows beyond the single
      pack-objects.c file, more parts (like the soon-to-exist
      bitmap code) will need to compute hashes for matching
      deltas. Factor out name_hash to make it available to other
      files.
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      68fb36eb
    • Vicent Marti
      pack-objects: refactor the packing list · 2834bc27
      Vicent Marti authored
      The hash table that stores the packing list for a given `pack-objects`
      run was tightly coupled to the pack-objects code.
      
      In this commit, we refactor the hash table and the underlying storage
      array into a `packing_data` struct. The functionality for accessing and
      adding entries to the packing list is hence accessible from other parts
      of Git besides the `pack-objects` builtin.
      
      This refactoring is a requirement for further patches in this series
      that will require accessing the commit packing list from outside of
      `pack-objects`.
      
      The hash table implementation has been minimally altered: we now
      use table sizes which are always a power of two, to ensure a uniform
      index distribution in the array.
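
      In rough outline the struct looks like this (abbreviated sketch of
      pack-objects.h):

        struct packing_data {
                struct object_entry *objects;   /* the storage array */
                uint32_t nr_objects, nr_alloc;

                int32_t *index;                 /* hash table of positions into objects[] */
                uint32_t index_size;            /* always a power of two */
        };

        /* a power-of-two table size lets the probe use a mask instead of '%' */
        i = hash & (pdata->index_size - 1);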
      Signed-off-by: Vicent Marti <tanoku@gmail.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      2834bc27