Skip to content
  • Jeff King's avatar
    pack-objects: break delta cycles before delta-search phase · 4cf2143e
    Jeff King authored and Junio C Hamano's avatar Junio C Hamano committed
    
    
    We do not allow cycles in the delta graph of a pack (i.e., A
    is a delta of B which is a delta of A) for the obvious
    reason that you cannot actually access any of the objects in
    such a case.
    
    There's a last-ditch attempt to notice cycles during the
    write phase, during which we issue a warning to the user and
    write one of the objects out in full. However, this is
    "last-ditch" for two reasons:
    
      1. By this time, it's too late to find another delta for
         the object, so the resulting pack is larger than it
         otherwise could be.
    
      2. The warning is there because this is something that
         _shouldn't_ ever happen. If it does, then either:
    
           a. a pack we are reusing deltas from had its own
              cycle
    
           b. we are reusing deltas from multiple packs, and
              we found a cycle among them (i.e., A is a delta of
              B in one pack, but B is a delta of A in another,
              and we choose to use both deltas).
    
           c. there is a bug in the delta-search code
    
         So this code serves as a final check that none of these
         things has happened, warns the user, and prevents us
         from writing a bogus pack.
    
    Right now, (2b) should never happen because of the static
    ordering of packs in want_object_in_pack(). If two objects
    have a delta relationship, then they must be in the same
    pack, and therefore we will find them from that same pack.
    
    However, a future patch would like to change that static
    ordering, which will make (2b) a common occurrence. In
    preparation, we should be able to handle those kinds of
    cycles better. This patch does by introducing a
    cycle-breaking step during the get_object_details() phase,
    when we are deciding which deltas can be reused. That gives
    us the chance to feed the objects into the delta search as
    if the cycle did not exist.
    
    We'll leave the detection and warning in the write_object()
    phase in place, as it still serves as a check for case (2c).
    
    This does mean we will stop warning for (2a). That case is
    caused by bogus input packs, and we ideally would warn the
    user about it.  However, since those cycles show up after
    picking reusable deltas, they look the same as (2b) to us;
    our new code will break the cycles early and the last-ditch
    check will never see them.
    
    We could do analysis on any cycles that we find to
    distinguish the two cases (i.e., it is a bogus pack if and
    only if every delta in the cycle is in the same pack), but
    we don't need to. If there is a cycle inside a pack, we'll
    run into problems not only reusing the delta, but accessing
    the object data at all. So when we try to dig up the actual
    size of the object, we'll hit that same cycle and kick in
    our usual complain-and-try-another-source code.
    
    Signed-off-by: default avatarJeff King <peff@peff.net>
    Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    4cf2143e