Skip to content
  • Derrick Stolee's avatar
    list-objects: consume sparse tree walk · 4f6d26b1
    Derrick Stolee authored and Junio C Hamano's avatar Junio C Hamano committed
    
    
    When creating a pack-file using 'git pack-objects --revs' we provide
    a list of interesting and uninteresting commits. For example, a push
    operation would make the local topic branch be interesting and the
    known remote refs as uninteresting. We want to discover the set of
    new objects to send to the server as a thin pack.
    
    We walk these commits until we discover a frontier of commits such
    that every commit walk starting at interesting commits ends in a root
    commit or unintersting commit. We then need to discover which
    non-commit objects are reachable from  uninteresting commits. This
    commit walk is not changing during this series.
    
    The mark_edges_uninteresting() method in list-objects.c iterates on
    the commit list and does the following:
    
    * If the commit is UNINTERSTING, then mark its root tree and every
      object it can reach as UNINTERESTING.
    
    * If the commit is interesting, then mark the root tree of every
      UNINTERSTING parent (and all objects that tree can reach) as
      UNINTERSTING.
    
    At the very end, we repeat the process on every commit directly
    given to the revision walk from stdin. This helps ensure we properly
    cover shallow commits that otherwise were not included in the
    frontier.
    
    The logic to recursively follow trees is in the
    mark_tree_uninteresting() method in revision.c. The algorithm avoids
    duplicate work by not recursing into trees that are already marked
    UNINTERSTING.
    
    Add a new 'sparse' option to the mark_edges_uninteresting() method
    that performs this logic in a slightly different way. As we iterate
    over the commits, we add all of the root trees to an oidset. Then,
    call mark_trees_uninteresting_sparse() on that oidset. Note that we
    include interesting trees in this process. The current implementation
    of mark_trees_unintersting_sparse() will walk the same trees as
    the old logic, but this will be replaced in a later change.
    
    Add a '--sparse' flag in 'git pack-objects' to call this new logic.
    Add a new test script t/t5322-pack-objects-sparse.sh that tests this
    option. The tests currently demonstrate that the resulting object
    list is the same as the old algorithm. This includes a case where
    both algorithms pack an object that is not needed by a remote due to
    limits on the explored set of trees. When the sparse algorithm is
    changed in a later commit, we will add a test that demonstrates a
    change of behavior in some cases.
    
    Signed-off-by: default avatarDerrick Stolee <dstolee@microsoft.com>
    Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    4f6d26b1