    clone: open a shortcut for connectivity check · c6807a40
    Duy Nguyen authored
    In order to make sure the cloned repository is good, we run "rev-list
    --objects --not --all $new_refs" on the repository. This is expensive
    on large repositories. This patch attempts to mitigate the impact in
    this special case.
    
    In the "good" clone case, we only have one pack. If all of the
    following are met, we can be sure that all objects reachable from the
    new refs exist, which is the intention of running "rev-list ...":
    
     - all refs point to an object in the pack
     - there are no dangling pointers in any object in the pack
     - no objects in the pack point to objects outside the pack
    
    The second and third checks can be done by index-pack as a slight
    variation of its --strict check. This introduces a new condition for
    the shortcut: pack transfer must be used, and the number of objects
    must be large enough for index-pack to be called. The first is checked
    in check_everything_connected after we get an "ok" from index-pack.
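
    The three conditions can be illustrated on a toy object graph. This is
    only a sketch, not git's implementation: the Pack type and all function
    names are hypothetical, and in this simplified model a dangling pointer
    and a pointer to an object outside the pack look the same, so the
    second and third conditions collapse into one test.

```python
from typing import Dict, List

# Hypothetical toy model: a pack maps each object id to the ids it points to.
Pack = Dict[str, List[str]]


def pack_is_closed(pack: Pack) -> bool:
    """Every pointer held by an object in the pack resolves inside the pack."""
    return all(target in pack for targets in pack.values() for target in targets)


def refs_in_pack(refs: Dict[str, str], pack: Pack) -> bool:
    """Every ref points to an object that is in the pack."""
    return all(oid in pack for oid in refs.values())


def shortcut_applies(refs: Dict[str, str], pack: Pack) -> bool:
    """True when all conditions hold, so the full reachability walk can be skipped."""
    return pack_is_closed(pack) and refs_in_pack(refs, pack)


# Example data (made up for illustration).
good_pack = {"commit1": ["tree1"], "tree1": ["blob1"], "blob1": []}
broken_pack = {"commit1": ["tree1"], "tree1": ["blob_missing"]}
refs = {"refs/heads/master": "commit1"}
```

    With this data, shortcut_applies(refs, good_pack) holds, while
    broken_pack fails because tree1 points to an object the pack lacks.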
    
    "index-pack + new checks" is still faster than the current "index-pack
    + rev-list", which is the whole point of this patch. If any of the
    conditions fails, we fall back to the good old but expensive "rev-list
    ...". In that case it's even more expensive because we have to pay for
    the new checks in index-pack. But that should only happen when the
    other side is either buggy or malicious.
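
    The fallback described above can be sketched as follows. This is an
    assumption-laden toy, not the code in this patch: Store,
    walk_all_reachable and check_connected are hypothetical names, and
    pack_checks_ok stands in for the "ok" reported by index-pack.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical toy model: object id -> ids of the objects it points to.
Store = Dict[str, List[str]]


def walk_all_reachable(tips: Iterable[str], store: Store) -> bool:
    """Expensive path: visit every object reachable from the tips and
    fail as soon as one of them is missing from the store."""
    seen = set()
    stack = list(tips)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        if oid not in store:
            return False
        seen.add(oid)
        stack.extend(store[oid])
    return True


def check_connected(refs: Dict[str, str], store: Store,
                    pack_checks_ok: bool) -> Tuple[bool, str]:
    """Take the shortcut when the pack checks passed and every ref is in
    the store; otherwise fall back to the full walk."""
    if pack_checks_ok and all(oid in store for oid in refs.values()):
        return True, "shortcut"
    return walk_all_reachable(refs.values(), store), "full walk"
```

    The full walk runs only when the cheap checks do not vouch for the
    pack, which mirrors the cost trade-off described above.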
    
    Cloning linux-2.6 over file://
    
            before         after
    real    3m25.693s      2m53.050s
    user    5m2.037s       4m42.396s
    sys     0m13.750s      0m16.574s
    
    A more realistic test with ssh:// over wireless
    
            before         after
    real    11m26.629s     10m4.213s
    user    5m43.196s      5m19.444s
    sys     0m35.812s      0m37.630s
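
    For reference, the wall-clock savings in the two tables work out to
    roughly 33 seconds (about 16%) over file:// and roughly 82 seconds
    (about 12%) over ssh://. A small helper to redo that arithmetic from
    the "real" rows, with hypothetical function names:

```python
import re
from typing import Tuple


def to_seconds(stamp: str) -> float:
    """Convert a time(1)-style stamp such as '3m25.693s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def saving(before: str, after: str) -> Tuple[float, float]:
    """Return (seconds saved, percent saved relative to 'before')."""
    b, a = to_seconds(before), to_seconds(after)
    return b - a, (b - a) / b * 100


file_saving = saving("3m25.693s", "2m53.050s")   # file:// clone, "real" row
ssh_saving = saving("11m26.629s", "10m4.213s")   # ssh:// clone, "real" row
```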
    
    This shortcut is not applied to shallow clones, partly because shallow
    clones should have no more objects than a usual fetch and the cost of
    rev-list is acceptable, partly to avoid dealing with corner cases when
    grafting is involved.
    
    This shortcut does not apply to the unpack-objects code path either,
    because the number of objects must be small in order to trigger that
    code path.

    Signed-off-by: Nguyễn Thái Ngọc Duy <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>
connected.h 818 Bytes