    gc: run pre-detach operations under lock · c45af94d
    Jeff King authored and Junio C Hamano committed
    We normally try to avoid having two auto-gc operations run
    at the same time, because it wastes resources. This was done
    long ago in 64a99eb4 (gc: reject if another gc is running,
    unless --force is given, 2013-08-08).
    
    When we do a detached auto-gc, we run the ref-related
    commands _before_ detaching, to avoid confusing lock
    contention. This was done by 62aad184 (gc --auto: do not
    lock refs in the background, 2014-05-25).
    
    These two features do not interact well. The pre-detach
    operations are run before we check the gc.pid lock, meaning
    that on a busy repository we may run many of them
    concurrently. Ideally we'd take the lock before spawning any
    operations, and hold it for the duration of the program.
    
    This is tricky, though, with the way the pid-file interacts
    with the daemonize() process.  Other processes will check
    that the pid recorded in the pid-file still exists. But
    detaching causes us to fork and continue running under a
    new pid. So if we take the lock before detaching, the
    pid-file will have a bogus pid in it. We'd have to go back
    and update it with the new pid after detaching. We'd also
    have to play some tricks with the tempfile subsystem to
    tweak the "owner" field, so that the parent process does not
    clean it up on exit, but the child process does.
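
    The stale-pid problem can be reproduced outside of git with a
    small sketch. This is not git's code: the helper names, the
    one-number file format, and the temporary path are all
    assumptions made for illustration, and the parent waits for the
    child here (a real daemonize() would let the parent exit) only
    so that the result can be reported.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: record a pid in a pid-file (just the decimal pid). */
static int write_pid_file(const char *path, pid_t pid)
{
	FILE *fp;

	assert(path != NULL);
	fp = fopen(path, "w");
	if (!fp)
		return -1;
	fprintf(fp, "%ld\n", (long)pid);
	return fclose(fp);
}

/* Hypothetical helper: read the pid back, or -1 if missing or malformed. */
static pid_t read_pid_file(const char *path)
{
	long pid = -1;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	if (fscanf(fp, "%ld", &pid) != 1)
		pid = -1;
	fclose(fp);
	return (pid_t)pid;
}

/* The kind of liveness check another process could apply to that pid. */
static int pid_is_alive(pid_t pid)
{
	return pid > 0 && (kill(pid, 0) == 0 || errno == EPERM);
}

/*
 * Record our pid, then fork as a detaching program would. Returns 1 if
 * the process that keeps running (the child) finds a pid in the file
 * that is not its own, 0 if the pids match, -1 on error.
 */
static int detach_leaves_stale_pid(const char *path)
{
	int status;
	pid_t child;

	if (write_pid_file(path, getpid()) < 0)
		return -1;
	child = fork();
	if (child < 0)
		return -1;
	if (child == 0)
		/* Child: this is where post-detach work would run. */
		_exit(read_pid_file(path) != getpid() ? 1 : 0);
	if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

    Under these assumptions detach_leaves_stale_pid() reports a
    mismatch: the file still names the parent, so once the parent
    exits, a pid_is_alive() check by another process would wrongly
    conclude that nobody holds the lock.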
    
    Instead, we can do something a bit simpler: take the lock
    only for the duration of the pre-detach work, then detach,
    then take it again for the post-detach work. Technically,
    this means that the post-detach lock could lose to another
    process doing pre-detach work. But in the long run this
    works out.
    
    That second process would then follow up by doing
    post-detach work. Unless it was in turn blocked by a third
    process doing pre-detach work, and so on. This could in
    theory go on indefinitely, as the pre-detach work does not
    repack, and so need_to_gc() will continue to trigger.  But
    in each round we are racing between the pre- and post-detach
    locks. Eventually, one of the post-detach locks will win the
    race and complete the full gc. So in the worst case, we may
    racily repeat the pre-detach work, but we would never do so
    simultaneously (it would happen via a sequence of serialized
    race-wins).
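
    The lock, work, unlock, detach, lock again sequence can be
    sketched as follows. This is a simplified model rather than the
    actual patch: try_lock(), unlock(), the counters, and the in_gap
    hook are hypothetical names, an O_EXCL file stands in for the
    gc.pid lock, and the detach step itself is left as a comment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

enum gc_result { GC_DONE, GC_LOST_PRE, GC_LOST_POST };

/*
 * Hypothetical stand-in for the gc.pid lock: creating the file with
 * O_EXCL succeeds for only one holder at a time. Returns 0 on success.
 */
static int try_lock(const char *path)
{
	char buf[32];
	int fd, len, ok;

	assert(path != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
	if (fd < 0)
		return -1;
	/* Record the pid we have right now. */
	len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
	ok = write(fd, buf, (size_t)len) == (ssize_t)len;
	close(fd);
	if (!ok) {
		unlink(path);
		return -1;
	}
	return 0;
}

static void unlock(const char *path)
{
	unlink(path);
}

/* Counters standing in for the real work. */
static int pre_detach_runs, post_detach_runs;

/* Simulates a second auto-gc grabbing the lock for its pre-detach work. */
static void rival_takes_lock(const char *path)
{
	try_lock(path);
}

/*
 * One auto-gc attempt. in_gap (may be NULL) is a test hook that runs in
 * the window where no lock is held, i.e. where detaching would happen.
 */
static enum gc_result auto_gc_sketch(const char *path,
				     void (*in_gap)(const char *))
{
	if (try_lock(path) < 0)
		return GC_LOST_PRE;	/* another gc is running */
	pre_detach_runs++;		/* ref-related commands */
	unlock(path);

	/* A real implementation would detach here and get a new pid. */
	if (in_gap)
		in_gap(path);

	if (try_lock(path) < 0)
		return GC_LOST_POST;	/* a rival won the race */
	post_detach_runs++;		/* repack and friends */
	unlock(path);
	return GC_DONE;
}
```

    Calling auto_gc_sketch(path, rival_takes_lock) models the race
    described above: the pre-detach counter advances but the
    post-detach one does not, and the rival, now holding the lock,
    is the one expected to carry on. At no point do two holders
    run pre-detach work at the same time.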
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>