    upload-pack: provide a hook for running pack-objects · 20b20a22
    Jeff King authored and Junio C Hamano committed
    
    
    When upload-pack serves a client request, it turns to
    pack-objects to do the heavy lifting of creating a
    packfile. There's no easy way to intercept the call to
    pack-objects, but there are a few good reasons to want to do
    so:
    
      1. If you're debugging a client or server issue with
         fetching, you may want to store a copy of the generated
         packfile.
    
      2. If you're gathering data from real-world fetches for
         performance analysis or debugging, storing a copy of
         the arguments and stdin lets you replay the pack
         generation at your leisure.
    
      3. You may want to insert a caching layer around
         pack-objects; it is the most CPU- and memory-intensive
         part of serving a fetch, and its output is a pure
         function[1] of its input, making it an ideal place to
         consolidate identical requests.
    
    This patch adds a simple "hook" interface to intercept calls
    to pack-objects. The new test demonstrates how it can be
    used for debugging (using it for caching is a
    straightforward extension; the tricky part is writing the
    actual caching layer).
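
A debugging hook along these lines can be sketched in Python. This is a minimal illustration, not the test script from this patch. It assumes (per the git-config documentation for uploadpack.packObjectsHook) that the pack-objects command line is appended to the hook's arguments and that the hook's stdin and stdout stand in for those of pack-objects. The names run_and_record and PACK_HOOK_LOG_DIR are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical debugging hook: record a pack-objects request, then run it."""
import os
import subprocess
import sys


def run_and_record(argv, stdin_bytes, log_dir):
    """Save argv, stdin, and the generated pack under log_dir.

    Returns (stdout bytes, exit code) of the wrapped command.
    """
    os.makedirs(log_dir, exist_ok=True)
    with open(os.path.join(log_dir, "args"), "w") as f:
        f.write("\n".join(argv) + "\n")
    with open(os.path.join(log_dir, "stdin"), "wb") as f:
        f.write(stdin_bytes)
    # Run the command unchanged; stderr is inherited so progress
    # output still reaches the caller.
    proc = subprocess.run(argv, input=stdin_bytes, stdout=subprocess.PIPE)
    with open(os.path.join(log_dir, "stdout"), "wb") as f:
        f.write(proc.stdout)
    return proc.stdout, proc.returncode


def main():
    # Assumption: everything after the script name is the pack-objects command.
    argv = sys.argv[1:]
    # Hypothetical environment variable and default location.
    log_dir = os.environ.get("PACK_HOOK_LOG_DIR", "/tmp/pack-hook-log")
    # Buffers the whole pack in memory for simplicity; a production
    # hook would stream it instead.
    out, code = run_and_record(argv, sys.stdin.buffer.read(), log_dir)
    sys.stdout.buffer.write(out)
    sys.stdout.buffer.flush()
    return code


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main())
```

Because the variable is ignored in per-repo config, such a script would have to be named from a trusted location, for example by setting uploadpack.packObjectsHook in "/etc/gitconfig" or passing it with "git -c". The saved args and stdin files can later be fed back to pack-objects to replay the request.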
    
    This hook is unlike the normal hook scripts found in the
    "hooks/" directory of a repository. Because we promise that
    upload-pack is safe to run in an untrusted repository, we
    cannot execute arbitrary code or commands found in the
    repository (neither in hooks/, nor in the config). So
    instead, this hook is triggered from a config variable that
    is explicitly ignored in the per-repo config.
    
    The config variable holds the actual shell command to run as
    the hook.  Another approach would be to simply treat it as a
    boolean: "should I respect the upload-pack hooks in this
    repo?", and then run the script from "hooks/" as we usually
    do. However, that isn't as flexible; there's no way to run a
    hook approved by the site administrator (e.g., in
    "/etc/gitconfig") on a repository whose contents are not
    trusted. The approach taken by this patch is more
    fine-grained, if a little less conventional for git hooks
    (it does behave similarly to other configured commands like
    diff.external, etc.).
    
    [1] Pack-objects isn't _actually_ a pure function. Its
        output depends on the exact packing of the object
        database, and if multi-threading is used for delta
        compression, can even differ racily. But for the
        purposes of caching, that's OK; of the many possible
        outputs for a given input, it is sufficient only that we
        output one of them.
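
The caching idea can be illustrated with a deliberately naive Python sketch: key the cache on the command arguments plus stdin, and replay any previously stored output. This is not part of the patch. It assumes a hook that receives the pack-objects command line as its arguments, and it ignores the hard parts of a real caching layer (concurrent identical requests, streaming, eviction, and keeping a separate cache per repository). The names cache_key, cached_run, and PACK_HOOK_CACHE_DIR are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical caching hook: reuse pack-objects output for identical input."""
import hashlib
import os
import subprocess
import sys
import tempfile


def cache_key(argv, stdin_bytes):
    """Hash the arguments and stdin, which together form the request."""
    h = hashlib.sha256()
    for arg in argv:
        # NUL-terminate each argument so ["a", "b"] and ["ab"] differ.
        h.update(os.fsencode(arg) + b"\0")
    h.update(b"\0")
    h.update(stdin_bytes)
    return h.hexdigest()


def cached_run(argv, stdin_bytes, cache_dir):
    """Return (output, exit code), running argv only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, cache_key(argv, stdin_bytes))
    try:
        with open(path, "rb") as f:
            # Any earlier output for this input is acceptable.
            return f.read(), 0
    except FileNotFoundError:
        pass
    proc = subprocess.run(argv, input=stdin_bytes, stdout=subprocess.PIPE)
    if proc.returncode == 0:
        # Write to a temporary file and rename it into place, so a
        # reader never sees a partially written entry.
        fd, tmp = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(proc.stdout)
        os.replace(tmp, path)
    return proc.stdout, proc.returncode


def main():
    # Hypothetical environment variable and default location.
    cache_dir = os.environ.get("PACK_HOOK_CACHE_DIR", "/tmp/pack-hook-cache")
    out, code = cached_run(sys.argv[1:], sys.stdin.buffer.read(), cache_dir)
    sys.stdout.buffer.write(out)
    sys.stdout.buffer.flush()
    return code


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main())
```

Failed runs are not stored, so an error is never replayed to later clients.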
    
    Signed-off-by: Jeff King <peff@peff.net>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>