
Make git-upload-pack use gitaly-hooks for pack-objects

Jacob Vosmaer requested to merge jv-pack-objects-hook into master

Part of gitlab-com/gl-infra/scalability#807 (closed) and gitlab-com/gl-infra&372 (closed).

Feature flag: upload_pack_gitaly_hooks

Log entries in gitaly_hooks.log look like:

time="2021-01-22T14:26:57+01:00" level=info msg="local git command" args="[pack-objects --revs --thin --stdout --progress --delta-base-offset]"

The PostUploadPack and SSHUploadPack RPCs run git-upload-pack on the Gitaly server. Normally, git-upload-pack then spawns a git-pack-objects process, which generates the packfile data that goes into the response:

sequenceDiagram
    participant A as Gitaly (PostUploadPack)
    participant B as git-upload-pack
    participant C as git-pack-objects
    A->>B:fetch request
    B->>C:pack request
    C->>B:packfile data
    B->>A:fetch response

Luckily for us, Git has a configuration option, uploadpack.packobjectshook, that lets us replace git-pack-objects with a custom executable. This is a key part of the cache we are building. In this MR, we do the necessary groundwork to have git-upload-pack spawn gitaly-hooks instead of git-pack-objects. Inside gitaly-hooks we then run git-pack-objects as before; caching will follow in a later MR.

sequenceDiagram
    participant A as Gitaly (PostUploadPack)
    participant B as git-upload-pack
    participant C as gitaly-hooks
    participant D as git-pack-objects
    A->>B:fetch request
    B->>C:pack request
    C->>D:pack request
    D->>C:packfile data
    C->>B:packfile data
    B->>A:fetch response
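
The flow in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: the function names `uploadPackArgs` and `runPackObjects` and the paths used are hypothetical. It relies on Git's documented behaviour that the original command, including the leading `git pack-objects`, is appended to the hook command, and that the hook's stdin and stdout are treated as if they belonged to pack-objects itself.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
)

// uploadPackArgs builds arguments for a `git` invocation that runs
// upload-pack with the pack-objects hook set on the command line.
// hookPath and repoPath are hypothetical inputs for this sketch. Git runs
// the hook value as a shell command, so a path containing spaces would
// need quoting.
func uploadPackArgs(hookPath, repoPath string) []string {
	return []string{
		"-c", "uploadpack.packobjectshook=" + hookPath,
		"upload-pack", repoPath,
	}
}

// runPackObjects is the pass-through step inside the hook. Git appends the
// command it would have run, so args should look like
// ["git", "pack-objects", "--revs", ...]. The hook re-runs that command
// and wires the streams straight through.
func runPackObjects(args []string, stdin io.Reader, stdout, stderr io.Writer) error {
	if len(args) < 2 || args[0] != "git" || args[1] != "pack-objects" {
		return fmt.Errorf("unexpected hook arguments: %q", args)
	}
	cmd := exec.Command(args[0], args[1:]...)
	cmd.Stdin = stdin   // the object list that upload-pack feeds to pack-objects
	cmd.Stdout = stdout // packfile data flowing back to upload-pack
	cmd.Stderr = stderr
	return cmd.Run()
}

func main() {
	// When used as the hook, everything after the program name is the
	// original pack-objects command line.
	if err := runPackObjects(os.Args[1:], os.Stdin, os.Stdout, os.Stderr); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because the hook sits between git-upload-pack and git-pack-objects and sees both the request on stdin and the packfile on stdout, it is a natural place to add a cache later without changing Git itself.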

Git has a bug that is triggered when a pack-objects hook and partial clone are used at the same time: git#82 (closed). This MR contains workaround code that handles this problem. We have submitted a fix for this bug to the Git mailing list, but it will take a while to land, and luckily we can work around it "outside" Git.

Edited by Jacob Vosmaer
