
2PC via pre-receive hook

As a result of the experiments in #2466 (closed) and #2529 (closed), we have concluded that the most promising way to implement strong consistency for reference updates is by going via Git hooks: given a reference update, a hook will execute on each Gitaly node that reports back to Praefect. Praefect will collect these reports from all Gitaly nodes that take part in the current update and, if all nodes post the same update, send them a message to go ahead.

While the mid-term goal is to hook directly into the reference transaction handling code in order to handle all kinds of reference updates and not only those invoked via git-receive-pack(1), this requires a new set of hooks on Git's side. We thus decided to implement a first POC of this mechanism by using the Git pre-receive hook, which executes after all reference updates have been announced by the Git client. This should give us a better picture of how the mechanism will work in the end.

The following diagram depicts the 2PC via a pre-receive hook:

sequenceDiagram
  Praefect->>+Gitaly: ReceivePack
  Gitaly->>+Git: git receive-pack
  Git->>+Hook: update HEAD master
  Hook->>+Gitaly: TX: update HEAD master
  Gitaly->>+Praefect: TX: update HEAD master
  Praefect->>+Praefect: TX: collect votes
  Praefect->>+Gitaly: TX: commit
  Gitaly->>+Hook: TX: commit
  Hook->>+Git: exit 0
  Git->>+Gitaly: exit 0
  Gitaly->>+Praefect: success

The 2PC voting protocol will start as soon as the first "TX" message is received on the Praefect node. Each of the pre-receive hooks will block until it receives a message from Praefect telling it either to go on with the update or to abort. If the vote was successful, the hook will exit with 0 to indicate success; otherwise it will return an error code and thus abort the reference update.
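The voting step could look roughly like the following. This is a minimal, in-process sketch for illustration only, not Gitaly or Praefect code: the `Transaction` class, the `run_hooks` helper, the SHA-1 digest of the update, and the timeout are all assumptions made here, and threads stand in for the hooks running on separate Gitaly nodes.

```python
import hashlib
import threading


class Transaction:
    """Hypothetical Praefect-side vote collector for one reference update."""

    def __init__(self, expected_voters, timeout=5.0):
        self._expected = expected_voters  # number of Gitaly nodes taking part
        self._timeout = timeout           # assumption: missing votes abort the update
        self._votes = {}                  # node name -> digest of the announced update
        self._cond = threading.Condition()

    def vote(self, node, update):
        """Record a node's vote and block until the outcome is known.

        Returns True (commit) only if every expected node voted and all
        votes are identical; returns False (abort) otherwise.
        """
        digest = hashlib.sha1(update.encode()).hexdigest()
        with self._cond:
            self._votes[node] = digest
            self._cond.notify_all()
            all_in = self._cond.wait_for(
                lambda: len(self._votes) >= self._expected,
                timeout=self._timeout,
            )
            if not all_in:
                return False
            return len(set(self._votes.values())) == 1


def run_hooks(updates, timeout=5.0):
    """Simulate one blocking pre-receive hook per node; return their exit codes."""
    tx = Transaction(expected_voters=len(updates), timeout=timeout)
    exit_codes = {}

    def hook(node, update):
        # Exit code 0 lets Git apply the update, non-zero aborts it.
        exit_codes[node] = 0 if tx.vote(node, update) else 1

    threads = [threading.Thread(target=hook, args=item) for item in updates.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return exit_codes
```

Calling `run_hooks` with the same update string for every node yields exit code 0 everywhere, while a single diverging node causes every hook to exit non-zero, which matches the all-or-nothing behaviour described above.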

The main goal of this issue is to establish a communication channel between the hook and Praefect to allow for transaction handling. The communication channel should be as transparent to Gitaly nodes as possible, so that Gitaly does not need to know whether, or how many, transactions a given Git command is going to start. This ensures we can swap out the hooks in the future and start multiple transactions for a single Git command.
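One way to picture such a channel from the hook's side is sketched below. It is only an illustration under stated assumptions: the `PRAEFECT_TX` environment variable, the `send_vote` callback, and the choice to skip voting when no transaction information is present are all hypothetical and not part of the current design. The only fixed piece is the pre-receive input format, where Git passes one `<old-value> <new-value> <ref-name>` line per updated reference on standard input.

```python
import hashlib


def parse_updates(stdin):
    """Parse pre-receive input: one '<old> <new> <ref>' line per reference."""
    updates = []
    for line in stdin:
        old, new, ref = line.rstrip("\n").split(" ", 2)
        updates.append((old, new, ref))
    return updates


def vote_digest(updates):
    """Hash all reference updates, sorted by ref name, into a single vote."""
    h = hashlib.sha1()
    for old, new, ref in sorted(updates, key=lambda u: u[2]):
        h.update(f"{old} {new} {ref}\n".encode())
    return h.hexdigest()


def pre_receive(stdin, env, send_vote):
    """Return the hook's exit code.

    env:       environment mapping; PRAEFECT_TX is a hypothetical opaque value
               that Gitaly would pass through without interpreting it.
    send_vote: hypothetical callable (tx_info, digest) -> bool that blocks
               until Praefect answers with commit (True) or abort (False).
    """
    tx_info = env.get("PRAEFECT_TX")
    if tx_info is None:
        # Assumption: without transaction info the hook does not vote.
        return 0
    digest = vote_digest(parse_updates(stdin))
    return 0 if send_vote(tx_info, digest) else 1
```

Because the transaction details travel as an opaque value that only the hook and Praefect interpret, Gitaly would not need to change if a different hook later started several transactions for one Git command.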
