CreateRepository cannot use atomic two-phase voting
The CreateRepository RPC calls git-init(1) on the repository's path regardless of whether the repository already exists. As a result, the call is idempotent and can safely be invoked on an existing repository. This raises the question of how to do transactional voting on this RPC with proper locking semantics, given that the repository can be in any state.
Ideally, all nodes perform the same change if and only if they all agree on the same state. The sequence should thus be:
- Record the state of all files we're about to change and create copies on which we can perform the changes.
- Perform our modifications.
- Lock the target files for update.
- Cast a vote on the changes, which is the current state of all of our locked files.
- If we fail to reach quorum, we must discard all changes and not perform any updates.
- If all nodes agree, we commit the changes to disk.
- We do another vote to assert that all nodes have performed the change.
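As a rough illustration of the voting step, the sketch below derives a vote from the staged contents of the files to be changed and checks whether enough nodes cast the same vote. The names `compute_vote` and `quorum_reached`, the use of SHA-256, and the threshold rule are assumptions made for this example and do not describe the existing implementation.

```python
import hashlib
from collections import Counter


def compute_vote(files: dict[str, bytes]) -> str:
    """Hash the staged contents of every file we are about to change."""
    digest = hashlib.sha256()
    # Sort by name so every node hashes the files in the same order.
    for name in sorted(files):
        digest.update(name.encode() + b"\0")
        digest.update(hashlib.sha256(files[name]).digest())
    return digest.hexdigest()


def quorum_reached(votes: list[str], threshold: int) -> bool:
    """Return True if at least `threshold` nodes cast an identical vote."""
    if not votes:
        return False
    _, count = Counter(votes).most_common(1)[0]
    return count >= threshold
```

Two nodes that stage identical file contents produce the same vote regardless of the order in which they enumerate the files; any difference in a file name or its contents yields a different vote.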
This is really hard to implement for CreateRepository() given that git-init(1) would touch various different files.
If CreateRepository() raised an error when the repository already exists, this would be trivial to implement:
- Assert that the repo does not exist. If it does, return an error.
- Create a temporary directory and call git-init(1) on this directory.
- Lock the target directory (which still mustn't exist, so we'd just create a .lock file in its place to avoid concurrent creation).
- Hash the temporary directory's contents and use it as the vote.
- If we fail to reach quorum, we can just discard the temporary directory.
- If all nodes agree, move the directory into place.
- Do another vote to assert we have committed the change.
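The steps above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `create_repository`, `hash_tree`, the `vote(phase, digest)` callback, and the `"prepare"`/`"commit"` phase names are all hypothetical, and the sketch assumes the temporary directory lives next to the target so that the final rename stays on one filesystem.

```python
import hashlib
import os
import shutil
import subprocess
import tempfile


def git_init(path):
    # Assumes git is available on PATH.
    subprocess.run(["git", "init", "--quiet", path], check=True)


def hash_tree(root):
    """Hash relative paths and contents of all files (empty dirs are ignored)."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            digest.update(os.path.relpath(full, root).encode() + b"\0")
            with open(full, "rb") as fh:
                digest.update(hashlib.sha256(fh.read()).digest())
    return digest.hexdigest()


def create_repository(path, vote, init=git_init):
    """Create the repository at `path`, failing if it already exists."""
    if os.path.exists(path):
        raise FileExistsError(path)
    parent = os.path.dirname(os.path.abspath(path))
    lock = path + ".lock"
    tmp = tempfile.mkdtemp(prefix="repo-", dir=parent)
    locked = False
    try:
        init(tmp)
        # O_EXCL makes this fail if a concurrent creator already holds the lock.
        os.close(os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
        locked = True
        if os.path.exists(path):  # re-check now that we hold the lock
            raise FileExistsError(path)
        digest = hash_tree(tmp)
        if not vote("prepare", digest):
            raise RuntimeError("failed to reach quorum")
        os.rename(tmp, path)  # move the prepared directory into place
        if not vote("commit", digest):
            raise RuntimeError("nodes disagree after commit")
        return digest
    finally:
        # Discards the temporary directory on failure; no-op after the rename.
        shutil.rmtree(tmp, ignore_errors=True)
        if locked:
            os.remove(lock)
```

Passing `init` as a parameter is only a convenience for exercising the flow without invoking git; a caller would supply a `vote` callback that talks to whatever voting mechanism is in use.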
This would be trivial to implement and is guaranteed to be atomic. This raises the question of whether we want to change the semantics of this RPC so that it is no longer idempotent, or whether there are other alternatives we can consider.