Generalize transaction conflict checking
The `TransactionManager`'s conflict checks are currently inconsistent:
- Some operations actually verify conflicts. For example, we check that two transactions don't concurrently create a repository. Allowing both would violate serializability, as one of the transactions should fail with an 'already exists' error.
- Some operations only verify consistency. For example, reference updates are only checked for consistency: the repository could be deleted and recreated by concurrent transactions, and we would still apply the reference updates as long as they apply on a logical level, i.e. the old OID is what is expected and the objects needed by the new OID exist in the repository.
Validating the changes of transactions is unnecessary. We can assume that transactions that work correctly in isolation do not stage invalid state. That leaves only conflict checking.
We lack conflict checking in many aspects and can let invalid transactions through. For example (#5862), we allow a concurrent transaction to remove a repository that another transaction is linking as an alternate of another repository. This can lead to repository corruption.
With the introduction of the physical logging protocol (#5793 (closed)), we're recording the file writes that are being performed. This allows us to generalize the conflict checking to ensure it covers everything. In short:
- If two transactions read the same file, there is no conflict.
- If one transaction reads a file and another transaction modifies the same file, there is a conflict.
- If two transactions modify the same file, there is a conflict.
The above checks are enough to guarantee serializability.
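As an illustration, these rules boil down to the standard shared/exclusive lock compatibility matrix. Here's a minimal sketch in Go, with a hypothetical `lockTable` keyed by file path; none of these names come from Gitaly's actual code:

```go
package conflict

import "errors"

// lockMode distinguishes shared (read) locks from exclusive (write) locks.
type lockMode int

const (
	lockShared lockMode = iota
	lockExclusive
)

var errConflict = errors.New("conflicting concurrent transaction")

// lockTable tracks the locks concurrent transactions hold per file path.
type lockTable struct {
	locks map[string][]lockMode
}

// acquire applies the conflict rules above: two shared locks on the same
// path are compatible; an exclusive lock conflicts with any other lock.
func (t *lockTable) acquire(path string, mode lockMode) error {
	for _, held := range t.locks[path] {
		if held == lockExclusive || mode == lockExclusive {
			return errConflict
		}
	}
	if t.locks == nil {
		t.locks = map[string][]lockMode{}
	}
	t.locks[path] = append(t.locks[path], mode)
	return nil
}
```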
The physical logging protocol records all of the writes, so checking for write conflicts is trivial. Creating or deleting a file acquires an exclusive lock on the target.
We're not recording the reads, so checking for read/write conflicts is more involved. While we generally don't have access to the reads that Git does, we can synthesize some read locks on files from the writes.
- Acquire shared read locks on all parent components of a path. For example, writing `@hashed/xx/yy/repo.git/config` will acquire shared locks on `@hashed`, `@hashed/xx`, `@hashed/xx/yy` and `@hashed/xx/yy/repo.git`, and an exclusive lock on `@hashed/xx/yy/repo.git/config` (see the sketch after this list).
- Writing `objects/info/alternates` acquires a shared lock on the alternate's `objects/info/alternates`. This prevents concurrent modifications to the target's alternates after the transaction has checked it.
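A rough sketch of deriving that lock set from a write's path, assuming paths relative to the storage root; the helper and types are illustrative, not Gitaly's actual API:

```go
package conflict

import "path/filepath"

// pathLock pairs a path with the kind of lock to take on it.
type pathLock struct {
	path      string
	exclusive bool
}

// locksForWrite derives the lock set for a single file write: shared
// locks on every parent component of the path and an exclusive lock on
// the written file itself.
func locksForWrite(relativePath string) []pathLock {
	var locks []pathLock
	for dir := filepath.Dir(relativePath); dir != "." && dir != "/"; dir = filepath.Dir(dir) {
		locks = append(locks, pathLock{path: dir})
	}
	return append(locks, pathLock{path: relativePath, exclusive: true})
}
```

For the example above, `locksForWrite("@hashed/xx/yy/repo.git/config")` yields shared locks on the four parent directories and an exclusive lock on the config file itself.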
File-level locks are general enough to cover everything, and provide a good base to build on. Some conflicts, such as file writes that logically depend on other files, are better handled logically. For example, concurrently referencing an object and pruning it could be prevented by acquiring a shared lock on the packfile containing the object. This is problematic, though, as we don't record the reads Git does and thus don't automatically know where the object was. Doing this on the file level would also prevent repacking from running concurrently with any object reads, even though the operations don't conflict. As such, we're better off conflict checking operations involving objects on a logical level.
Writing objects is idempotent, so there's no need to check object writes for conflicts, and the same goes for repacking. Object pruning, however, may remove an object that is being referenced concurrently, so pruning is the only object operation we need to conflict check. Objects may be removed from the repository through pruning or through the repository being disconnected from an alternate. Checking for conflicts is easy: verify that the objects the concurrent transactions depended on still exist in the repository after applying the pruning operations.
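A minimal sketch of that logical check, assuming we can enumerate the objects a prune removes and the objects concurrent transactions depended on; the names are hypothetical:

```go
package conflict

import "fmt"

// verifyPrune implements the logical check sketched above: pruning
// conflicts with a concurrent transaction if it removes an object that
// transaction depended on. Both sets are keyed by object ID.
func verifyPrune(prunedObjects, concurrentDependencies map[string]struct{}) error {
	for oid := range concurrentDependencies {
		if _, pruned := prunedObjects[oid]; pruned {
			return fmt.Errorf("pruned object %s is still depended on by a concurrent transaction", oid)
		}
	}
	return nil
}
```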
The current ref backend can be conflict checked using file-level locks. Reference packing would remove loose references, which creates file-level conflicts with creations and updates. These could be handled by dropping the removal operations if the files were concurrently updated.
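One way that could look, with illustrative types for the removal operations staged by packing:

```go
package conflict

// refRemoval describes one loose reference file deletion staged by
// reference packing.
type refRemoval struct {
	refPath string // e.g. "refs/heads/main"
}

// dropConflictingRemovals filters out the deletions whose loose
// reference file a concurrent transaction updated: the packed value is
// simply shadowed by the newer loose file instead of conflicting.
func dropConflictingRemovals(removals []refRemoval, concurrentlyUpdated map[string]bool) []refRemoval {
	var kept []refRemoval
	for _, removal := range removals {
		if !concurrentlyUpdated[removal.refPath] {
			kept = append(kept, removal)
		}
	}
	return kept
}
```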
(The eventual reftable backend will require logical conflict checking of reference updates. This will be quite trivial as all that needs to be done is to ensure the old value is still the same after all concurrent transactions.)
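For completeness, a sketch of that old-value check, with a hypothetical signature:

```go
package conflict

import "fmt"

// verifyRefUpdate sketches the reftable-style logical check: the update
// is valid only if the reference still holds the expected old value
// after all concurrent transactions have committed.
func verifyRefUpdate(reference, expectedOldOID, currentOID string) error {
	if currentOID != expectedOldOID {
		return fmt.Errorf("reference %s changed concurrently: expected %s, found %s",
			reference, expectedOldOID, currentOID)
	}
	return nil
}
```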
So that leaves us with:
- File-level conflict checks as the default
- Exclusive locks on targets of writes
- Shared locks on the parent directories
- Read locks on some files we know were read during an operation, such as `objects/info/alternates`
- Logical checks to ease some conflicts that the file-level checks would surface:
- Object pruning conflict checked logically
- Reference repacking should support dropping conflicting loose reference file deletions.
The above generalizes and simplifies the conflict checks. The general approach better ensures we don't miss necessary conflict checks. On top of correctness, we can move the transactions to be fully staged before the `TransactionManager`'s critical section and improve performance, as we now have proper conflict checks in place to notice conflicting changes staged by the transactions. With the conflict checks generalized, we can conflict check arbitrary transactions, which gives us more flexibility in staging transactions.