Pack references in a transaction
The current housekeeping approach works on the repository concurrently with the TransactionManager. This is not acceptable, as the TransactionManager is expected to be the single writer into the repository. We thus need a different method for packing a repository's references, one that is synchronized with all other access.
We should move the reference packing to happen in a transaction. The reference packing can then happen in isolation with its own snapshot. The resulting changes can be committed through the TransactionManager for logging and applying. As we are packing the references in a snapshot, we have to check for conflicts at commit time. Having multiple reference packing operations in flight is wasteful, so there should only ever be one packing operation in flight per repository.
Reference packing deletes the loose references and moves them into the packed-refs file. This gives us two types of changes we need to log:
- The new packed-refs file
- The loose references to delete
Logging those changes and applying them to the repository from the log would leave the main repository in the same state as the snapshot, ignoring any concurrent transactions.
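To make the two logged changes concrete, here is a minimal sketch of what such a log entry and its application could look like. The `PackRefsEntry` type, its field names, the `Apply` method, and the `packed-refs.new` temporary file name are all hypothetical and chosen for illustration; they are not the TransactionManager's actual log format. The sketch installs the new packed-refs file before deleting loose references, so that a pruned reference stays resolvable throughout.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// oid is a placeholder object ID used only by the example below.
const oid = "1111111111111111111111111111111111111111"

// PackRefsEntry is a hypothetical log entry for one reference packing run.
type PackRefsEntry struct {
	// PackedRefs is the full content of the new packed-refs file.
	PackedRefs []byte
	// PrunedRefs lists the loose references to delete, e.g. "refs/heads/main".
	PrunedRefs []string
}

// Apply replays the entry against the repository at repoPath.
func (e PackRefsEntry) Apply(repoPath string) error {
	// Install the new packed-refs first so that every pruned reference
	// remains resolvable while its loose file is being removed.
	tmp := filepath.Join(repoPath, "packed-refs.new")
	if err := os.WriteFile(tmp, e.PackedRefs, 0o644); err != nil {
		return fmt.Errorf("write packed-refs: %w", err)
	}
	if err := os.Rename(tmp, filepath.Join(repoPath, "packed-refs")); err != nil {
		return fmt.Errorf("install packed-refs: %w", err)
	}
	for _, ref := range e.PrunedRefs {
		err := os.Remove(filepath.Join(repoPath, filepath.FromSlash(ref)))
		// A missing file is tolerated so the entry can be replayed.
		if err != nil && !errors.Is(err, fs.ErrNotExist) {
			return fmt.Errorf("prune %s: %w", ref, err)
		}
	}
	return nil
}

// runExample applies an entry to a throwaway directory. It reports whether
// the loose reference was removed and returns the resulting packed-refs.
func runExample() (bool, string, error) {
	repo, err := os.MkdirTemp("", "repo")
	if err != nil {
		return false, "", err
	}
	defer os.RemoveAll(repo)

	loose := filepath.Join(repo, "refs", "heads", "main")
	if err := os.MkdirAll(filepath.Dir(loose), 0o755); err != nil {
		return false, "", err
	}
	if err := os.WriteFile(loose, []byte(oid+"\n"), 0o644); err != nil {
		return false, "", err
	}

	entry := PackRefsEntry{
		PackedRefs: []byte(oid + " refs/heads/main\n"),
		PrunedRefs: []string{"refs/heads/main"},
	}
	if err := entry.Apply(repo); err != nil {
		return false, "", err
	}

	_, statErr := os.Stat(loose)
	packed, err := os.ReadFile(filepath.Join(repo, "packed-refs"))
	if err != nil {
		return false, "", err
	}
	return errors.Is(statErr, fs.ErrNotExist), string(packed), nil
}

func main() {
	removed, packed, err := runExample()
	fmt.Println(removed, packed, err)
}
```

The sketch leaves out syncing to disk, locking, and cleanup of emptied reference directories, all of which a real implementation would likely need.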
Before committing the log entry, we have to check for conflicting concurrent reference changes:
- Reference creations are not problematic. They wouldn't be recorded in the transaction for deletion and are not included in the pack.
- Reference updates conflict with the loose reference deletion. The new updated reference would always be stored as a loose reference, so the conflict can be handled by dropping the loose reference deletion from the log entry. The pack may contain an old value of the reference but the new loose reference shadows it.
- Reference deletions are problematic. The packed-refs file would contain an old value of a deleted reference. As there is no loose reference to shadow it, the reference would come back with the old value in the packed-refs file. The transaction must be aborted and then retried.
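The three rules above can be sketched as a single check that runs before the log entry is committed. This is only an illustration: `ChangeKind`, `ResolveConflicts`, and `ErrConflictingDeletion` are hypothetical names, and the sketch assumes the caller can supply the net reference changes that were committed after the packing transaction took its snapshot. Following the rule as stated, it aborts on any concurrent deletion, which is conservative.

```go
package main

import (
	"errors"
	"fmt"
)

// ChangeKind classifies what a concurrent transaction did to a reference.
type ChangeKind int

const (
	Created ChangeKind = iota
	Updated
	Deleted
)

// ErrConflictingDeletion signals that packing must be aborted and retried.
var ErrConflictingDeletion = errors.New("reference deleted concurrently")

// ResolveConflicts takes the loose references the packing transaction wants
// to delete and the reference changes committed since its snapshot. It
// returns the deletions that are still safe to log, or an error if the
// packing transaction has to be aborted.
func ResolveConflicts(pruned []string, concurrent map[string]ChangeKind) ([]string, error) {
	// A concurrent deletion would be undone by the stale value in the new
	// packed-refs file, so the whole packing run is aborted.
	for ref, kind := range concurrent {
		if kind == Deleted {
			return nil, fmt.Errorf("%w: %s", ErrConflictingDeletion, ref)
		}
	}

	kept := make([]string, 0, len(pruned))
	for _, ref := range pruned {
		// An updated reference now lives in a new loose file that shadows
		// the packed value, so that file must not be deleted.
		if kind, ok := concurrent[ref]; ok && kind == Updated {
			continue
		}
		kept = append(kept, ref)
	}
	// Created references need no handling: they were never part of the
	// snapshot, so they are neither packed nor scheduled for deletion.
	return kept, nil
}

func main() {
	kept, err := ResolveConflicts(
		[]string{"refs/heads/a", "refs/heads/b"},
		map[string]ChangeKind{"refs/heads/a": Updated, "refs/heads/c": Created},
	)
	fmt.Println(kept, err) // prints "[refs/heads/b] <nil>"

	_, err = ResolveConflicts(
		[]string{"refs/heads/a"},
		map[string]ChangeKind{"refs/heads/a": Deleted},
	)
	fmt.Println(errors.Is(err, ErrConflictingDeletion)) // prints "true"
}
```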
The above should allow for packing references concurrently in a transaction and handling conflicts which may arise.
The TransactionManager should track the number of loose references in the repository and trigger reference packing as needed.
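One way the tracking and triggering could work is sketched below. The `packScheduler` type, its methods, and the threshold value are assumptions made for illustration; the text does not specify when packing should be considered necessary. The sketch also enforces the limit of one packing operation in flight per repository, and assumes it is only called from the TransactionManager's single writer, so it does no locking of its own.

```go
package main

import "fmt"

// packScheduler is a hypothetical per-repository tracker of loose references.
type packScheduler struct {
	threshold int  // loose reference count at which packing is triggered
	looseRefs int  // current estimate of loose references
	inFlight  bool // whether a packing transaction is currently running
}

// Observe adjusts the loose reference count by delta after a committed
// transaction and reports whether a packing transaction should start now.
// It returns true at most once until Done is called.
func (s *packScheduler) Observe(delta int) bool {
	s.looseRefs += delta
	if s.looseRefs < 0 {
		s.looseRefs = 0
	}
	if s.inFlight || s.looseRefs < s.threshold {
		return false
	}
	s.inFlight = true
	return true
}

// Done records the end of a packing transaction. pruned is the number of
// loose references it deleted, or 0 if it was aborted. After an abort the
// count is unchanged, so the next Observe call can trigger a retry.
func (s *packScheduler) Done(pruned int) {
	s.inFlight = false
	s.looseRefs -= pruned
	if s.looseRefs < 0 {
		s.looseRefs = 0
	}
}

func main() {
	s := &packScheduler{threshold: 3}
	fmt.Println(s.Observe(2)) // below the threshold
	fmt.Println(s.Observe(1)) // threshold reached, packing starts
	fmt.Println(s.Observe(5)) // packing already in flight
	s.Done(3)                 // packing committed and pruned 3 references
	fmt.Println(s.looseRefs)
}
```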