Add pack-refs housekeeping task support to the transaction manager

This MR adds support for the pack-refs housekeeping task to the transaction manager. A caller invokes (*Transactionmanager).PackRefs() to request a pack-refs task. The task spans several stages of the transaction:

  • When the transaction is committed, the manager runs the git-pack-refs command against the snapshot repository. Because git-pack-refs doesn't output the list of pruned refs, we need to walk the references twice to collect the list of pruned loose refs: once before and once after running git-pack-refs. This can be improved by git#222. The resulting packed-refs file is collected and attached to the transaction's staging directory.
  • When the transaction is admitted, the manager verifies whether the result of git-pack-refs conflicts with the current repository state. If there is no conflict, the manager appends a log entry with the corresponding housekeeping sub-entry. The transaction's staging directory is then moved to the WAL entry's location.
  • When the log entry is applied, the manager copies the packed-refs file to the repository via hard-linking and removes all associated loose references.
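The "walk twice" step from the first bullet can be sketched as follows. This is a minimal illustration, not the code in this MR: `listLooseRefs` and `prunedRefs` are hypothetical helper names, and the sketch assumes loose references live as plain files under `refs/` in the snapshot repository.

```go
package main

import (
	"io/fs"
	"path/filepath"
	"sort"
)

// listLooseRefs is a hypothetical helper. It walks <repoPath>/refs and
// returns the set of loose reference names, e.g. "refs/heads/main".
func listLooseRefs(repoPath string) (map[string]struct{}, error) {
	refs := map[string]struct{}{}
	root := filepath.Join(repoPath, "refs")
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			// Directories are namespaces, not references.
			return nil
		}
		rel, err := filepath.Rel(repoPath, path)
		if err != nil {
			return err
		}
		refs[filepath.ToSlash(rel)] = struct{}{}
		return nil
	})
	return refs, err
}

// prunedRefs returns the references that were loose before git-pack-refs
// ran but are gone afterwards, in sorted order.
func prunedRefs(before, after map[string]struct{}) []string {
	var pruned []string
	for ref := range before {
		if _, ok := after[ref]; !ok {
			pruned = append(pruned, ref)
		}
	}
	sort.Strings(pruned)
	return pruned
}
```

Under these assumptions, a caller would run `listLooseRefs` on the snapshot repository, run git-pack-refs, run `listLooseRefs` again, and record `prunedRefs(before, after)` alongside the packed-refs file.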

When a housekeeping task is performed, other concurrent transactions may commit before the housekeeping task commits. We do our best to merge the results and avoid conflicts. To do so, the transaction manager needs to keep appended log entries around. Before accepting a transaction, the transaction manager scans the log entries between the transaction's snapshot LSN and the latest appended log entry for conflicts. Kept-around log entries are structured as a linked list. Entries at the head of the list are removed once no transaction refers to their snapshot repositories.
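One way to picture the kept-around list is the sketch below. It is only an illustration under assumed names (`keptEntry`, `keptEntries`, `acquire`, `release`, `after` are all hypothetical) and uses the standard `container/list` package; the MR's real data structure may differ.

```go
package main

import "container/list"

// keptEntry is a hypothetical record for one kept-around log position.
type keptEntry struct {
	lsn     uint64 // LSN of the appended log entry
	readers int    // open transactions whose snapshot is at this LSN
}

// keptEntries keeps entries ordered by LSN, oldest at the front.
type keptEntries struct{ l *list.List }

func newKeptEntries() *keptEntries { return &keptEntries{l: list.New()} }

// acquire registers a transaction whose snapshot is taken at lsn.
func (k *keptEntries) acquire(lsn uint64) *list.Element {
	if back := k.l.Back(); back != nil && back.Value.(*keptEntry).lsn == lsn {
		back.Value.(*keptEntry).readers++
		return back
	}
	return k.l.PushBack(&keptEntry{lsn: lsn, readers: 1})
}

// appendCommitted records a newly appended log entry.
func (k *keptEntries) appendCommitted(lsn uint64) {
	k.l.PushBack(&keptEntry{lsn: lsn})
}

// release drops one reader, then removes entries from the front of the
// list for as long as no transaction refers to them.
func (k *keptEntries) release(e *list.Element) {
	e.Value.(*keptEntry).readers--
	for front := k.l.Front(); front != nil; front = k.l.Front() {
		if front.Value.(*keptEntry).readers > 0 {
			break
		}
		k.l.Remove(front)
	}
}

// after lists the LSNs a transaction must scan for conflicts: every kept
// entry appended after its snapshot LSN.
func (k *keptEntries) after(snapshotLSN uint64) []uint64 {
	var lsns []uint64
	for e := k.l.Front(); e != nil; e = e.Next() {
		if lsn := e.Value.(*keptEntry).lsn; lsn > snapshotLSN {
			lsns = append(lsns, lsn)
		}
	}
	return lsns
}
```

In this sketch an entry survives as long as any older entry still has readers, which is what lets a long-running transaction scan everything committed since its snapshot.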

There is no automatic conflict resolution. Each housekeeping task needs to resolve conflicts case by case. The pack-refs task is compatible with ref creations and ref updates because new ref values are kept as loose references, so the transaction manager can freely replace the packed-refs file. In contrast, it is not compatible with other housekeeping tasks or ref deletions.

Ref deletion is special. Deleting a ref removes the loose ref file and the entry inside the packed-refs file. There is no tombstone to shadow the entry in the packed-refs file. In theory, we can modify the packed-refs file to remove the entries. However, that approach requires parsing and rewriting the file. We can do that in a later iteration or wait for the reftables backend to land. For now, the transaction manager settles for raising a conflict error in this case.
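The compatibility rules above could be checked roughly like this. The types and names (`logEntry`, `refChange`, `verifyPackRefs`, `errHousekeepingConflict`) are hypothetical stand-ins for illustration, not the MR's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// errHousekeepingConflict is a hypothetical sentinel error.
var errHousekeepingConflict = errors.New("housekeeping conflict")

// refChange is a hypothetical description of one reference change.
type refChange struct {
	name    string
	deleted bool // true for a deletion, false for a creation or update
}

// logEntry is a hypothetical, simplified view of an appended log entry.
type logEntry struct {
	lsn          uint64
	refChanges   []refChange
	housekeeping bool // the entry carries a housekeeping sub-entry
}

// verifyPackRefs scans the entries committed after the pack-refs
// transaction's snapshot. Creations and updates are accepted because they
// stay as loose references; deletions and other housekeeping tasks are
// reported as conflicts.
func verifyPackRefs(concurrent []logEntry) error {
	for _, entry := range concurrent {
		if entry.housekeeping {
			return fmt.Errorf("%w: concurrent housekeeping at LSN %d",
				errHousekeepingConflict, entry.lsn)
		}
		for _, change := range entry.refChanges {
			if change.deleted {
				return fmt.Errorf("%w: %q deleted at LSN %d",
					errHousekeepingConflict, change.name, entry.lsn)
			}
		}
	}
	return nil
}
```

Returning an error on the first deletion mirrors the "raise a conflict" choice described above; rewriting the packed-refs file to drop the deleted entries would be the alternative left for a later iteration.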

This MR is part of the WAL epic, so it does not take broadcasting changes into account. It therefore uses hard-linking to move the packed-refs file around. When transmitting data to other nodes, the packed-refs file could be too big to send as an attachment. In that case, we can calculate the diff and reconstruct the packed-refs file on the receivers. But let's not dig too deep into that at this stage.

Edited by Quang-Minh Nguyen