Add WAL transaction to the housekeeping manager
This MR integrates the WAL transaction system with the housekeeping manager. The manager now attempts to extract a transaction from the provided context. If a transaction is found, the housekeeping manager delegates to the separate housekeeping implementation in the transaction manager.
For every RPC request reaching Gitaly, a middleware initiates a transaction and injects it into the request's context. As a result, the housekeeping manager can pick up the transaction whenever an RPC handler invokes housekeeping.
Gitaly also starts a cron-like housekeeping scheduler. This scheduler walks through the list of repositories and triggers housekeeping randomly. Because these invocations do not originate from an RPC request, no middleware injects a transaction for them. Thus, this MR also adds transaction initialization to the scheduler.
When a transaction is present, the housekeeping manager skips some housekeeping tasks that are deemed unnecessary once transactions are enabled. The two significant ones are:
- Pruning objects. The traditional housekeeping manager prunes all objects exceeding a grace period. The grace period ensures an object is not removed while another request is still using it. WAL handles parallel requests in a very different way, and unreachable objects are not included. Eventually, they'll be removed by a full repack. Thus, there is no need to run pruning independently.
- Cleaning stale data. A WAL transaction copies only meaningful data over when it is applied to the destination repository. All temporary files, directories, locks, etc. are removed. Thus, there should be no stale data.
Manual tests
I ran multiple manual test scenarios on GDK with transactions enabled.
What's not in this MR?
- ✅ Full repack timestamp: there is a pending MR for it, "Write full repack timestamp file after applying..." (!6708 - merged). That timestamp is used to determine when the next full repack should run.
- ⌛ Logs and metrics are only half-supported. They tell whether a task succeeded, but not the precise timing. This will be addressed in "Add housekeeping metrics to transaction manager" (!6771 - merged).
- ⌛ The FetchIntoObjectPool RPC fetches new objects from the origin repository into the object pool. Afterward, it triggers a full set of housekeeping tasks. Earlier commits in this MR enabled WAL transaction support in the housekeeping manager, so the manager initiates its own WAL transaction and commits it when done. As a result, FetchIntoObjectPool ends up with two non-nested transactions. This will be fixed in "Transactionize FetchIntoObjectPool RPC manually" (!6775 - merged).