Fix a race in transaction manager log entry retention (!6791) · Merge requests · GitLab.org / Gitaly

Closes #5814 (closed)

TransactionManager has an issue where unneeded log entries may not be pruned if a transaction is aborted. This is because of how current log entry retention works. We keep some log entries that are still referred to by other transactions. Each log entry has a reference counter. When a transaction finishes, it reduces the target entry's counter and signals the manager to wake up to clean up stale entries via an unbuffered channel. There are two problems here:

All transactions send a signal to this channel even if that transaction does not necessarily clean up any log entry. The list of keeparound entries is structured as a linked list. It removes leading empty entries only.
If the transaction signals the channel when the manager is not listening, the signal is ignored.

The two problems could lead to the accumulation of unneeded log entries on the disk. Consider the following scenario:

Begin TX1
Begin TX2
Commit TX1 - The channel is signaled
The manager begins cleaning up TX1
Rollback TX2 - The channel is not listened by the manager. The signal is ignored.
The manager finishes cleaning up and starts processing new transactions. It is not aware that the log entry TX2 is based on should be removed.

Although the log entry will be cleaned up after Gitaly restarts, the log entries might accumulate for days. To resolve this race, this commit proposes two changes:

Convert the signaling queue to a buffered channel
Only signal the channel if the transaction modifies the head of the in-memory list.

As a result, the above scenario becomes:

Begin TX1
Begin TX2
Commit TX1 - The channel is signaled.
The manager begins cleaning up TX1. The channel is cleaned up.
Rollback TX2 - The channel is signaled.
The manager finishes cleaning up TX1.
It receives the last message from the buffered channel and starts to clean up TX2.

If another transaction (TX3 for example) signals to that channel when it's full, it means the manager still has not cleaned up. That's perfectly fine to let TX3 move on. The next time the manager wakes up, it scans the head of the list, so it will clean up all empty log entries that were referred by TX1, TX2, and TX3.

If the manager crashes in between, the unneeded log entries are still removed when it wakes up.

Because the bug is a race, it's not possible to write a reliable test to cover that case.

Edited Apr 02, 2024 by Quang-Minh Nguyen

Fix a race in transaction manager log entry retention

Merge request reports