internal/gitaly/storage/storagemgr/transaction_manager.go · 3d4eb30e6bb7c3563e00239d6edcf8c41b4a0618 · GitLab.org / gitaly

Fix a race in transaction manager log entry retention · 0c6a1788

Quang-Minh Nguyen authored Apr 01, 2024

TransactionManager has an issue where unneeded log entries may not be
pruned if a transaction is aborted. This is because of how current log
entry retention works. We keep some log entries that are still referred
by other transactions. Each log entry has a reference counter. When a
transaction finishes, it reduces the target entry's counter and signals
the manager to wake up to clean up stale entries via an unbuffered
channel. There are two problems here:
* All transactions send a signal to [this channel](https://gitlab.com/gitlab-org/gitaly/-/blob/6abde2303696a63f9eec85365b144c3b208d8043/internal/gitaly/storage/storagemgr/transaction_manager.go#L329)
even if that
transaction does not necessarily clean up any log entry. The list of
keeparound entries is structured as a linked list. It removes leading
empty entries only.
* If the transaction signals the channel when the manager is not
listening, the signal is ignored.

The two problems could lead to the accumulation of unneeded log entries
on the disk. Consider the following scenario:
* Begin TX1
* Begin TX2
* Commit TX1 - The channel is signaled
* The manager begins cleaning up TX1
* Rollback TX2 - The channel is not listened by the manager. The signal
is ignored.
* The manager finishes cleaning up and starts processing new
transactions. It is not aware that the log entry TX2 is based on
should be removed.

Although the log entry will be cleaned up after Gitaly restarts, the log
entries might accumulate for days. To resolve this race, this commit
proposes two changes:
* Convert the signaling queue to a buffered channel
* Only signal the channel if the transaction modifies the head of the
in-memory list.

As a result, the above scenario becomes:
* Begin TX1
* Begin TX2
* Commit TX1 - The channel is signaled.
* The manager begins cleaning up TX1. The channel is cleaned up.
* Rollback TX2 - The channel is signaled.
* The manager finishes cleaning up TX1.
* It receives the last message from the buffered channel and starts to
clean up TX2.

If another transaction (TX3 for example) signals to that channel when
it's full, it means the manager still has not picked up the signal.
That's perfectly fine to let TX3 move on. The next time the manager
wakes up, it scans the head of the list, so it will clean up all empty
log entries that were referred by TX1, TX2, and TX3.

If the manager crashes in between, the unneeded log entries are still
removed when it wakes up.

For #5814

0c6a1788