Implement basic support for repository migrations
Gitaly has various housekeeping tasks that remove leftover temporary files from repositories, such as loose reference locks and temporary object files. With transactions, temporary state is removed together with the snapshot, so no new leftover state will be produced. Existing leftover state may still cause issues, though: locks already present in the repository could be included in the snapshot and cause problems. The housekeeping tasks will no longer clean these temporary files up, because the cleanup targets the snapshot.
We should implement basic support for running migrations on repositories:
- Gitaly would keep an ordered list of repository migrations.
- It would record for each repository the last migration that was successfully run.
- When a transaction on a repository is being started, Gitaly checks whether all of the migrations have been run on the repository. If not, it runs the missing migrations in order, one by one.
- Once all migrations have been applied to the accessed repositories, the transaction is allowed to begin.
Migrations are regular Go functions. They'll perform their changes in a transaction on the repository. The changes are committed through the WAL so they get backed up and replicated.
When a new repository is created, it's recorded as having applied all migrations. This avoids having to run all past migrations on new repositories and allows us to create the new state directly as expected.
The above approach should facilitate the first migration in Handle leftover temporary state in repo from be... (#5737) to remove the leftover state.
The log entries produced by migrations could be modeled in two ways:
- Record the file modifications to be done. For removing stale state, it would be enough to record the exact files and directories to remove. The log entry would be replicated, and each replica would apply the changes.
- Record a log entry instructing the migration to be applied. Each replica performs the migration locally when applying the log entry.
The second approach would mean less data to replicate if the migration performs larger changes.