
Backup applied log entries

Will Chandler requested to merge wc/backup-wal into master

The WAL provides us a deterministic way to restore a repository to a known state. To use this for disaster recovery, we need to store the applied entries externally before they are pruned.

Add a new LogEntryArchiver which will be notified of newly available entries, tar the entry directory, and write it to a backup.Sink, which may be either a local directory or object storage. It has a configurable number of worker goroutines to perform the backups.

Failures typically share a common cause, such as connectivity issues with object storage or overtaxed system resources. After a failure the archiver therefore waits before continuing, and the wait grows exponentially with each subsequent failure. There is no limit on the number of retries for an entry.

Each log entry is received from the LogManager and acknowledged to it in sequence, but entries may not finish backing up in that order due to retries or goroutine scheduling. An entry that finishes ahead of its turn is held by the archiver in a backlog until its LSN is reached, at which point it is acknowledged.

Should a successfully backed-up entry be resubmitted to the LogEntryArchiver, for example if the TransactionManager shut down before reading the AcknowledgeTransaction response, the archiver will acknowledge the transaction a second time without creating a new backup.

Entries are stored by <PARTITION_ID>/, which allows access to the entries of a given repository or fork cluster based on their partitionID.

Edited by Will Chandler
