
Synchronize snapshots across hosts

Luke Champine requested to merge seed-backups into master

High-level overview:

  • The renter now stores a list of each host that a snapshot is stored on. This list may be incomplete.
  • The renter spawns a background thread that ensures that each current host has each known snapshot.
  • The thread iterates through all snapshots and compares their host list to the set of current hosts.
  • If a snapshot is stored on all hosts, nothing is done; if it is stored on zero hosts, it is deleted.
  • For each host the snapshot is not listed on, we download that host's entry table and confirm that the snapshot is really missing (since our host list for the snapshot may be incomplete).
  • If the snapshot is indeed not present on the host, we upload it to that host.
  • Additionally, if the host's entry table contains snapshots we were not aware of, we add those snapshots to our local set.
  • After all snapshots have been processed in this way, we save the new snapshot set to disk.
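
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: `HostKey`, `SnapshotID`, `Host`, `SnapshotSet`, `memHost`, and `syncSnapshots` are hypothetical names, the in-memory host stands in for real host RPCs, and downloading the contents of newly discovered snapshots and saving the set to disk are left out.

```go
package main

// All identifiers here are hypothetical and chosen for illustration only.

type HostKey string
type SnapshotID string

// Host is the minimal interface the sync loop needs from a host.
type Host interface {
	EntryTable() ([]SnapshotID, error) // snapshots the host reports storing
	Upload(id SnapshotID) error        // store a snapshot on the host
}

// SnapshotSet maps each known snapshot to the hosts believed to store it.
// The host list may be incomplete.
type SnapshotSet map[SnapshotID]map[HostKey]bool

// memHost is an in-memory stand-in for a real host.
type memHost struct{ stored map[SnapshotID]bool }

func (h *memHost) EntryTable() ([]SnapshotID, error) {
	ids := make([]SnapshotID, 0, len(h.stored))
	for id := range h.stored {
		ids = append(ids, id)
	}
	return ids, nil
}

func (h *memHost) Upload(id SnapshotID) error {
	h.stored[id] = true
	return nil
}

// syncSnapshots performs one pass over the known snapshots and returns the
// updated set. Snapshots stored on zero current hosts are not carried over
// unless some fetched entry table lists them.
func syncSnapshots(known SnapshotSet, hosts map[HostKey]Host) SnapshotSet {
	next := make(SnapshotSet)
	record := func(id SnapshotID, hk HostKey) {
		if next[id] == nil {
			next[id] = make(map[HostKey]bool)
		}
		next[id][hk] = true
	}
	for id, storedOn := range known {
		// Keep only hosts that are still current.
		onCurrent := 0
		for hk := range hosts {
			if storedOn[hk] {
				record(id, hk)
				onCurrent++
			}
		}
		// Zero hosts: drop the snapshot. All hosts: nothing to do.
		if onCurrent == 0 || onCurrent == len(hosts) {
			continue
		}
		for hk, h := range hosts {
			if storedOn[hk] {
				continue
			}
			// Our list may be incomplete, so ask the host directly.
			table, err := h.EntryTable()
			if err != nil {
				continue // try this host again on a later pass
			}
			present := false
			for _, entry := range table {
				record(entry, hk) // also picks up snapshots we did not know about
				if entry == id {
					present = true
				}
			}
			if !present && h.Upload(id) == nil {
				record(id, hk)
			}
		}
	}
	return next
}
```

Note that this sketch fetches a host's entry table once per missing snapshot, which mirrors the "fetch the latest table" bias described below; caching the table for the duration of a pass would be one way to cut the redundant downloads.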

This process handles both replicating snapshots to new hosts and deleting unrecoverable snapshots. Note that "deleted" snapshots are not necessarily gone forever: if, on a later iteration, we see the snapshot listed in a host's entry table, we will re-download it; then, on the next iteration, we will replicate it to any hosts that do not already have it.

The current implementation is somewhat wasteful. Specifically, it will download entry tables more frequently than it may need to. This can be addressed without too much work; however, in general I feel it is good to err towards fetching the latest entry table from the host, rather than relying on local data, which may be outdated.

This MR is untested -- actually, it doesn't even spawn the new thread. We can do that once the basic algorithm has been approved.

Edited by Luke Champine
