
Support lazy failovers in `praefect dataloss`

Sami Hiltunen requested to merge smh-dataloss-lazy-failovers into master

With the recent failover changes, the output of `praefect dataloss` is no longer accurate. Previously, a repository was in read-only mode if its primary was outdated. With lazy failovers in place, checking only whether the current primary is outdated is no longer sufficient: if it is, Praefect switches the repository's primary on the next request, provided an up-to-date replica is available. This also means there is no 'read-only mode' anymore, as Praefect simply fails over to an up-to-date node rather than waiting for the current primary to be brought up to speed.

This commit updates the `dataloss` subcommand to take these changes into account:

  1. If an up-to-date replica of the repository is available, the repository is considered available for both reads and writes.
  2. If no up-to-date replica is available, the repository is considered unavailable. As it is, Praefect does not distribute writes to outdated replicas.
  3. To make it easier to determine why a repository is unavailable, 'unavailable' is printed next to each storage that the Praefect nodes, by consensus, consider unavailable.
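
The three rules above can be sketched as a small availability check. This is only an illustration in Python, not Praefect's implementation (Praefect is written in Go): the `Replica` type, its fields, the function names, the storage names, and the label format are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    storage: str    # storage name (hypothetical example values below)
    outdated: bool  # True if this replica is missing writes
    healthy: bool   # True if the Praefect nodes agree the storage is reachable


def repository_available(replicas):
    """Rules 1 and 2: available only if some replica is both up to date and healthy."""
    return any(r.healthy and not r.outdated for r in replicas)


def storage_labels(replicas):
    """Rule 3: tag each unhealthy storage with 'unavailable' (label format is illustrative)."""
    return {r.storage: ("" if r.healthy else "unavailable") for r in replicas}


if __name__ == "__main__":
    # The only up-to-date replica sits on an unreachable storage, so the
    # repository as a whole counts as unavailable.
    replicas = [
        Replica("gitaly-1", outdated=False, healthy=False),
        Replica("gitaly-2", outdated=True, healthy=True),
    ]
    print(repository_available(replicas))
    print(storage_labels(replicas))
```

In this sketch, a healthy but outdated replica does not make the repository available, which mirrors the point that writes are not distributed to outdated replicas.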

Related to: #3207 (closed)
Documentation: gitlab!62704 (merged)
Depends on: !3543 (merged)

Edited by Sami Hiltunen
