coordinator: Fix outdated repos not getting repljobs with transactions

When routing repository-scoped mutators, we have three sets of nodes: the primary node, the secondaries, and the replication targets, which are any nodes that are out of date or unhealthy at the time the RPC gets routed. In the non-transactional case, we always route the mutator to the primary and then create replication jobs for the secondaries and replication targets. When the RPC uses transactions, though, we route to both the primary and the secondaries at the same time and afterwards replicate to any secondary that failed.

Notably missing in the transactional case is the creation of replication jobs for replication targets. So right now, nodes which are not part of the transaction at all do not get repaired and thus stay unhealthy until a non-transactional RPC comes along and causes replication jobs to be created.

Fix the issue by always creating replication jobs for replication targets in the transactional case, too.
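
As a rough illustration of the intended behaviour, the following is a minimal, self-contained sketch and not the actual coordinator code. The `route` type, the `replicationJobTargets` function, the `secondaryOK` map and the storage names are all hypothetical and exist only for this example.

```go
package main

import "fmt"

// route is a hypothetical stand-in for the result of routing a
// repository-scoped mutator. It is not Praefect's actual type.
type route struct {
	Primary            string
	Secondaries        []string
	ReplicationTargets []string // out-of-date or unhealthy nodes at routing time
}

// replicationJobTargets returns the storages that need a replication job
// after the mutator has been handled. secondaryOK records, per secondary,
// whether it succeeded as part of the transaction; it is only consulted
// in the transactional case.
func replicationJobTargets(r route, transactional bool, secondaryOK map[string]bool) []string {
	var targets []string

	for _, s := range r.Secondaries {
		// Non-transactional: secondaries never received the RPC, so all of
		// them need a job. Transactional: only failed secondaries do.
		if !transactional || !secondaryOK[s] {
			targets = append(targets, s)
		}
	}

	// Replication targets were never part of the transaction, so they
	// always need a job, in the transactional case too.
	targets = append(targets, r.ReplicationTargets...)

	return targets
}

func main() {
	r := route{
		Primary:            "gitaly-1",
		Secondaries:        []string{"gitaly-2", "gitaly-3"},
		ReplicationTargets: []string{"gitaly-4"},
	}

	fmt.Println(replicationJobTargets(r, false, nil))
	fmt.Println(replicationJobTargets(r, true, map[string]bool{"gitaly-2": true, "gitaly-3": false}))
}
```

In this sketch, the replication target `gitaly-4` appears in the result regardless of whether the RPC was transactional, which is the property the fix is meant to establish.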

Edited by Patrick Steinhardt