Don't consider errors when deciding whether nodes are up to date after a transaction
getUpdatedAndOutdatedSecondaries decides in Praefect which nodes need replication jobs following a transaction and which do not. Currently the function contains quite a few conditions to cover various cases, and it also considers the errors the nodes return when deciding whether they need replication.
As a simple example of how the error checks can cause excess replication, suppose the primary and the secondaries all perform the same changes on disk. After performing the changes, the primary crashes and doesn't return a successful response to Praefect. The current logic would consider the secondaries outdated because the primary returned an error and the secondaries did not. This is unnecessary, as they all performed the same changes and now have the same state on disk.
To improve the situation, we should stop observing the returned error codes and decide whether replication is needed purely from the votes in transactions. We'd check each of the subtransactions that make up the full transaction. The basic idea is:
- If the primary cast a vote and a secondary voted differently, we'd consider the secondary outdated.
- If the primary didn't cast a vote, we don't have to consider the secondary outdated. The primary likely failed somehow.
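The two rules above could be sketched for a single subtransaction roughly as follows. This is only an illustration, not Praefect's actual code: the `Vote` type and the `outdatedSecondaries` function are hypothetical names, and treating a secondary that cast no vote as having "voted differently" is an assumption.

```go
package main

import "sort"

// Vote is a hypothetical stand-in for the vote a node casts in a
// subtransaction. An empty Vote means the node did not cast a vote.
type Vote string

// outdatedSecondaries applies the basic idea to one subtransaction and
// returns the names of the secondaries considered outdated, sorted.
func outdatedSecondaries(primary Vote, secondaries map[string]Vote) []string {
	if primary == "" {
		// The primary didn't cast a vote, so it likely failed somehow.
		// No secondary is considered outdated because of this subtransaction.
		return nil
	}

	var outdated []string
	for name, vote := range secondaries {
		// Assumption: a secondary without a vote counts as voting differently.
		if vote != primary {
			outdated = append(outdated, name)
		}
	}

	// Map iteration order is random in Go, so sort for a stable result.
	sort.Strings(outdated)
	return outdated
}
```

Note that the errors returned by the nodes do not appear anywhere in the sketch; only the votes are inspected.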
We'd have to differentiate between the votes coming from the prepared vs the committed phase of the reference transaction. If the primary didn't cast a vote during the prepared phase, the changes wouldn't go through. As long as the secondaries have matched the primary's votes during all previous subtransactions, they should be fully up to date. If there's a committed subtransaction that corresponds to the prepared phase, we'd only consider the nodes up to date if the primary votes on the committed phase and the secondaries have voted the same. This is because we don't know whether the primary made the changes or not before it confirms it in the committed phase.
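A rough sketch of how the phase distinction might be evaluated across the subtransactions of one transaction is shown below. The `Phase`, `Subtransaction`, and `upToDateSecondaries` names are hypothetical and do not come from Praefect. The sketch also assumes that evaluation stops at the first subtransaction the primary did not vote on, and that a secondary without a vote counts as not matching the primary.

```go
package main

// Phase is a hypothetical marker for the reference transaction phase
// a subtransaction's votes belong to.
type Phase int

const (
	Prepared Phase = iota
	Committed
)

// Subtransaction holds the votes cast in one subtransaction, keyed by
// node name. A node that is absent from Votes did not cast a vote.
type Subtransaction struct {
	Phase Phase
	Votes map[string]string
}

// upToDateSecondaries reports, per secondary, whether it can be considered
// up to date based purely on the votes, ignoring any returned errors.
func upToDateSecondaries(primary string, secondaries []string, subs []Subtransaction) map[string]bool {
	upToDate := make(map[string]bool, len(secondaries))
	for _, s := range secondaries {
		upToDate[s] = true
	}

	for _, sub := range subs {
		primaryVote, voted := sub.Votes[primary]
		if !voted {
			if sub.Phase == Committed {
				// We don't know whether the primary made the changes,
				// so no secondary can be considered up to date.
				for _, s := range secondaries {
					upToDate[s] = false
				}
			}
			// Prepared phase: the changes wouldn't go through, so the
			// verdict from the previous subtransactions stands.
			break
		}

		for _, s := range secondaries {
			if vote, ok := sub.Votes[s]; !ok || vote != primaryVote {
				upToDate[s] = false
			}
		}
	}

	return upToDate
}
```

With this shape, a primary that crashes before voting in the prepared phase leaves matching secondaries up to date, while a primary that is missing from the committed phase marks every secondary as needing replication.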
The above should generalize the logic and avoid excess replication. It reduces the window where we might consider secondaries unnecessarily outdated (causing problems like #3605) from the first committed subtransaction to the window between the committed prepared phase and the subsequent commit phase.