
datastore: Do not enqueue duplicate events

Taking over from: !5271 (comment 1260734067)

The event queue is appended to by calling the Enqueue function, which inserts the provided row into the Postgres table replication_queue. The problem is that it does not check for duplicates on insert, so at any given time the table can contain multiple events describing the same change.

These duplicates are unnecessary: the events are idempotent and carry no state of their own, so processing a single event brings the nodes to the latest desired state at that time. Any additional copies of the same event are redundant.

While this would best be solved with a unique index, creating one first requires cleaning up the duplicate events already in the table. That cleanup could lock the database for a long time, since some instances have millions of rows in replication_queue.
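As a rough sketch of why the index approach needs a cleanup first: a unique index cannot be created while duplicate rows exist, so they must be deleted beforehand. The snippet below illustrates this with SQLite as a stand-in for Postgres; the table schema, column names, and index name are simplified and hypothetical, not the real replication_queue schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE replication_queue (id INTEGER PRIMARY KEY, state TEXT, job TEXT)"
)
# Seed the table with a duplicate pair plus one distinct event.
conn.executemany(
    "INSERT INTO replication_queue (state, job) VALUES (?, ?)",
    [("ready", "a"), ("ready", "a"), ("ready", "b")],
)

# Creating the unique index directly would fail on the duplicate rows,
# so existing duplicates are deleted first (keeping the oldest of each).
conn.execute(
    """
    DELETE FROM replication_queue
    WHERE id NOT IN (
        SELECT MIN(id) FROM replication_queue GROUP BY state, job
    )
    """
)
conn.execute(
    "CREATE UNIQUE INDEX queue_dedup ON replication_queue (state, job)"
)

remaining = conn.execute(
    "SELECT COUNT(*) FROM replication_queue"
).fetchone()[0]
print(remaining)  # 2

# Once the index exists, a duplicate insert is rejected by the database.
try:
    conn.execute(
        "INSERT INTO replication_queue (state, job) VALUES ('ready', 'a')"
    )
except sqlite3.IntegrityError:
    print("duplicate rejected")
```

On a large production table the DELETE above is exactly the step that could hold locks for a long time, which is why the MR defers it to a later release.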

As a safer first step, this commit adds a duplicate check at insert time (slower than the index approach, but safe to deploy without a migration). A subsequent release will add a transaction that cleans up the existing duplicates and creates the index. Finally, a third release will remove the check added in this commit and keep only the index.
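The insert-time check can be sketched as a conditional insert: only add the row if no equivalent unprocessed event is already queued. The example below uses SQLite as a stand-in for Postgres, with a hypothetical, simplified schema; the real Enqueue works on the full replication_queue row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE replication_queue (
        id    INTEGER PRIMARY KEY,
        state TEXT NOT NULL,
        job   TEXT NOT NULL
    )
    """
)

def enqueue(job: str) -> None:
    # Insert only if no equivalent event is already waiting in the queue.
    # This costs an extra lookup per insert, but needs no upfront cleanup.
    conn.execute(
        """
        INSERT INTO replication_queue (state, job)
        SELECT 'ready', :job
        WHERE NOT EXISTS (
            SELECT 1 FROM replication_queue
            WHERE state = 'ready' AND job = :job
        )
        """,
        {"job": job},
    )

enqueue('{"change":"update","repo":"a/b"}')
enqueue('{"change":"update","repo":"a/b"}')  # duplicate, skipped
count = conn.execute("SELECT COUNT(*) FROM replication_queue").fetchone()[0]
print(count)  # 1
```

Unlike the unique-index approach, this check is not enforced by the database itself, which is why the plan is to replace it with the index once the duplicates are cleaned up.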

Part of #3940 (closed)
