Future of replication and verification
Two separate conversations are under way about improving Geo so that it is easier to replicate and verify new data types, both in the code and in how events are actually processed. I'd like to bring those conversations together in one place (here) so that we can figure out the best and most efficient path forward.
The two conversations are:
Replication: In gitlab-org/gitlab#12565 (closed), @dbalexandre created a diagram with an idea for using messaging to track and process events. The issue to consider changing how we process event information is gitlab-org/gitlab#27186 (closed).
Verification: In gitlab-org/gitlab#33624 (closed), @vsizov raised the idea of changing the verification framework to use a single scheduler.
I have a feeling that both problems are similar and may be solvable in a similar way, which would help make the system simpler. It also seems that this may clear up some of the problems we have with slow queries, if those queries are what drive the way the existing workers operate. There are a few questions that might help clear up my understanding here.
Questions
- We previously made changes to move away from the legacy queries to use FDW instead. What was it about the legacy queries that made them inefficient?
- Was a messaging system considered for either requirement when Geo was initially conceived? I'm looking to understand whether we have thought about this approach before and, if so, what considerations came up.
- Could both the replication and verification sub-systems be changed to use messaging (albeit with separate queues), and would this be an easier system to extend and maintain than what exists currently?
/cc @geo-team