Geo backfill process races with the log cursor
Once all the projects have been backfilled, the backfill process starts racing with the log cursor, which wastes resources and fills the Geo log file with "Cannot obtain an exclusive lease" messages. We should probably stop backfill jobs completely.
valery: Can anybody explain why we use ProjectUpdatedRecentlyFinder after ProjectUnsyncedFinder for projects? Doesn't it create too many concurrent workers? Isn't this a reason for the high rate of "Cannot obtain an exclusive lease"? I mean, when we update a project, we create an event, and the same project can also be returned by ProjectUpdatedRecentlyFinder, so in other words it creates too much redundancy. Am I missing something?
douglas: IIRC, we call ProjectUpdatedRecentlyFinder because we are not 100% sure that our event handling system on a Geo secondary is reliable, so if we lose an event we can still keep the repository up to date. And we only call it if there is remaining capacity in the worker.
valery: I think we should keep this finder, but we should give the cursor time to do its job first. Something like 20 minutes. WDYT?
valery: I will create an issue. I feel like it's an important one, as we waste lots of resources and log lots of garbage. Think of a project that has already been in sync for a long time: in this case, the main job of the backfiller is now to compete with the cursor.
douglas: Yes, we should stop them as soon as we have finished the initial backfill.
mkozono: This has been rolling around in my head as well. When we create a secondary GeoNode, we could store the highest ID of each replicable, and tell backfill processes to go super slow after that (or stop completely, if that replicable has verification).
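The high-water-mark idea from the last message could be sketched roughly as follows. This is a minimal Python illustration of the logic only, not GitLab code: `BackfillWatermark`, `record`, and `mode_for` are hypothetical names, and the three mode strings are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class BackfillWatermark:
    """Hypothetical store of the highest ID per replicable type,
    captured when the secondary node is created."""

    max_ids: Dict[str, int] = field(default_factory=dict)
    verified_types: Set[str] = field(default_factory=set)

    def record(self, replicable: str, highest_id: int,
               has_verification: bool = False) -> None:
        # Remember the highest existing ID for this replicable type.
        self.max_ids[replicable] = highest_id
        if has_verification:
            self.verified_types.add(replicable)

    def mode_for(self, replicable: str, last_backfilled_id: int) -> str:
        """Return 'normal', 'slow', or 'stop' for the next backfill run."""
        watermark = self.max_ids.get(replicable)
        # No watermark recorded, or initial backfill not finished yet.
        if watermark is None or last_backfilled_id < watermark:
            return "normal"
        # Past the watermark: verification will catch anything missed.
        if replicable in self.verified_types:
            return "stop"
        # Past the watermark without verification: keep going, but slowly.
        return "slow"
```

Under these assumptions, a replicable type with verification stops backfilling once the recorded highest ID has been reached, while one without verification only slows down.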
Proposal
For project and wiki repos, remove "Recently Updated" IDs from the backfill process. We have verification anyway.
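To make the difference between the proposal and the 20-minute grace period from the thread concrete, here is a minimal Python sketch of how backfill candidates might be selected. It is an illustration under assumptions, not the actual finder code: `ProjectState`, `backfill_candidates`, and `recheck_after` are hypothetical names, and the registry is reduced to three fields.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple, Optional


class ProjectState(NamedTuple):
    # Hypothetical, simplified view of a registry row on the secondary.
    project_id: int
    synced: bool                           # initial sync completed
    pending_update_at: Optional[datetime]  # update not yet replicated, if any


def backfill_candidates(
    projects: Iterable[ProjectState],
    capacity: int,
    now: datetime,
    recheck_after: Optional[timedelta] = None,
) -> List[int]:
    """Pick project IDs for the next backfill run.

    Unsynced projects always come first. With recheck_after=None (the
    proposal), updated projects are never scheduled by the backfill.
    With a timedelta (the grace-period variant), an updated project is
    picked up only once its pending update is at least that old, so the
    log cursor gets a head start.
    """
    projects = list(projects)
    ids = [p.project_id for p in projects if not p.synced][:capacity]
    if recheck_after is None:
        return ids
    remaining = capacity - len(ids)
    stale = [
        p.project_id
        for p in projects
        if p.synced
        and p.pending_update_at is not None
        and now - p.pending_update_at >= recheck_after
    ]
    return ids + stale[:remaining]
```

Calling `backfill_candidates(projects, capacity, now)` models the proposal, while passing `recheck_after=timedelta(minutes=20)` models the grace-period variant that still uses only the remaining worker capacity.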