Geo backfill process races with log cursor

Once all projects have been backfilled, the backfiller process starts racing with the log cursor. This wastes resources and fills the Geo log file with "Cannot obtain an exclusive lease" messages. We should probably stop backfill jobs completely.

valery: Can anybody explain why we use ProjectUpdatedRecentlyFinder after ProjectUnsyncedFinder for projects? Doesn't it create too many concurrent workers? Isn't this a reason for the high rate of "Cannot obtain an exclusive lease"? I mean, when we update a project, we create an event, and the same project can also be returned by ProjectUpdatedRecentlyFinder. In other words, it creates too much redundancy. Am I missing something?

douglas: IIRC, we call ProjectUpdatedRecentlyFinder because we are not 100% sure that our event handling system on a Geo secondary is reliable, so if we lose an event we can still keep the repository up to date. And we only call it if there is remaining capacity in the worker.

valery: I think we should keep this finder, but we should give the cursor time to do its job. Something like 20 minutes. WDYT?
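A minimal sketch of that grace period, in plain Ruby rather than against the real finder. The `Project` struct, the `last_repository_updated_at` field as used here, `GRACE_PERIOD`, and `recently_updated_candidates` are all hypothetical names for illustration; the actual ProjectUpdatedRecentlyFinder may be structured quite differently.

```ruby
# Hypothetical stand-in for a project row; not GitLab's actual model.
Project = Struct.new(:id, :last_repository_updated_at, keyword_init: true)

# The "20 minutes" suggested in the thread, in seconds (assumed value).
GRACE_PERIOD = 20 * 60

# Keep only projects whose last update is older than the grace period,
# i.e. updates the log cursor has already had time to handle.
def recently_updated_candidates(projects, now: Time.now, grace_period: GRACE_PERIOD)
  cutoff = now - grace_period
  projects.select { |project| project.last_repository_updated_at <= cutoff }
end

now = Time.at(1_000_000)
projects = [
  Project.new(id: 1, last_repository_updated_at: now - 60),      # updated 1 minute ago
  Project.new(id: 2, last_repository_updated_at: now - 30 * 60)  # updated 30 minutes ago
]

candidates = recently_updated_candidates(projects, now: now)
puts candidates.map(&:id).inspect
```

With this filter, project 1 is left to the cursor and only project 2 would be picked up by the backfill.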

valery: I will create an issue. I feel like it's an important one, as we waste lots of resources and log lots of garbage. Think of a project that has already been in sync for a long time: in this case, the main job of the backfiller is now to compete with the cursor.

douglas: Yes, we should stop them as soon as we have finished the initial backfill.

mkozono: This has been rolling around in my head as well. When we create a secondary GeoNode, we could store the highest ID of each replicable, and tell backfill processes to go super slow after that (or stop completely, if that replicable has verification)
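The high-ID idea above could look roughly like the following. This is only a sketch under stated assumptions: `BackfillWatermark`, `mode_for`, and the `:normal` / `:slow` / `:stop` modes are hypothetical names, and nothing like this is confirmed to exist in the codebase.

```ruby
# Hypothetical: holds the highest ID of each replicable type, captured
# at the moment the secondary GeoNode is created.
class BackfillWatermark
  def initialize(max_ids)
    @max_ids = max_ids # e.g. { project: 1000, upload: 500 }
  end

  # Records at or below the watermark existed before the secondary was
  # created, so they belong to the initial backfill.
  def initial_backfill?(type, id)
    id <= @max_ids.fetch(type)
  end

  # :normal while still inside the initial backfill range;
  # past it, :stop if the type has verification, otherwise :slow.
  def mode_for(type, id, has_verification:)
    return :normal if initial_backfill?(type, id)

    has_verification ? :stop : :slow
  end
end

watermark = BackfillWatermark.new({ project: 1000, upload: 500 })
puts watermark.mode_for(:project, 999, has_verification: true)
puts watermark.mode_for(:project, 1001, has_verification: true)
puts watermark.mode_for(:upload, 501, has_verification: false)
```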

Proposal

For project and wiki repositories, remove "Recently Updated" IDs from the backfill process. We have verification anyway.
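To illustrate the difference, here is a simplified sketch in plain Ruby. Both methods are hypothetical and take arrays of IDs instead of calling the real finders; the first mirrors the behaviour described in the thread (fill remaining capacity with recently updated IDs), the second the proposed behaviour.

```ruby
# Behaviour as described in the thread: unsynced IDs first, then any
# remaining capacity is filled with recently updated IDs.
def batch_with_recently_updated(unsynced_ids, recently_updated_ids, capacity)
  ids = unsynced_ids.first(capacity)
  remaining = capacity - ids.size
  ids += (recently_updated_ids - ids).first(remaining) if remaining > 0
  ids
end

# Proposed behaviour: unsynced IDs only. An empty batch means the
# backfill has nothing to do and leaves updates to the log cursor.
def batch_unsynced_only(unsynced_ids, capacity)
  unsynced_ids.first(capacity)
end

# Everything is already in sync, but three projects were updated recently.
unsynced_ids = []
recently_updated_ids = [7, 8, 9]

puts batch_with_recently_updated(unsynced_ids, recently_updated_ids, 2).inspect
puts batch_unsynced_only(unsynced_ids, 2).inspect
```

In the fully synced case, the first method still schedules work that the cursor is likely handling already, while the second returns an empty batch.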

Edited Jan 10, 2020 by Toon Claes