Geo: exclusive leases are too slow for Geo::ProjectSyncWorker and GeoFileDownloadWorker

Observed on the Geo testbed.

Currently, we have about 75 sync jobs running in parallel on the testbed, inside a single Sidekiq process: 25 repository syncs and 50 file syncs.

Each of these jobs checks out an exclusive lease before starting work. Exclusive leases run a Lua script inside Redis to do their magic.
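For anyone unfamiliar with the pattern, here is a rough sketch of what an exclusive lease does. It is not the actual GitLab implementation: `FakeLeaseStore` and `SketchLease` are made-up names, and the store is an in-memory stand-in for Redis so the example runs on its own. The point is only that obtaining a lease is an atomic "set if absent, with expiry", and releasing it is a compare-and-delete on the holder's token.

```ruby
require 'securerandom'

# Hypothetical in-memory stand-in for Redis, for illustration only.
class FakeLeaseStore
  def initialize
    @data = {}
    @mutex = Mutex.new
  end

  # Atomic "set if absent, with expiry": true only when the key is free or expired.
  def set_if_absent(key, value, ttl:, now: Time.now)
    @mutex.synchronize do
      entry = @data[key]
      return false if entry && entry[:expires_at] > now

      @data[key] = { value: value, expires_at: now + ttl }
      true
    end
  end

  # Compare-and-delete: only the current holder's token can release the key.
  def delete_if_equal(key, value)
    @mutex.synchronize do
      entry = @data[key]
      return false unless entry && entry[:value] == value

      @data.delete(key)
      true
    end
  end
end

# Hypothetical lease object; names and signatures are assumptions.
class SketchLease
  def initialize(store, key, timeout:)
    @store = store
    @key = key
    @timeout = timeout
    @token = SecureRandom.uuid
  end

  # Returns the lease token on success, false if someone else holds the lease.
  def try_obtain
    @store.set_if_absent(@key, @token, ttl: @timeout) ? @token : false
  end

  # Releases the lease, but only if this object still holds it.
  def cancel
    @store.delete_if_equal(@key, @token)
  end
end
```

A worker would call `try_obtain` first and bail out (or retry later) when it returns `false`. Every one of those calls is a round trip to Redis, which is where the wait time discussed here shows up.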

Method call wait time for obtaining these leases is sometimes very high: 25 seconds or more. Given that a file download worker, when operating well, often completes in under a second, this seems unreasonable.

https://performance.gitlab.net/dashboard/db/sidekiq-workers?orgId=1&from=now-30m&to=now&var-worker=Geo::ProjectSyncWorker%23perform&var-database=GeoTestBed%20-%20Sync

Screenshot_from_2017-10-21_21-53-43

https://performance.gitlab.net/dashboard/db/sidekiq-workers?orgId=1&from=now-30m&to=now&var-worker=GeoFileDownloadWorker%23perform&var-database=GeoTestBed%20-%20Sync

Screenshot_from_2017-10-21_21-55-23

I expect the highly parallel nature of these leases is causing problems... but do we even need them at all? GeoFileDownloadWorker is idempotent, so it shouldn't matter if it runs twice, and I'd be surprised if running git fetch in parallel on the same Git repository caused corruption.

So perhaps we can remove the leases entirely?
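To illustrate why running an idempotent file sync twice should be harmless, here is a minimal sketch. It assumes the download is written to a uniquely named temp file and then renamed into place. `store_file_idempotently` is a hypothetical helper, not the actual GeoFileDownloadWorker code, and the real worker may well do this differently.

```ruby
require 'fileutils'
require 'securerandom'

# Hypothetical helper: writes content to a uniquely named temp file next to
# the destination, then renames it into place. Two concurrent runs with the
# same content both leave the destination in the same final state.
def store_file_idempotently(dest_path, content)
  FileUtils.mkdir_p(File.dirname(dest_path))

  # Unique suffix so parallel runs never write to the same temp file.
  tmp_path = "#{dest_path}.#{SecureRandom.hex(8)}.tmp"
  File.binwrite(tmp_path, content)

  # rename replaces the destination in one step when both paths are on the
  # same POSIX filesystem, so readers never see a half-written file.
  File.rename(tmp_path, dest_path)
  dest_path
end
```

With this shape, a duplicate job costs an extra download but cannot leave a corrupt file behind, which is the property that would make the lease unnecessary for file syncs.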

/cc @brodock @stanhu @dbalexandre @to1ne

Edited Oct 21, 2017 by Nick Thomas