Geo: Use Exclusive lease across more services that modify projects
From https://gitlab.com/gitlab-org/gitlab-ee/issues/5876#note_109338961:
A number of races may turn out not to be a problem, but only if an error occurs early and the job keeps retrying until the other event has finished processing:
- Create-Rename race
- Create-Migrate to hashed storage race
- Update-Rename race
- Update-Migrate to hashed storage race
- Rename-Update race
- Rename-Delete race
- Rename-Rename race
- Rename-Migrate to hashed storage race
- Migrate to hashed storage-Update race
- Migrate to hashed storage-Rename race
- Migrate to hashed storage-Delete race
There are at least 2 problems with this:
1. It may be possible for some events to interact adversely. E.g. I'm not 100% sure whether `mv_repository` during `git fetch` will actually attempt to move files or not. Regardless, it'd be better to avoid the interaction altogether.
2. Retries are limited in number, and may fail and disappear before the other event finishes.
I think we can solve 1) by using ExclusiveLease more. Note that it won't be quite as simple as reusing the same lease key, since we still need to ensure the events are processed. (Repo sync jobs can be lost with little consequence because they'll be rescheduled if needed; this is how we handle failures to get a lease during repo syncs.)
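As a minimal sketch of the idea, and not the actual `Gitlab::ExclusiveLease` API: the snippet below uses an in-memory stand-in for the lease store so that it runs on its own. The names `InMemoryLease`, `LeaseNotObtained`, `with_project_lease`, and the key format `geo_project_mutation:<id>` are all hypothetical. It assumes every project-modifying event (rename, migrate, update, delete) takes the same per-project key, and that a failure to obtain the lease raises rather than silently skipping the work.

```ruby
require 'securerandom'

# Raised when another job already holds the lease (hypothetical error class).
LeaseNotObtained = Class.new(StandardError)

# In-memory stand-in for a shared lease store. A real deployment would need
# a store shared across processes; this one only works within one process.
class InMemoryLease
  def initialize
    @held = {}
    @mutex = Mutex.new
  end

  # Returns a token if the lease was obtained, or nil if it is already held.
  def try_obtain(key, ttl:, now: Time.now)
    @mutex.synchronize do
      entry = @held[key]
      if entry && entry[:expires_at] > now
        nil
      else
        token = SecureRandom.uuid
        @held[key] = { token: token, expires_at: now + ttl }
        token
      end
    end
  end

  # Releases the lease only if the caller still owns it.
  def cancel(key, token)
    @mutex.synchronize do
      @held.delete(key) if @held[key] && @held[key][:token] == token
    end
  end
end

LEASES = InMemoryLease.new

# Hypothetical key shared by all events that modify a given project.
def project_lease_key(project_id)
  "geo_project_mutation:#{project_id}"
end

# Runs the block while holding the per-project lease. Raises if the lease is
# taken, so that a job framework can retry the event instead of dropping it.
def with_project_lease(project_id, ttl: 3600, leases: LEASES)
  key = project_lease_key(project_id)
  token = leases.try_obtain(key, ttl: ttl)
  raise LeaseNotObtained, "lease #{key} is held by another job" unless token

  begin
    yield
  ensure
    leases.cancel(key, token)
  end
end
```

A rename handler would then wrap its work as `with_project_lease(project_id) { ... }`. Raising on failure is what separates this from the repo sync case, where a missed lease can simply be ignored because the sync is rescheduled later.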
A lease failure could raise an error for non-sync jobs, so that the job is retried. Should we increase the number of job retries? We changed the Sidekiq default for most jobs to 3 retries, after which they move to the dead queue, which is capped at 10k jobs, and then they disappear.
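To get a feel for how short the retry window is, here is a rough calculation. It assumes Sidekiq's default backoff of roughly `(count ** 4) + 15` seconds plus random jitter proportional to `count + 1`; the jitter range has differed between Sidekiq versions, so it is left as a parameter here. The helper names are hypothetical.

```ruby
# Lower bound (no jitter) on the seconds between the first failure and the
# last retry, assuming a base delay of (count ** 4) + 15 per attempt.
def min_retry_window(retries)
  (0...retries).sum { |count| (count**4) + 15 }
end

# Upper bound, assuming jitter of rand(jitter_max) * (count + 1) per attempt.
# jitter_max is an assumption; check the Sidekiq version in use.
def max_retry_window(retries, jitter_max:)
  (0...retries).sum do |count|
    (count**4) + 15 + ((jitter_max - 1) * (count + 1))
  end
end

puts min_retry_window(3)                 # 62
puts max_retry_window(3, jitter_max: 30) # 236
```

Under these assumptions, 3 retries give the other event somewhere between about one and four minutes to finish, which may not be enough for a large repository move.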