Geo: Bandaid for Registry rows stuck in sync state Queued
From #419370 (comment 1596281181):
I think this may be where we are setting registry record state to pending but not clearing last_synced_at. I think it doesn't always trigger the issue since that line is immediately followed by sync_repository, which quickly moves the state to started.
I suspect that this issue only occurs when the lease is taken since the service exits without moving state to started. So it is mostly an issue for frequently mutated resources.
Possible bandaid: Clear last_synced_at when setting state to pending.
Backport the fix to 16.3 and 16.4.
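The proposed bandaid can be illustrated with a small plain-Ruby sketch. This is not GitLab's actual model code: RegistryRow and its methods are hypothetical stand-ins, and the "stuck" check simply mirrors the condition targeted by the workaround query in this issue (pending state with a non-null last_synced_at). It only shows why clearing the timestamp in the same step as the state change avoids leaving a row in that condition.

```ruby
# Hypothetical stand-in for a Geo registry row; NOT the real GitLab model.
RegistryRow = Struct.new(:state, :last_synced_at) do
  # Behavior as described in the quoted comment: the state is set to
  # pending, but last_synced_at is left untouched.
  def mark_pending_without_clearing!
    self.state = :pending
    self
  end

  # Proposed bandaid: clear last_synced_at whenever state is set to pending.
  def mark_pending!
    self.state = :pending
    self.last_synced_at = nil
    self
  end

  # Mirrors the workaround query's condition: pending state together with
  # a leftover (non-nil) last_synced_at.
  def stuck_signature?
    state == :pending && !last_synced_at.nil?
  end
end

synced_at = Time.now - 3600 # pretend the row was last synced an hour ago

without_fix = RegistryRow.new(:synced, synced_at).mark_pending_without_clearing!
with_fix    = RegistryRow.new(:synced, synced_at).mark_pending!

puts without_fix.stuck_signature? # true
puts with_fix.stuck_signature?    # false
```

In the real code path the row normally moves on to started right away, so the leftover timestamp only matters when the service exits early, as the comment above suspects happens when the lease is taken.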
Workaround to unstick any permanently Queued items
On a Puma, Sidekiq, or Geo Log Cursor node in the secondary site:
gitlab-rails runner "Geo::ProjectRepositoryRegistry.where(state: ['0']).where('last_synced_at is not null').update_all(last_synced_at: nil)"
For example, use a cron job to run this every 10 minutes.
Edited by Michael Kozono