An error occurred while fetching the assigned milestone of the selected merge_request.
Geo::RepositorySyncWorker should attempt to sync all projects
All threads resolved!
If some projects fail to sync, Geo::RepositySyncWorker should not take the same set of projects over and over again. But try to sync all of them.
#find_project_ids_not_synced
returns a set of projects, e.g. A
, B
, and C
.
#find_project_ids_updated_recently
returns nothing, because the registry is empty.
So A
, B
, and C
are attempted to sync, but fail, and are added to the registry.
#find_project_ids_not_synced
returns the same set of projects because they are ordered by last_repository_updated_at
, so: A
, B
, C
.
#find_project_ids_updated_recently
returns A
, B
, and C
, because those are the only projects present in the registry.
These set are interleaved and #uniq
ed, resulting in A
, B
, and C
. This happens over and over again.
diff --git a/app/workers/geo/repository_sync_worker.rb b/app/workers/geo/repository_sync_worker.rb
index 9aa0bec00f..f5a3c591cc 100644
--- a/app/workers/geo/repository_sync_worker.rb
+++ b/app/workers/geo/repository_sync_worker.rb
@@ -24,7 +24,7 @@ module Geo
def find_project_ids_not_synced
current_node.projects
- .where.not(id: Geo::ProjectRegistry.synced.pluck(:project_id))
+ .where.not(id: Geo::ProjectRegistry.pluck(:project_id))
.order(last_repository_updated_at: :desc)
.limit(db_retrieve_batch_size)
.pluck(:id)
So #find_project_ids_not_synced
will only return projects that are not in the registry.
#find_project_ids_updated_recently
already returns projects that are in the registry.
These are interleaved, and slowly all projects will get attempted to sync and added to the registry.
Due to the interleaving, every iteration projects that failed in a previous iteration are retried (those returned by #find_project_ids_updated_recently
). So this will half the capacity to sync projects that are not attempted already.
Closes gitlab-org/gitlab-ee#3259