Skip to content
Snippets Groups Projects

Geo::RepositorySyncWorker should attempt to sync all projects

Merged Toon Claes requested to merge tc-fix-retrying-same-repos into master
All threads resolved!

What does this MR do?

If some projects fail to sync, Geo::RepositySyncWorker should not take the same set of projects over and over again. But try to sync all of them.

Problem analysis

First iteration

#find_project_ids_not_synced returns a set of projects, e.g. A, B, and C.

#find_project_ids_updated_recently returns nothing, because the registry is empty.

So A, B, and C are attempted to sync, but fail, and are added to the registry.

Second and following iterations

#find_project_ids_not_synced returns the same set of projects because they are ordered by last_repository_updated_at, so: A, B, C.

#find_project_ids_updated_recently returns A, B, and C, because those are the only projects present in the registry.

These set are interleaved and #uniqed, resulting in A, B, and C. This happens over and over again.

Proposed solution

diff --git a/app/workers/geo/repository_sync_worker.rb b/app/workers/geo/repository_sync_worker.rb
index 9aa0bec00f..f5a3c591cc 100644
--- a/app/workers/geo/repository_sync_worker.rb
+++ b/app/workers/geo/repository_sync_worker.rb
@@ -24,7 +24,7 @@ module Geo
 
     def find_project_ids_not_synced
       current_node.projects
-                  .where.not(id: Geo::ProjectRegistry.synced.pluck(:project_id))
+                  .where.not(id: Geo::ProjectRegistry.pluck(:project_id))
                   .order(last_repository_updated_at: :desc)
                   .limit(db_retrieve_batch_size)
                   .pluck(:id)

So #find_project_ids_not_synced will only return projects that are not in the registry.

#find_project_ids_updated_recently already returns projects that are in the registry.

These are interleaved, and slowly all projects will get attempted to sync and added to the registry.

Pitfall

Due to the interleaving, every iteration projects that failed in a previous iteration are retried (those returned by #find_project_ids_updated_recently). So this will half the capacity to sync projects that are not attempted already.

Does this MR meet the acceptance criteria?

What are the relevant issue numbers?

Closes gitlab-org/gitlab-ee#3259

Edited by Nick Thomas

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Douglas Barbosa Alexandre
  • Toon Claes added 115 commits

    added 115 commits

    • c4441805...3701599d - 113 commits from branch master
    • ee8c49ff - Geo::RepositySyncWorker should attempt to sync all projects
    • 68d450d1 - Tune the spec to test the bare minimum

    Compare with previous version

  • Toon Claes added 1 commit

    added 1 commit

    • 4090c39f - fixup! Geo::RepositySyncWorker should attempt to sync all projects

    Compare with previous version

  • Toon Claes marked as a Work In Progress from 4090c39f

    marked as a Work In Progress from 4090c39f

  • Toon Claes marked the checklist item Changelog entry added, if necessary as completed

    marked the checklist item Changelog entry added, if necessary as completed

  • Toon Claes marked the checklist item Documentation created/updated as completed

    marked the checklist item Documentation created/updated as completed

  • Toon Claes marked the checklist item Documentation created/updated as incomplete

    marked the checklist item Documentation created/updated as incomplete

  • Toon Claes unmarked as a Work In Progress

    unmarked as a Work In Progress

  • Toon Claes changed the description

    changed the description

  • Toon Claes added 2 commits

    added 2 commits

    • 2551475a - Geo::RepositySyncWorker should attempt to sync all projects
    • 4c5810a6 - Tune the spec to test the bare minimum

    Compare with previous version

  • Author Maintainer

    @dbalexandre Can you have another look at the spec please?

  • added ~2278661b label

  • Toon Claes changed milestone to %9.5

    changed milestone to %9.5

  • Toon Claes changed the description

    changed the description

  • Toon Claes resolved all discussions

    resolved all discussions

  • @to1ne thanks! I left one note.

  • Toon Claes added 1 commit

    added 1 commit

    • 75a8059d - Explicitly test if every project id is being synced

    Compare with previous version

  • Toon Claes resolved all discussions

    resolved all discussions

  • Author Maintainer

    @dbalexandre Please check again if the current spec implementation is more sturdy?

  • Douglas Barbosa Alexandre
  • Nick Thomas
  • Toon Claes resolved all discussions

    resolved all discussions

  • Toon Claes added 359 commits

    added 359 commits

    • 75a8059d...71f09bb7 - 356 commits from branch master
    • 575b677f - Geo::RepositySyncWorker should attempt to sync all projects
    • d3bf53e3 - Tune the spec to test the bare minimum
    • df43dc0a - Explicitly test if every project id is being synced

    Compare with previous version

  • Author Maintainer

    @DouweM Can you take care of maintainer review?

  • assigned to @DouweM

  • Author Maintainer
  • @to1ne ok with me however I am not sure if it shouldn't be approved by @psimyn or @jivanvl as I am only trainee for 9.5 release...

  • Douwe Maan approved this merge request

    approved this merge request

  • Douwe Maan enabled an automatic merge when the pipeline for df43dc0a succeeds

    enabled an automatic merge when the pipeline for df43dc0a succeeds

  • Douwe Maan canceled the automatic merge

    canceled the automatic merge

  • Douwe Maan mentioned in commit 89ca2d0e

    mentioned in commit 89ca2d0e

  • merged

  • Nick Thomas changed title from Geo::RepositySyncWorker should attempt to sync all projects to Geo::RepositorySyncWorker should attempt to sync all projects

    changed title from Geo::RepositySyncWorker should attempt to sync all projects to Geo::RepositorySyncWorker should attempt to sync all projects

  • Yeah ~"Pick into Stable" is fine, this looks small enough no need for approval :slight_smile:

  • Author Maintainer

    No need to ~"Pick into Stable", our Geo testbed is running nightly builds. Sorry about that!

  • removed ~2278661b label

  • Toon Claes changed milestone to %10.0

    changed milestone to %10.0

  • Rachel Nienaber removed 1 deleted label

    removed 1 deleted label

  • Please register or sign in to reply
    Loading