
WIP: Verify repositories on the Geo secondary

Brett Walker requested to merge 4754-verify-repositories-on-the-secondary into master

What does this MR do?

As outlined in #4754 (closed) and #4469 (closed), we want to verify that repositories on the secondary match the corresponding repositories on the primary.

Assumption: a project's checksum is stored with the project in the database, and gets replicated into the secondary.

  • Add new verification/checksum columns in the project_registry table of the Geo tracking database
  • Each time a project is synced on the secondary, its checksum is cleared in the ProjectRegistry.
  • Every 6 hours, a job kicks off that scans the ProjectRegistry looking for projects that have not been checksummed.
  • If a project is stable (it has not been synced in the last 6 hours and it has a recently computed checksum), compute its checksum on the secondary and verify that it matches the checksum stored with the project.
  • If the checksums do not match, mark the project as failed in the ProjectRegistry.
  • Enhance the API GET /geo_nodes/current/failures to return the failures.
  • Show status of failures and stats to admin user. Consider adding a "Data Integrity" or "Repository Verification" panel to the "Monitoring" section
  • Use multiple Sidekiq jobs to verify all of GitLab.com, taking care not to cause an inordinate amount of CPU and I/O load
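
The verification flow described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the code in this MR: `RegistryEntry`, `PrimaryProject`, `stable?`, `verify!`, `checksum_for`, and `CHECKSUM_MAX_AGE` are hypothetical names, "recently computed" is assumed to mean "within the last 6 hours", and the checksum here is a simple SHA1 over sorted ref names and SHAs rather than whatever the real implementation uses.

```ruby
require 'digest'

# Hypothetical constants: both windows are assumptions for this sketch.
STABLE_PERIOD    = 6 * 60 * 60 # seconds since the last sync before a project counts as stable
CHECKSUM_MAX_AGE = 6 * 60 * 60 # how old the replicated checksum may be and still count as "recent"

# Hypothetical stand-ins for a ProjectRegistry row and the replicated project record.
RegistryEntry  = Struct.new(:last_synced_at, :checksum, :verification_failed, keyword_init: true)
PrimaryProject = Struct.new(:checksum, :checksum_computed_at, keyword_init: true)

# Illustrative checksum: SHA1 over the sorted "ref sha" lines.
def checksum_for(refs)
  Digest::SHA1.hexdigest(refs.sort.map { |name, sha| "#{name} #{sha}" }.join("\n"))
end

# A project is stable when it has not been synced recently and the
# checksum replicated from the primary is itself recent.
def stable?(registry, project, now)
  return false if registry.last_synced_at.nil? || project.checksum.nil?

  not_recently_synced = (now - registry.last_synced_at) >= STABLE_PERIOD
  checksum_is_recent  = (now - project.checksum_computed_at) <= CHECKSUM_MAX_AGE
  not_recently_synced && checksum_is_recent
end

# Verifies one registry entry; returns :skipped, :verified, or :failed.
def verify!(registry, project, local_refs, now: Time.now)
  # Only entries whose checksum was cleared by a sync need verification.
  return :skipped unless registry.checksum.nil? && stable?(registry, project, now)

  registry.checksum = checksum_for(local_refs)
  registry.verification_failed = registry.checksum != project.checksum
  registry.verification_failed ? :failed : :verified
end
```

In a real deployment the periodic job would presumably enqueue one such check per registry entry in bounded batches, so that CPU and I/O load stays under control.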

Are there points in the code the reviewer needs to double check?

You can take a look at Geo::RepositoryVerificationWorker#should_verify_repository?(registry, type) to double check the conditions for determining that a verification is needed.

Why was this MR needed?

To detect any integrity problems with repositories synced to the secondary.

Does this MR meet the acceptance criteria?

  • Changelog entry added, if necessary
  • Documentation created/updated
  • API support added
  • Tests added for this feature/bug
  • Review
    • Has been reviewed by UX
    • Has been reviewed by Frontend
    • Has been reviewed by Backend
    • Has been reviewed by Database
  • Conforms to the merge request performance guides
  • Conforms to the style guides
  • Squashed related commits together
  • [-] Internationalization required/considered
  • [-] If paid feature, have we considered GitLab.com plan and how it works for groups and is there a design for promoting it to users who aren't on the correct plan
  • End-to-end tests pass (package-qa manual pipeline job)

What are the relevant issue numbers?

Closes #4754 (closed)

Edited by Brett Walker
