Make sure that the repository on the secondary matches the repository on the primary
From Nick's comment:
For ~Geo repositories on the secondary, we're also interested in making sure that the repository on the secondary matches the repository on the primary.
For instance, if an event reordering has erroneously caused repository X to be associated with project Y, we need to know about it, and just checking that it's a valid git repository isn't enough.
I'm unsure what commit data we store in the database, and whether there's anything we can use for verification there, or if we'd just have to attempt to fetch from the geo remote on the secondary for every repository.
ISTR that if you try to fetch from a repository with the wrong initial commit, it'll fail with an error? It does not, see https://gitlab.com/gitlab-org/gitlab-ce/issues/40228#note_51440433
A few thoughts on how we can achieve this:
1. Use https://docs.gitlab.com/ce/administration/repository_checks.html on the secondary to detect any repository errors. This probably won't detect the case where we fetched from a different remote.
2. Use 1 on the primary. Hash the result of the fsck on the primary and store it in a column. Run a specific worker on the secondary that checks the timestamp and the result checksum against the DB column. As pointed out in https://docs.gitlab.com/ce/administration/repository_checks.html, `fsck` causes too many false alarms, hence hashing the result might be better than just a check.
3. Keep the last pushed checksum or something along those lines in a DB column, and save this info on the primary with a timestamp. Then, similarly to 2, run a secondary worker that checks this column for a match.
4. Use an existing DB column for achieving 3. I'm not aware of anything that we can reuse here. We could do something like getting the latest updated MR HEAD SHA, but it won't be 100% accurate (though it would be enough to know if the repo is still the same).
We could also mix and match 1 and 4. This would be a simple solution: 1 may detect issues with the repo on the secondary (although we'll get false positives, which we won't get with 2), and 4 will tell us if the right repo has been updated.
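To make the "keep a checksum in a DB column" idea a bit more concrete, here is a rough sketch in Python (not GitLab code). It assumes the checksum is a SHA-256 over the sorted ref names and their commit SHAs, roughly what `git show-ref` lists. `ChecksumRecord`, `record_on_primary` and `verify_on_secondary` are hypothetical names, and the "stale" rule (skip the comparison when the secondary last synced before the checksum was calculated) is my assumption rather than anything decided here.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Mapping


def refs_checksum(refs: Mapping[str, str]) -> str:
    """Return a SHA-256 hex digest over the sorted (ref name, commit SHA) pairs."""
    digest = hashlib.sha256()
    # Sort by ref name so the digest does not depend on enumeration order.
    for name in sorted(refs):
        digest.update(f"{refs[name]} {name}\n".encode("utf-8"))
    return digest.hexdigest()


@dataclass
class ChecksumRecord:
    # Hypothetical stand-in for the DB columns written on the primary.
    checksum: str
    calculated_at: datetime


def record_on_primary(refs: Mapping[str, str]) -> ChecksumRecord:
    """Compute the checksum on the primary and stamp it with the current time."""
    return ChecksumRecord(refs_checksum(refs), datetime.now(timezone.utc))


def verify_on_secondary(
    record: ChecksumRecord,
    secondary_refs: Mapping[str, str],
    last_synced_at: datetime,
) -> str:
    """Return 'stale', 'match' or 'mismatch' for the secondary's copy."""
    # Assumption: a secondary that synced before the checksum was calculated
    # may simply be behind, so it is reported as stale instead of mismatched.
    if last_synced_at < record.calculated_at:
        return "stale"
    if refs_checksum(secondary_refs) == record.checksum:
        return "match"
    return "mismatch"


if __name__ == "__main__":
    primary_refs = {"refs/heads/master": "a" * 40, "refs/tags/v1.0": "b" * 40}
    wrong_repo_refs = {"refs/heads/master": "c" * 40}

    record = record_on_primary(primary_refs)
    synced_at = record.calculated_at + timedelta(minutes=5)

    print(verify_on_secondary(record, dict(primary_refs), synced_at))  # match
    print(verify_on_secondary(record, wrong_repo_refs, synced_at))     # mismatch
```

A checksum over the refs would flag the "repository X associated with project Y" case as a mismatch, which a plain validity check on the secondary cannot do. The trade-off is that the primary has to recalculate it after every push.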
There could be more elegant solutions that I haven't thought about, though. But this is to name a few :)