Geo: Package file marked synced but is actually missing
Summary
In a review app in Dedicated on GitLab 16.4.2 (internal Slack thread), the secondary site had some package files that were marked synced even though the files were missing from the secondary site's bucket.
1. The verification job got a 403 Forbidden error when it tried to verify the file size of a package file that didn't exist.
2. The 403 Forbidden error message was long, and the Geo verification code tried to transition the package file registry to `verification_failed`.
3. That transition raised another error: `Cannot transition verification_state via :verification_failed from :verification_started (Reason(s): Verification failure is too long (maximum is 255 characters))`
4. This second error obscured the real error.
5. The registry record was left in the `verification_started` state.
6. After 8 hours, a background job moved the registry record to `verification_failed` with the `verification_failure` message `Verification timed out after 28800`.
7. Geo then tried to verify the registry again. Repeat from step 1.
When we resynced the affected package files, they synced and verified successfully. So the problem is that the initial sync should have failed.
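To make the failure mode in the summary easier to reason about, here is a minimal plain-Ruby sketch of it. It is not GitLab's Geo code: `RegistrySketch`, `fail_verification!`, and `fail_verification_truncated!` are hypothetical names, and truncating the message is only one possible mitigation, assumed here for illustration. The 255-character limit and the 28800-second timeout are the values quoted in the error messages.

```ruby
# Hypothetical, simplified model of a registry record. This is NOT GitLab's
# actual Geo code; class and method names are assumptions for illustration.
class RegistrySketch
  MAX_FAILURE_LENGTH = 255                   # limit quoted in the transition error
  VERIFICATION_TIMEOUT_SECONDS = 8 * 60 * 60 # 28800, the value in the timeout message

  attr_reader :state, :verification_failure

  def initialize
    @state = :verification_started
    @verification_failure = nil
  end

  # Mirrors the reported behavior: an over-long message blocks the transition,
  # so the record stays in :verification_started and the real error is lost.
  def fail_verification!(message)
    if message.length > MAX_FAILURE_LENGTH
      raise ArgumentError,
            "Cannot transition verification_state via :verification_failed " \
            "from :#{@state} (Reason(s): Verification failure is too long " \
            "(maximum is #{MAX_FAILURE_LENGTH} characters))"
    end

    @verification_failure = message
    @state = :verification_failed
  end

  # Assumed mitigation: truncate before transitioning so a shortened form of
  # the original error is stored and the state still changes.
  def fail_verification_truncated!(message)
    fail_verification!(message[0, MAX_FAILURE_LENGTH])
  end
end

# Stand-in for the long 403 Forbidden error (content is made up).
long_error = "403 Forbidden: " + ("x" * 300)

stuck = RegistrySketch.new
begin
  stuck.fail_verification!(long_error)
rescue ArgumentError => e
  puts e.message # the secondary error that hides the 403
end
puts stuck.state # the record never left :verification_started

fixed = RegistrySketch.new
fixed.fail_verification_truncated!(long_error)
puts fixed.state
puts fixed.verification_failure.length
```

In this sketch the truncated path stores the first 255 characters of the 403 message, so the real cause would be visible on the record instead of only the timeout message that appears 8 hours later.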
Steps to reproduce
Unknown. We need to find steps to reproduce.
Example Project
What is the current bug behavior?
What is the expected correct behavior?
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:env:info`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)
Results of GitLab application Check
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:check SANITIZE=true`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`) (we will only investigate if the tests are passing)