Geo: Fix verification failures of remote stored files

What does this MR do and why?

This fixes a bug in the beta feature GitLab managed object storage replication.

Before 14.2, with GitLab managed object storage replication, object stored files had a permanently red verification progress bar. This was known and accepted as a temporary annoyance since the feature was not yet GA.

Since 14.2, those files fail verification, and their sync state is then immediately marked as failed as well. The files continue to exist on the secondary, but they are shown as "sync failed" in the Admin UI, and they are regularly redownloaded. This is wasteful and obscures whether replication is working.

This affects the following data types + versions:

Data type                   From version
Package Registry            14.2
Pipeline Artifacts          14.2
Terraform State Versions    14.2
Infrastructure Registry     14.2
External MR diffs           14.6
LFS Objects                 14.6
Pages Deployments           14.6
Uploads                     14.6
CI Job Artifacts            14.6

Why are some only affected from 14.6? These data types gained the verification feature in 14.6.
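
The table above gives a lower bound per data type. A minimal Python sketch of that lookup follows. The names AFFECTED_FROM and may_be_affected are hypothetical, and the sketch deliberately encodes no upper bound, because this description does not state which release carries the fix.

```python
from __future__ import annotations

# First GitLab version in which each data type is affected,
# copied from the table in this description.
AFFECTED_FROM = {
    "Package Registry": (14, 2),
    "Pipeline Artifacts": (14, 2),
    "Terraform State Versions": (14, 2),
    "Infrastructure Registry": (14, 2),
    "External MR diffs": (14, 6),
    "LFS Objects": (14, 6),
    "Pages Deployments": (14, 6),
    "Uploads": (14, 6),
    "CI Job Artifacts": (14, 6),
}


def parse_version(version: str) -> tuple[int, int]:
    """Return (major, minor) from a string such as '14.6' or '14.6.2'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def may_be_affected(data_type: str, version: str) -> bool:
    """True if `version` is at or above the first affected version.

    Assumption: the release containing the fix is unknown here, so this
    only checks the lower bound from the table.
    """
    return parse_version(version) >= AFFECTED_FROM[data_type]
```

For example, may_be_affected("Uploads", "14.5.0") is False, while may_be_affected("Package Registry", "14.2") is True.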

With this MR, object stored files are marked "verification_succeeded" immediately upon a successful sync. Verification is not supported for object stored files, so this state is not strictly accurate, but it stops the confusing sync failure loop and avoids wasting resources.
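
The intended behavior can be illustrated with a small standalone model. This is a sketch only, not the code in this MR: RegistryRecord, after_successful_sync, and every state name other than "verification_succeeded" are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RegistryRecord:
    """Hypothetical stand-in for a secondary-side registry row."""
    stored_remotely: bool  # True when the file lives in object storage
    sync_state: str = "pending"  # assumed state name
    verification_state: str = "verification_pending"  # assumed state name


def after_successful_sync(record: RegistryRecord) -> RegistryRecord:
    """Update a record once its download has succeeded."""
    record.sync_state = "synced"  # assumed state name
    if record.stored_remotely:
        # Object stored files cannot be checksummed, so skip the
        # verification attempt and mark the record as succeeded.
        record.verification_state = "verification_succeeded"
    else:
        # Locally stored files still go through normal verification.
        record.verification_state = "verification_pending"
    return record
```

The point of the sketch is that the remote branch never enters a state from which a checksum attempt, and therefore a failure and resync, could follow.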

Screenshots or screen recordings

Before

A recurring pair of geo.log lines:

{"severity":"INFO","time":"2022-01-27T01:20:18.141Z","correlation_id":"f113b53d965c6a1bcd4f9bdb4d882e1a","class":"Geo::BlobDownloadService","host":"172.16.123.1.nip.io","message":"Blob download","mark_as_synced":true,"download_success":true,"bytes_downloaded":59269,"primary_missing_file":false,"download_time_s":0.557,"reason":null}
{"severity":"ERROR","time":"2022-01-27T01:20:21.806Z","correlation_id":"640c068a3f9c77d7020b56eae206e64e","class":"Geo::UploadRegistry","host":"172.16.123.1.nip.io","message":"Error during verification","error":"File is not checksummable"}

After

No verification error after the successful sync:

{"severity":"INFO","time":"2022-01-27T05:17:31.705Z","correlation_id":"ef4f0e606a9f17b8057574640718e683","class":"Geo::BlobDownloadService","host":"172.16.123.1.nip.io","message":"Blob download","mark_as_synced":true,"download_success":true,"bytes_downloaded":46,"primary_missing_file":false,"download_time_s":0.47,"reason":null}

How to set up and validate locally

  • Start on the master branch, with Geo set up, object storage configured for at least Uploads, and "Allow replication of object stored files" enabled for the secondary site
  • Add an attachment to an issue or MR
  • See that it persists as a sync failure
  • Notice in geo.log that sync succeeds and verification fails, in a loop, with progressive backoff of retries
  • Now switch to this MR's branch and run gdk restart rails-background-jobs
  • See that the sync failure goes away and becomes success
  • Notice no sync/verification loop in geo.log
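
The geo.log checks in the steps above can also be done with a short script. The following is a sketch that assumes geo.log is newline-delimited JSON with the "message" and "error" fields seen in the log excerpts above. The function name verification_errors and the log path in the usage comment are assumptions.

```python
import json
from typing import Iterable, List


def verification_errors(lines: Iterable[str]) -> List[dict]:
    """Return log entries reporting the 'File is not checksummable' error."""
    errors = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a JSON log line.
            continue
        if not isinstance(entry, dict):
            continue
        if (
            entry.get("message") == "Error during verification"
            and entry.get("error") == "File is not checksummable"
        ):
            errors.append(entry)
    return errors


# Usage (the path is an assumption; adjust it to your installation):
# with open("log/geo.log") as f:
#     print(len(verification_errors(f)))
```

On master the count should keep growing as the loop retries. On this MR's branch no new entries should appear after a successful sync.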

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Michael Kozono