Fix syncing remotely stored blobs whose filenames contain a plus sign
What does this MR do and why?
The http gem can re-encode or normalize URLs in ways that break AWS S3 signatures. When it sees %2B in a presigned URL, it converts it to +, so the requested URL no longer matches the one that was signed and the signature check fails. These changes preserve the exact string representation of the URI, so all encoded characters remain intact for the S3 signature while the URL is still parsed into a valid URI object.
These changes skip the URI normalization for presigned URLs when syncing blobs on the secondary site.
HTTP::URI::NORMALIZER.call("https://example.com/foo%2Bbar")
=> #<HTTP::URI:0x00000000048a08 URI:https://example.com/foo+bar>
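The effect of that rewrite on a signature can be illustrated with the Ruby standard library alone. This is only a rough sketch: SECRET and toy_signature are hypothetical names, and the HMAC over the path is a stand-in for real S3 presigning (SigV4), which signs a full canonical request. The gsub call simulates the %2B to + conversion shown above rather than calling the http gem.

```ruby
require "openssl"
require "uri"

# Hypothetical key and helper, for illustration only. Real S3 presigning
# (SigV4) signs a canonical request, not just the path.
SECRET = "example-secret"

def toy_signature(path)
  OpenSSL::HMAC.hexdigest("SHA256", SECRET, path)
end

signed_url = "https://example.com/foo%2Bbar"

# Ruby's stdlib URI keeps the percent-encoding exactly as written.
original_path = URI.parse(signed_url).path

# Simulate the normalization shown above (%2B becomes +).
normalized_path = original_path.gsub("%2B", "+")

signatures_match = toy_signature(original_path) == toy_signature(normalized_path)

puts original_path     # /foo%2Bbar
puts normalized_path   # /foo+bar
puts signatures_match  # false
```

Because the signer and the verifier must see the same bytes, any rewrite of the encoded path, even one that looks equivalent, is enough to invalidate the signature.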
How to set up and validate locally
- Configure Geo
- Configure both primary and secondary GDK to use Object Storage with Amazon S3 as a provider for uploads
- Stop both primary and secondary GDKs
- Switch to the master branch on both GDKs
- Start both primary and secondary GDKs
- Upload a file with a plus sign in the filename, e.g. Anc-offer__DVL___INT__Cpu_______33137+7500_______40637m, to an issue, merge request, or comment.
- Wait for the secondary GDK to catch up and sync the file.
- Start a Rails console on the secondary site GDK:
  - bundle exec rails console
  - Check the latest upload registry attributes. You should see the "Non-success HTTP response status code 403" error message.
      > Geo::UploadRegistry.last
      => #<Geo::UploadRegistry:0x00000003145167d0
       id: 99, file_id: 98, created_at: Thu, 28 Aug 2025 18:14:05.213143000 UTC +00:00,
       retry_count: 7, retry_at: Thu, 28 Aug 2025 19:35:36.759849000 UTC +00:00,
       missing_on_primary: false, state: 3,
       last_synced_at: Thu, 28 Aug 2025 18:53:26.872753000 UTC +00:00,
       last_sync_failure: "Non-success HTTP response status code 403",
       verified_at: nil, verification_started_at: nil, verification_retry_at: nil,
       verification_state: 0, verification_retry_count: 0, verification_checksum: nil,
       verification_checksum_mismatched: nil, checksum_mismatch: false, verification_failure: nil>
- Stop both primary and secondary GDKs
- Switch to the 456901-skip-uri-normalization-for-presigned-urls branch on both GDKs
- Start both primary and secondary GDKs
- Repeat the steps above to upload the file and check the upload registry attributes. If you prefer, you can force a resync in a Rails console session on the secondary GDK.
  - Geo::UploadRegistry.last.replicator.sync
  - Check the upload registry attributes. You should not see the error message above, and the state should be 2.
      > Geo::UploadRegistry.last
      => #<Geo::UploadRegistry:0x000000031907a280
       id: 99, file_id: 98, created_at: Thu, 28 Aug 2025 18:14:05.213143000 UTC +00:00,
       retry_count: 0, retry_at: nil,
       missing_on_primary: false, state: 2,
       last_synced_at: Thu, 28 Aug 2025 19:27:44.397304000 UTC +00:00,
       last_sync_failure: nil,
       verified_at: nil, verification_started_at: nil, verification_retry_at: nil,
       verification_state: 0, verification_retry_count: 0, verification_checksum: nil,
       verification_checksum_mismatched: nil, checksum_mismatch: false, verification_failure: nil>
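Instead of reading the full inspect output, the check can be narrowed to the two attributes that change between the failing and passing runs. This is a minimal sketch: sync_ok? is a hypothetical helper, and RegistryRow is a Struct stand-in for Geo::UploadRegistry so that the snippet runs outside Rails. In a Rails console you would pass Geo::UploadRegistry.last instead.

```ruby
# Stand-in for Geo::UploadRegistry so the sketch runs outside a Rails
# console (assumption; only the two attributes used here are modelled).
RegistryRow = Struct.new(:state, :last_sync_failure, keyword_init: true)

# Hypothetical helper: per the outputs above, a good sync shows
# state 2 and no last_sync_failure message.
def sync_ok?(registry)
  registry.state == 2 && registry.last_sync_failure.nil?
end

before_fix = RegistryRow.new(
  state: 3,
  last_sync_failure: "Non-success HTTP response status code 403"
)
after_fix = RegistryRow.new(state: 2, last_sync_failure: nil)

puts sync_ok?(before_fix) # false
puts sync_ok?(after_fix)  # true
```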
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related issues
Related to #456901 (closed)