Fix syncing remote stored Blobs with filenames with plus sign

What does this MR do and why?

The http gem can re-encode or normalize URLs in ways that break AWS S3 signatures. For example, when it sees %2B in a presigned URL, it converts it to +. The request that reaches S3 then no longer matches the URL that was signed, so signature validation fails and S3 responds with 403. These changes preserve the exact string representation of the URI while still parsing it as a valid URI object, so all encoded characters remain intact for the S3 signature.

These changes skip URI normalization for presigned URLs when syncing blobs on the secondary site. For reference, this is how the default normalizer rewrites an encoded plus sign:

HTTP::URI::NORMALIZER.call("https://example.com/foo%2Bbar")
=> #<HTTP::URI:0x00000000048a08 URI:https://example.com/foo+bar>
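
To illustrate why that rewrite matters, here is a minimal sketch in plain Ruby. It is not the code from this MR and not the real S3 signing algorithm: SECRET, sign, and the path constants are hypothetical names, the HMAC over the bare path is a simplified stand-in for AWS request signing, and Ruby's stdlib URI is used only as an example of a parser that leaves percent-encoded characters untouched.

```ruby
require "openssl"
require "uri"

# Hypothetical signing key; real presigned URLs are signed with AWS credentials.
SECRET = "example-secret"

# Simplified stand-in for request signing: an HMAC over the request path only.
def sign(path)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SECRET, path)
end

PRESIGNED_URL = "https://example.com/foo%2Bbar"

# Ruby's stdlib URI keeps the percent-encoded plus sign exactly as given.
PRESERVED_PATH = URI.parse(PRESIGNED_URL).path

# Emulates the normalizer output shown above, where %2B becomes a literal +.
NORMALIZED_PATH = PRESERVED_PATH.sub("%2B", "+")

signature_when_signed = sign("/foo%2Bbar")

puts PRESERVED_PATH                                  # /foo%2Bbar
puts sign(PRESERVED_PATH) == signature_when_signed   # true
puts sign(NORMALIZED_PATH) == signature_when_signed  # false
```

In this sketch, any change to the signed string, even one that decodes to the same character, produces a different signature. That is consistent with the 403 seen on the secondary site.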

How to set up and validate locally

  1. Configure Geo
  2. Configure both primary and secondary GDK to use Object Storage with Amazon S3 as a provider for uploads
  3. Stop both primary and secondary GDKs
  4. Switch to the master branch on both GDKs
  5. Start both primary and secondary GDKs
  6. Upload a file with a plus sign in the filename, for example Anc-offer__DVL___INT__Cpu_______33137+7500_______40637m, to an issue, merge request, or comment.
  7. Wait for the secondary GDK to catch up and sync the file.
  8. Start a Rails console on the secondary site GDK:
    1. bundle exec rails console

    2. Check the attributes of the latest upload registry. The last_sync_failure attribute should contain "Non-success HTTP response status code 403".

      > Geo::UploadRegistry.last
      => #<Geo::UploadRegistry:0x00000003145167d0
       id: 99,
       file_id: 98,
       created_at: Thu, 28 Aug 2025 18:14:05.213143000 UTC +00:00,
       retry_count: 7,
       retry_at: Thu, 28 Aug 2025 19:35:36.759849000 UTC +00:00,
       missing_on_primary: false,
       state: 3,
       last_synced_at: Thu, 28 Aug 2025 18:53:26.872753000 UTC +00:00,
       last_sync_failure: "Non-success HTTP response status code 403",
       verified_at: nil,
       verification_started_at: nil,
       verification_retry_at: nil,
       verification_state: 0,
       verification_retry_count: 0,
       verification_checksum: nil,
       verification_checksum_mismatched: nil,
       checksum_mismatch: false,
       verification_failure: nil>
  9. Stop both primary and secondary GDKs
  10. Switch to the 456901-skip-uri-normalization-for-presigned-urls branch on both GDKs
  11. Start both primary and secondary GDKs
  12. Repeat the steps above to upload the file and check the upload registry attributes. If you prefer, you can instead force a resync from a Rails console session on the secondary GDK:
    1. Geo::UploadRegistry.last.replicator.sync

    2. Check the upload registry attributes. The error message above should be gone, and state should be 2.

      > Geo::UploadRegistry.last
      => #<Geo::UploadRegistry:0x000000031907a280
       id: 99,
       file_id: 98,
       created_at: Thu, 28 Aug 2025 18:14:05.213143000 UTC +00:00,
       retry_count: 0,
       retry_at: nil,
       missing_on_primary: false,
       state: 2,
       last_synced_at: Thu, 28 Aug 2025 19:27:44.397304000 UTC +00:00,
       last_sync_failure: nil,
       verified_at: nil,
       verification_started_at: nil,
       verification_retry_at: nil,
       verification_state: 0,
       verification_retry_count: 0,
       verification_checksum: nil,
       verification_checksum_mismatched: nil,
       checksum_mismatch: false,
       verification_failure: nil>
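
The pass/fail criteria from the steps above can be summarized in a small plain-Ruby sketch. RegistrySnapshot and sync_succeeded? are hypothetical names for illustration only; this is not the Geo::UploadRegistry model, and the values are copied from the two console outputs above.

```ruby
# Hypothetical stand-in for the attributes printed by Geo::UploadRegistry.last.
RegistrySnapshot = Struct.new(:state, :retry_count, :last_sync_failure, keyword_init: true)

# Per the steps above: after the fix, state should be 2 and no failure is recorded.
def sync_succeeded?(snapshot)
  snapshot.state == 2 && snapshot.last_sync_failure.nil?
end

# Values observed on the master branch.
BEFORE_FIX = RegistrySnapshot.new(
  state: 3,
  retry_count: 7,
  last_sync_failure: "Non-success HTTP response status code 403"
)

# Values observed on the fix branch.
AFTER_FIX = RegistrySnapshot.new(state: 2, retry_count: 0, last_sync_failure: nil)

puts sync_succeeded?(BEFORE_FIX) # false
puts sync_succeeded?(AFTER_FIX)  # true
```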

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related issues

Related to #456901 (closed)

Edited by Douglas Barbosa Alexandre
