Geo blob downloads blocked by network filtering for tenants using S3 and AWS DNS
## Summary
Since 18.10.4, Geo blob downloads on the `Gitlab::HTTP` code path (gated by the `:geo_blob_download_with_gitlab_http` ops feature flag from !230361 (merged)) fail with `Gitlab::HTTP_V2::BlockedUrlError` ("URL is blocked: Requests to hosts and IP addresses not on the Allow List are denied") on instances where:

- `ApplicationSetting#deny_all_requests_except_allowed` is `true`, AND
- object stores are configured using default AWS S3 region-based DNS (no explicit `connection.endpoint`).
`ee/lib/gitlab/geo/replication/blob_downloader.rb#stream_from_url` passes `allow_object_storage: true`, which `Gitlab::HTTP` (`lib/gitlab/http.rb`) translates to `extra_allowed_uris = ObjectStoreSettings.enabled_endpoint_uris`. But `enabled_endpoint_uris` only returns object stores with an explicit endpoint:

```ruby
endpoint = object_store_setting.dig('connection', 'endpoint')
next unless endpoint # <-- silently drops default-DNS AWS S3
URI(endpoint)
```

Default AWS S3 configs derive the host from `region` + `remote_directory`, so this returns `[]`, the `allow_object_storage` bypass is a no-op, and `validate_resolved_uri` falls through to `validate_deny_all_requests_except_allowed!`.
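The filtering above can be demonstrated in isolation. This is a minimal, self-contained sketch (hypothetical settings shape and method signature, not the actual `ObjectStoreSettings` implementation) showing why a default-DNS AWS S3 store contributes nothing to the allowlist:

```ruby
require 'uri'

# Minimal reimplementation of the filtering in
# ObjectStoreSettings.enabled_endpoint_uris: stores without an explicit
# 'endpoint' key are skipped entirely.
def enabled_endpoint_uris(object_store_settings)
  object_store_settings.filter_map do |setting|
    next unless setting['enabled']

    endpoint = setting.dig('connection', 'endpoint')
    next unless endpoint # default-DNS AWS S3 is silently dropped here

    URI(endpoint)
  end
end

stores = [
  # Default AWS DNS: host derived from region + remote_directory, no endpoint.
  { 'enabled' => true,
    'remote_directory' => 'geo-artifacts',
    'connection' => { 'provider' => 'AWS', 'region' => 'us-east-1' } },
  # An explicit endpoint (e.g. MinIO) is the only shape that survives.
  { 'enabled' => true,
    'remote_directory' => 'uploads',
    'connection' => { 'endpoint' => 'http://127.0.0.1:9000' } }
]

enabled_endpoint_uris(stores)
# => one URI (http://127.0.0.1:9000); the AWS store vanished
```

With every store on default AWS DNS, the result is `[]` and `extra_allowed_uris` allows nothing.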
The old http-gem download path didn't use `Gitlab::HTTP_V2::UrlBlocker`, which masked this gap. The 18.10.4 backport is correct in routing through `Gitlab::HTTP`; it just exposes a pre-existing bug.
Distinct from but related to #544821 (closed).
## Impact
- Contributed to a customer-facing S1 affecting multiple GitLab Dedicated tenants. All blob types (job artifacts, uploads, LFS, packages, MR diffs, terraform state, pipeline artifacts, CI secure files) failed to replicate.
- Affects any instance running 18.10.4+ with the FF enabled, `deny_all_requests_except_allowed = true`, and default AWS S3 DNS for one or more object stores. This is the standard GitLab Dedicated configuration.
- Workaround (per-instance): add every object-store bucket hostname (`<bucket>.s3.<region>.amazonaws.com`) to `outbound_local_requests_whitelist`.
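As a sketch of the workaround computation (settings shape assumed from the object-store config above; `workaround_hostnames` is a hypothetical helper, not a GitLab API), these are the hostnames that would need to be added to `outbound_local_requests_whitelist`:

```ruby
# For every enabled default-DNS AWS store (no explicit endpoint), build the
# virtual-hosted-style bucket hostname that the allowlist must contain.
def workaround_hostnames(object_store_settings)
  object_store_settings.filter_map do |setting|
    conn = setting['connection'] || {}
    next unless setting['enabled'] && conn['provider'] == 'AWS' && conn['endpoint'].nil?

    "#{setting['remote_directory']}.s3.#{conn['region']}.amazonaws.com"
  end
end

workaround_hostnames(
  [{ 'enabled' => true, 'remote_directory' => 'geo-lfs',
     'connection' => { 'provider' => 'AWS', 'region' => 'eu-west-1' } }]
)
# => ["geo-lfs.s3.eu-west-1.amazonaws.com"]
```

Note this must be repeated for every bucket and re-checked whenever an object store is added.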
## Recommendation
Fix `ObjectStoreSettings.enabled_endpoint_uris` to derive the hostname when `connection.endpoint` is absent: for AWS, `https://<remote_directory>.s3.<region>.amazonaws.com`. Equivalent treatment is likely needed for Google Cloud Storage and Azure Blob default-DNS configurations.
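A sketch of the proposed fallback, assuming the same settings shape as above (method name and structure are illustrative, not the actual GitLab patch):

```ruby
require 'uri'

# Return the endpoint URI for one object store: prefer the explicit
# 'endpoint', otherwise derive the provider's default DNS hostname.
def endpoint_uri(object_store_setting)
  conn = object_store_setting['connection'] || {}
  return URI(conn['endpoint']) if conn['endpoint']

  case conn['provider']
  when 'AWS'
    URI("https://#{object_store_setting['remote_directory']}.s3.#{conn['region']}.amazonaws.com")
  # 'Google' and 'AzureRM' default-DNS configs would need analogous branches.
  end
end

endpoint_uri(
  'remote_directory' => 'geo-artifacts',
  'connection' => { 'provider' => 'AWS', 'region' => 'us-east-1' }
)
# => URI for https://geo-artifacts.s3.us-east-1.amazonaws.com
```

Returning `nil` for unknown providers preserves the current skip-if-unknown behaviour instead of raising on exotic configurations.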
Add a regression spec asserting that an enabled AWS S3 object store with `region` set and no `endpoint` produces a non-empty `enabled_endpoint_uris`, and that `BlobDownloader#execute` succeeds against it under `deny_all_requests_except_allowed: true`.
Backport target: 18.10 (alongside the existing `geo_blob_download_with_gitlab_http` fix).
## Verification
Reproduced and fix verified on a GitLab Dedicated tenant running 18.10.4-ee:
- Diagnostic state: FF enabled, `deny_all_requests_except_allowed? == true`, `ObjectStoreSettings.enabled_endpoint_uris == []`, affected bucket hostname missing from `outbound_local_requests_whitelist`.
- Synchronous `BlobDownloader#execute` raised `Gitlab::HTTP_V2::BlockedUrlError`. The backtrace originates in `Gitlab::HTTP_V2::NewConnectionAdapter#validate_url_with_proxy!`, called from `blob_downloader.rb#stream_from_url` via `download_file_with_gitlab_http`.
- Adding the bucket hostname to `outbound_local_requests_whitelist` and re-running `BlobDownloader#execute` succeeded.
## Related
- Related upstream issue (Geo site URL allowlisting): #544821 (closed)
- 18.10.4 fix that exposed this gap: !230361 (merged)
- Companion Geo timeout/JWT-reuse bug under the same FF: #598020 (closed)
- Companion container-registry sync bug (FFI corruption): #598388 (closed)