Geo: Patch rugged.so to hide libgit2's llhttp symbols
Summary
rugged.so globally exports 119 llhttp_* symbols from libgit2's statically-linked bundled llhttp. When llhttp-ffi (loaded transitively by the http gem) is mapped into the same process, the dynamic linker resolves its llhttp_* calls to rugged's symbols (different version, different ABI), corrupting parser callbacks. This is the root cause behind the Geo blob replication failures tracked in #595139 (closed) and identified in #597390 (closed).
The currently-shipping workaround swaps the blob download path to Gitlab::HTTP (HTTParty/Net::HTTP) behind the geo_blob_download_with_gitlab_http ops feature flag. It works, but it's introduced new bugs (#598020 (closed) — timeout ignored for >60s downloads; #598514 (closed) — allow_object_storage no-op for default-DNS S3) and leaves us maintaining two blob-download code paths (per #596934).
This issue tracks a permanent root-cause fix: patch rugged at build time to hide its statically-linked libgit2 symbols, so they no longer collide with llhttp-ffi. Once landed, the original http-gem path becomes safe to use and the Gitlab::HTTP feature flag can be evaluated for removal.
Root cause evidence
$ nm -D --defined-only $(gem which rugged | sed 's|/lib/rugged.rb|/rugged/rugged.so|') | grep -c llhttp
119
All 119 are exported as T (global text), making them visible to subsequent dlopen calls. Verified at three levels, with no symbol-hiding flags anywhere in the chain:
- rugged 1.9.0's ext/rugged/extconf.rb sets only $CFLAGS << " -g -O3 -Wall -Wno-comment". No -fvisibility=hidden, no --exclude-libs. cmake builds vendored libgit2 static via -DBUILD_SHARED_LIBS=OFF.
- GitHub code search across libgit2/rugged for "visibility" returns 0 hits.
- libgit2 v1.9.x CMakeLists.txt and the bundled deps/llhttp/CMakeLists.txt (built as add_library(llhttp OBJECT ...)) have no visibility flags either. Default visibility leaks all symbols from the static archive into rugged.so.
Zero upstream issues filed at libgit2/rugged about this — search of the repo for "llhttp" or "symbol/visibility/collision" returns nothing.
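The same count can be broken down by symbol type from Ruby. This is a minimal sketch, assuming `nm -D --defined-only` prints its usual `address type name` columns; the helper name `exported_by_prefix` and the sample text are made up for illustration (the addresses are not real rugged.so output).

```ruby
# Hypothetical helper: group symbols from `nm -D --defined-only` output
# by type letter, keeping only names that start with `prefix`.
def exported_by_prefix(nm_output, prefix)
  nm_output.each_line.with_object(Hash.new { |h, k| h[k] = [] }) do |line, acc|
    _addr, type, name = line.split
    next unless name&.start_with?(prefix)
    acc[type] << name
  end
end

# Illustrative sample only, not real rugged.so output.
SAMPLE_NM = <<~NM
  00000000001a2b30 T llhttp_init
  00000000001a2c10 T llhttp_execute
  00000000000f1000 T Init_rugged
  00000000002c0000 D some_data
NM

by_type = exported_by_prefix(SAMPLE_NM, "llhttp_")
puts by_type.fetch("T", []).size # prints 2
```

Against a real build, the input would be the output of the nm command shown above; any llhttp_* name under the T key is a global text symbol that a later dlopen can bind to.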
Reproducer
Pure Ruby — no GitLab, no TLS, no Rails. Reproduces on Omnibus, CNG, and Dedicated. Loading rugged before http is the only trigger.
# client.rb
require 'rugged' # comment this out and the bug disappears
require 'http'
puts HTTP.get('http://127.0.0.1:8080/').status.code
# => 34 (expected: 200)
Full reproducer (client.rb, server.py, Gemfile, README.md): https://gitlab.com/-/snippets/5982447
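For a quick local run without fetching the snippet, a stand-in server can be written with Ruby's standard library. This is a sketch under assumptions: it is not the snippet's server.py, `serve_once` is a hypothetical name, and whether this exact response triggers the corruption the same way as the snippet's server has not been checked here.

```ruby
require "socket"

# Stand-in for the snippet's server.py (an assumption, not a copy of it):
# answer one HTTP request on `server` with a fixed 200 response.
def serve_once(server, body = "ok\n")
  client = server.accept
  # Read and discard the request line and headers up to the blank line.
  while (line = client.gets)
    break if line == "\r\n"
  end
  client.write("HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n#{body}")
ensure
  client&.close
end

# To serve the URL used by client.rb:
#   server = TCPServer.new("127.0.0.1", 8080)
#   loop { serve_once(server) }
```

With the server loop running, client.rb should print 200 when the rugged require is commented out.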
Why this fix vs. the alternatives
We considered three alternatives raised in #596934 (Scott Murray):
- Bump http gem to 6.x (uses the llhttp direct C binding instead of llhttp-ffi on CRuby). Blocked by kubeclient 4.13.0 still pinning http >= 3.0, < 6.0 (despite the earlier expectation that 4.13.0 removed the cap; verified via the gemspec at the v4.13.0 tag). Upstream PR ManageIQ/kubeclient#687 is open with no movement since 2026-03-18. Even if unblocked, the llhttp gem also statically embeds llhttp.c sources without symbol-hiding flags, so the rugged collision could still corrupt the new path.
- Bump Omnibus libffi 3.2.1 → 3.4.x+ (static trampolines on Linux). Confirmed Omnibus pin and changelog claim. Irrelevant for CNG / Dedicated: CNG's Gemfile.lock resolves ffi (1.17.4-x86_64-linux-gnu), a precompiled binary that statically embeds vendored libffi at SHA 2263d6037f8e (Dec 2025), 9 commits past v3.5.2. Static trampolines have been there for years. Bumping Omnibus libffi changes nothing for K8s deployments.
- Drop rugged entirely. !218195 (closed) "Remove Rugged from Gemfile" has been open since 2026-01-08, but only removes the direct gem 'rugged', '~> 1.6' line. Rugged still ships transitively via licensee 9.18.0 → rugged (>= 0.24, < 2.0) (production dep, Gemfile:355) and undercover 0.8.5 (tooling). Replacing licensee is a bigger project per discussions in !196343 (closed).
The patch proposed here is the smallest fix that addresses the actual root cause for all deployment shapes.
Proposed fix
Add -Wl,--exclude-libs,ALL to $LDFLAGS in rugged's ext/rugged/extconf.rb. This tells the linker to mark all symbols from static archives (libgit2 → bundled llhttp/zlib/pcre/ntlmclient/xdiff) as local in rugged.so, while keeping Init_rugged and Ruby-API symbols (compiled from .o files, not archives) global.
The ffi gem already does exactly this for the same class of problem (ffi/ext/ffi_c/extconf.rb:54-56):
# Ensure libffi symbols aren't exported when using static libffi.
# This is to avoid interference with other gems like fiddle.
append_ldflags "-Wl,--exclude-libs,ALL"
Patch:
diff --git a/ext/rugged/extconf.rb b/ext/rugged/extconf.rb
--- a/ext/rugged/extconf.rb
+++ b/ext/rugged/extconf.rb
@@ -13,6 +13,15 @@ $CFLAGS << " -g"
$CFLAGS << " -O3" unless $CFLAGS[/-O\d/]
$CFLAGS << " -Wall -Wno-comment"
+# Hide symbols from statically-linked libgit2 (and its bundled llhttp,
+# pcre, zlib, ntlmclient, xdiff dependencies) so they don't collide
+# with other gems' shared libraries at runtime. Without this, rugged.so
+# globally exports ~119 llhttp_* symbols which corrupt callbacks in
+# llhttp-ffi (loaded by the http.rb gem). See:
+# https://gitlab.com/gitlab-org/gitlab/-/issues/597390
+# Equivalent pattern used by the ffi gem for libffi.
+$LDFLAGS << " -Wl,--exclude-libs,ALL" unless Gem.win_platform?
+
cmake_flags = [ ENV["CMAKE_FLAGS"] ]
cmake_flags << "-DBUILD_CLI=OFF"
cmake_flags << "-DBUILD_TESTS=OFF"
Init_rugged and Ruby-API symbols come from .o files (not the static libgit2 archive) and remain globally visible, so Ruby's dlopen still works.
Implementation plan
Apply the patch downstream via the existing gem-patch precedent already in use for grpc (FIPS):
omnibus-gitlab MR (omnibus-gitlab!9367 (closed))
- Add config/patches/rugged-hide-libgit2-symbols-1.9.0.patch
- Add config/software/ruby-rugged.rb modelled on the existing config/software/ruby-grpc.rb
- Add dependency 'ruby-rugged' to config/software/gitlab-rails.rb
CNG MR (gitlab-org/build/CNG!2916 (closed))
- Add shared/build-scripts/patches/rugged-hide-libgit2-symbols-1.9.0.patch
- Add shared/build-scripts/patch-rugged-symbols modelled on shared/build-scripts/reinstall-grpc-if-fips, without the FIPS gate
- Add invocation lines to gitlab-rails/Dockerfile.erb and gitlab-rails/Dockerfile.build.ubi.erb after bundle install
Upstream
File a corresponding issue/PR at libgit2/rugged so we can drop the downstream patch once it lands.
Verification
# Symbol check (primary regression test):
nm -D --defined-only /opt/gitlab/embedded/lib/ruby/gems/3.3.0/gems/rugged-1.9.0/lib/rugged/rugged.so | grep -c llhttp
# Expected: 0 (was 119 before patch)
Then re-run the reproducer above (expect 200), and disable geo_blob_download_with_gitlab_http on staging-ref to confirm the original http-gem path is now stable.
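The symbol check could also be scripted so that it guards against hiding too much as well as too little. This is a sketch, assuming the usual `address type name` layout of `nm -D --defined-only` output; `symbol_check` and the two sample strings are hypothetical and only illustrate the before and after shapes.

```ruby
# Hypothetical check: given `nm -D --defined-only` output for rugged.so,
# return a list of problems (an empty list means the patch behaved as intended).
def symbol_check(nm_output, hidden_prefix: "llhttp_", required: "Init_rugged")
  # The symbol name is the last column of each non-empty line.
  names = nm_output.each_line.map { |line| line.split.last }.compact
  problems = []
  leaked = names.count { |n| n.start_with?(hidden_prefix) }
  problems << "#{leaked} #{hidden_prefix}* symbols still exported" if leaked.positive?
  problems << "#{required} is no longer exported" unless names.include?(required)
  problems
end

# Illustrative inputs only, not real rugged.so output.
BEFORE_PATCH = "0000000000001000 T Init_rugged\n0000000000002000 T llhttp_init\n"
AFTER_PATCH  = "0000000000001000 T Init_rugged\n"

p symbol_check(BEFORE_PATCH) # => ["1 llhttp_* symbols still exported"]
p symbol_check(AFTER_PATCH)  # => []
```

Requiring Init_rugged in the same check covers the case where a linker flag hides the extension's entry point and Ruby can no longer load rugged.so.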
Related
- #595139 (closed): original FFI corruption bug
- #597390 (closed): root-cause analysis of the rugged/llhttp-ffi symbol collision
- #596934: Geo: Evaluate and consolidate blob download HTTP backend (this issue's fix unblocks consolidation)
- #598020 (closed): Gitlab::HTTP path 60s timeout bug
- #598514 (closed): Gitlab::HTTP path object-storage URL blocker bug
- !218195 (closed): Remove Rugged from Gemfile (open, transitively-blocked by licensee)
- !230361 (merged): Geo: Switch blob download to use GitLab::HTTP (the workaround MR)