Draft: Patch rugged to hide libgit2 symbols
What does this MR do?
Re-install the rugged gem with a patched ext/rugged/extconf.rb that adds -Wl,--exclude-libs,ALL to $LDFLAGS, so that symbols from libgit2's statically-linked archives (most notably the bundled llhttp_*) are marked local in rugged.so instead of being globally exported.
Why
Without this patch, rugged.so globally exports 119 llhttp_* symbols from libgit2's bundled llhttp:
$ nm -D --defined-only /opt/gitlab/.../rugged/rugged.so | grep -c llhttp
119

When llhttp-ffi (loaded transitively by the http gem) is mapped into the same Ruby process, the dynamic linker resolves its llhttp_* calls to rugged's symbols (different llhttp version, different ABI), corrupting the parser callbacks. Every Geo blob HTTP response then fails with status 34 (HPE_INVALID_STATUS).
This is the root cause of gitlab#595139 (closed), analyzed in detail in gitlab#597390 (closed). The currently shipping workaround switches the blob download path to Gitlab::HTTP behind the geo_blob_download_with_gitlab_http ops feature flag, but that workaround has surfaced new bugs (gitlab#598020 (closed), gitlab#598514 (closed)) and leaves us maintaining two code paths (gitlab-org/gitlab#596934). Patching rugged's symbol visibility removes the root cause for all deployment shapes (Omnibus, CNG, Dedicated, dev).
What changed
| File | Change |
|---|---|
| config/patches/rugged-hide-libgit2-symbols-1.9.0.patch | New: 5-line extconf.rb patch adding the LDFLAG. |
| config/software/ruby-rugged.rb | New: software definition that uses gem-patch to re-install the rugged gem with the patch applied. Modelled on config/software/ruby-grpc.rb. |
| config/projects/gitlab.rb | Added dependency 'ruby-rugged' after ruby-grpc, so the patched re-install runs after gitlab-rails' bundle install. |
The fix follows the pattern the ffi gem already uses for libffi (ffi/ext/ffi_c/extconf.rb#L54-L56):
# Ensure libffi symbols aren't exported when using static libffi.
# This is to avoid interference with other gems like fiddle.
append_ldflags "-Wl,--exclude-libs,ALL"

Init_rugged and Ruby-side API symbols are compiled from .o files (not the static libgit2 archive) and remain globally visible, so Ruby's dlopen still works.
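As a rough illustration of the flag handling described above, the following plain-Ruby sketch appends the linker flag to an LDFLAGS string without duplicating it. It is not the contents of the actual patch: `append_exclude_libs` is a hypothetical helper, and the example path is only illustrative. A real extconf.rb would operate on mkmf's `$LDFLAGS` global or call `append_ldflags` as the ffi gem does.

```ruby
# Hypothetical helper, NOT the actual patch contents.
EXCLUDE_LIBS_FLAG = "-Wl,--exclude-libs,ALL"

# Returns the given LDFLAGS string with the exclude-libs flag appended once.
# Assumes flags are whitespace-separated and contain no quoted spaces.
def append_exclude_libs(ldflags)
  flags = ldflags.to_s.split
  flags << EXCLUDE_LIBS_FLAG unless flags.include?(EXCLUDE_LIBS_FLAG)
  flags.join(" ")
end

# In an extconf.rb this could be applied as (assumption):
#   $LDFLAGS = append_exclude_libs($LDFLAGS)
puts append_exclude_libs("-L/opt/gitlab/embedded/lib")
# prints "-L/opt/gitlab/embedded/lib -Wl,--exclude-libs,ALL"
```

Making the helper idempotent keeps the flag from accumulating if the extension is configured more than once.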
Verification (test plan)
1. Symbol check (primary regression test):
nm -D --defined-only /opt/gitlab/embedded/lib/ruby/gems/3.3.0/gems/rugged-1.9.0/lib/rugged/rugged.so | grep -c llhttp
# Expected: 0 (was 119 before patch)
2. Reproducer from gitlab#597390 (closed):
Run snippet 5982447 inside the patched build — client.rb should print 200 (was 34).
3. Rugged smoke test:
require 'rugged'
puts Rugged::Reference.valid_name?("refs/heads/main") # => true
puts Rugged::Reference.valid_name?("refs/heads/..") # => false
4. End-to-end: disable geo_blob_download_with_gitlab_http on staging-ref, trigger a Geo blob sync. The original http-gem path should now succeed without FFI corruption.
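The symbol check in step 1 could also be scripted, for example as a post-build assertion. The sketch below is one possible shape, not part of this MR: `exported_symbols` is a hypothetical helper, the sample nm output is made up for illustration, and it matches on the `llhttp_` prefix rather than the substring match that `grep -c llhttp` performs.

```ruby
# Hypothetical helper: list exported symbols with a given prefix from the
# text produced by `nm -D --defined-only <library>`.
def exported_symbols(nm_output, prefix)
  nm_output.each_line.filter_map do |line|
    # Each line looks like "<address> <type> <name>"; the name is the last field.
    # Strip any "@VERSION" suffix that nm appends for versioned symbols.
    name = line.split.last.to_s.sub(/@.*\z/, "")
    name if name.start_with?(prefix)
  end
end

# Made-up sample resembling an unpatched build (addresses are illustrative).
sample = <<~NM
  0000000000012340 T Init_rugged
  00000000000a1b20 T llhttp_init
  00000000000a1c80 T llhttp_execute
NM

leaked = exported_symbols(sample, "llhttp_")
puts leaked.length
# prints 2

# Against a real build, the input could come from nm itself (assumption):
#   nm_output = IO.popen(["nm", "-D", "--defined-only", path_to_rugged_so], &:read)
```

On a patched build the returned list is expected to be empty, while `Init_rugged` should still appear in the raw nm output.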
Why not Scott's two alternatives in #596934
Considered both upstream options before choosing the downstream patch:
- Bump http gem to 6.x (uses llhttp direct C binding instead of llhttp-ffi on CRuby): blocked by kubeclient 4.13.0 still pinning http >= 3.0, < 6.0 (verified at the v4.13.0 tag's gemspec; the cap was not removed). ManageIQ/kubeclient#687 is open with no upstream movement since 2026-03-18. Even if unblocked, the llhttp gem also statically embeds llhttp.c sources without symbol-hiding flags, so the rugged collision could still corrupt the new path.
- Bump Omnibus libffi 3.2.1 → 3.4.x+: doesn't address the symbol collision and is irrelevant for CNG/Dedicated (those use the precompiled ffi gem which statically embeds vendored libffi >3.5.2 already).
The downstream rugged patch is the smallest fix that addresses the actual root cause across all deployment shapes.
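The kubeclient blocker mentioned above can be reproduced with RubyGems' own requirement matching. This is a small sketch using the constraint quoted in this description; the two candidate version strings are illustrative and were not taken from any lockfile.

```ruby
require "rubygems"

# Constraint quoted above for kubeclient 4.13.0's dependency on the http gem.
http_requirement = Gem::Requirement.new(">= 3.0", "< 6.0")

# Candidate versions are illustrative only.
%w[5.2.0 6.0.0].each do |candidate|
  allowed = http_requirement.satisfied_by?(Gem::Version.new(candidate))
  puts "http #{candidate}: #{allowed ? 'allowed' : 'rejected'}"
end
# prints:
#   http 5.2.0: allowed
#   http 6.0.0: rejected
```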
Related issues
- Closes part of gitlab#598564 (closed)
- Root cause analysis: gitlab#597390 (closed)
- Original symptom: gitlab#595139 (closed)
- Consolidation tracking: gitlab-org/gitlab#596934
- Companion CNG MR: gitlab-org/build/CNG!2916 (closed)
Checklist
See Definition of done.
Required
- MR title and description are up to date, accurate, and descriptive.
- MR targeting the appropriate branch.
- Latest Merge Result pipeline is green.
- When ready for review, MR is labeled workflow::ready for review per the Distribution MR workflow.
For GitLab team members
- The manual Trigger:ee-package jobs have a green pipeline running against latest commit.
- Since config/software and config/patches directories are changed, the build-package-on-all-os job within the Trigger:ee-package downstream pipeline must succeed.
- If CI configuration is changed, the branch must be pushed to dev.gitlab.org to confirm regular branch builds aren't broken.
Expected
- Test plan indicating conditions for success has been posted (see Verification above).
- Documentation created/updated. Not applicable: internal build-system change.
- Tests added. Not applicable: verified via nm symbol count post-build; no unit-testable Ruby surface.
- Integration tests added to GitLab QA. Not planned: Geo replication tests already cover the affected flow.