Fix Rack::Timeout on tags page for repositories with many GPG-signed tags

Summary

Visiting the tags page (/-/tags) for a repository with a large number of GPG-signed tags raises a Rack::Timeout::RequestTimeoutException once the 60s request timeout is reached. This has been observed on https://gitlab.com/whiskeyo/wireshark/-/tags, which has 1,026 tags.

Root Cause

The tag signature caching introduced in !207753 (merged) triggers GPG verification for all tags in the repository rather than just the tags on the current page. This is caused by two compounding issues:

1. Signature verification runs for ALL tags, not just the current page

In app/controllers/projects/tags_controller.rb, TagsFinder#execute is called with batch_load_signatures: true before Kaminari pagination is applied:

@tags = TagsFinder.new(@repository, tags_params).execute(batch_load_signatures: true)  # all 1,026 tags
@tags = Kaminari.paginate_array(@tags).page(tags_params[:page])  # then paginated to ~20

This means GPG signature verification is triggered for all 1,026 tags even though only ~20 are displayed.

2. Each GPG verification acquires a global mutex and creates a temporary keychain

Gitlab::Gpg.using_tmp_keychain (in lib/gitlab/gpg.rb) uses a process-wide Mutex (Gitlab::Gpg::MUTEX). For each tag, it:

  1. Acquires the mutex
  2. Creates a temporary directory via Dir.mktmpdir
  3. Sets GPGME::Engine.home_dir to the temp dir
  4. Imports the GPG key and performs cryptographic verification
  5. Cleans up the temp dir (with retries up to 1s in foreground via Retriable)

With 1,026 tags at ~60ms each, total wall-clock time reaches roughly 60s (1,026 × 0.06s ≈ 61.6s), hitting the Rack timeout.
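
The estimate can be reproduced with a few lines of Ruby. This is only a back-of-the-envelope sketch: the constants are the approximate figures quoted in this issue (not measurements taken by the script), and the helper name total_seconds is made up for illustration.

```ruby
# Rough cost model using the approximate figures from this issue.
TAG_COUNT       = 1_026  # tags in the affected repository
PER_TAG_SECONDS = 0.060  # assumed cost of one GPG verification (~60ms)
PER_PAGE        = 20     # assumed number of tags shown per page
RACK_TIMEOUT    = 60.0   # request timeout in seconds

# Hypothetical helper: total time when verifications run one after another,
# which is what the process-wide mutex enforces.
def total_seconds(count, per_item)
  count * per_item
end

all_tags_seconds = total_seconds(TAG_COUNT, PER_TAG_SECONDS)
one_page_seconds = total_seconds(PER_PAGE, PER_TAG_SECONDS)

puts format("all tags: %.2fs (timeout exceeded: %s)", all_tags_seconds, all_tags_seconds > RACK_TIMEOUT)
puts format("one page: %.2fs", one_page_seconds)
```

Under these assumptions, verifying every tag lands just above the timeout, while verifying a single page stays in the low seconds.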

Stack Trace

Rack::Timeout::RequestTimeoutException: Request ran for longer than 60000ms
  from tmpdir.rb:95:in `mkdir'
  from tmpdir.rb:95:in `block in mktmpdir'
  from tmpdir.rb:156:in `create'
  from tmpdir.rb:93:in `mktmpdir'
  from lib/gitlab/gpg.rb:99:in `optimistic_using_tmp_keychain'
  from lib/gitlab/gpg.rb:82:in `block (2 levels) in using_tmp_keychain'
  from lib/gitlab/gpg.rb:81:in `synchronize'
  from lib/gitlab/gpg.rb:81:in `block in using_tmp_keychain'
  from active_support/concurrency/share_lock.rb:187:in `yield_shares'
  from active_support/dependencies/interlock.rb:41:in `permit_concurrent_loads'
  from lib/gitlab/gpg.rb:80:in `using_tmp_keychain'
  from lib/gitlab/gpg/signature.rb:64:in `using_keychain'
  from lib/gitlab/gpg/signature.rb:28:in `verification_status'
  from lib/gitlab/gpg/tag.rb:23:in `attributes'
  from lib/gitlab/signed_tag.rb:114:in `build_cached_signature'
  from lib/gitlab/signed_tag.rb:44:in `each'
  from lib/gitlab/signed_tag.rb:44:in `filter_map'
  from lib/gitlab/signed_tag.rb:44:in `batch_write_cached_signatures'
  from lib/gitlab/signed_tag.rb:102:in `block in lazy_cached_signature'
  from batch_loader.rb:92:in `__ensure_batched'

Proposed Solutions

Solution 1: Only verify signatures for the current page

Move signature batch loading to after pagination in the controller:

# app/controllers/projects/tags_controller.rb
@tags = TagsFinder.new(@repository, tags_params).execute(batch_load_signatures: false)
@tags = Kaminari.paginate_array(@tags).page(tags_params[:page])
# Load signatures only for the ~20 visible tags
batch_load_tag_signature_data(@tags)

For the repository above, this reduces work from 1,026 to ~20 GPG verifications per request (~98% reduction).
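
The effect of the ordering can be illustrated without Rails or Kaminari. The following is a minimal sketch, not the controller code: Tag, CountingVerifier, and paginate are hypothetical stand-ins, and the page size of 20 is an assumption.

```ruby
# Hypothetical stand-ins; none of these are GitLab classes.
Tag = Struct.new(:name)

class CountingVerifier
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Stand-in for one expensive GPG verification; only counts invocations.
  def verify(_tag)
    @calls += 1
    :verified
  end
end

# Plain-Ruby pagination, standing in for Kaminari.paginate_array(...).page(...).
def paginate(items, page:, per_page: 20)
  items.slice((page - 1) * per_page, per_page) || []
end

all_tags = Array.new(1_026) { |i| Tag.new("v#{i}") }

# Current order: verify everything, then paginate.
verify_first = CountingVerifier.new
all_tags.each { |tag| verify_first.verify(tag) }
paginate(all_tags, page: 1)

# Proposed order: paginate, then verify only the visible tags.
paginate_first = CountingVerifier.new
paginate(all_tags, page: 1).each { |tag| paginate_first.verify(tag) }

puts "verify-then-paginate: #{verify_first.calls} verifications"
puts "paginate-then-verify: #{paginate_first.calls} verifications"
```

The page contents are identical in both cases; only the number of verifications differs.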

Solution 2: Reuse a single tmp keychain per batch

Wrap the entire batch verification in a single using_tmp_keychain call in SignedTag.batch_write_cached_signatures, instead of creating and destroying a temp keychain per tag. The nested using_tmp_keychain calls inside Signature#using_keychain would then reuse the existing keychain, thanks to the MUTEX.locked? && MUTEX.owned? check in lib/gitlab/gpg.rb. For a batch of N tags, this replaces N mktmpdir calls, N mutex acquisitions, and N cleanup cycles with one of each.
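
The reuse behaviour can be sketched in isolation. This is an illustrative model only, assuming the reentrancy check works as described above: TmpKeychain, dirs_created, and verify are hypothetical names, there is no GPGME involved, and the real Gitlab::Gpg code does more (for example setting the engine home dir and retrying cleanup).

```ruby
require "tmpdir"

# Hypothetical module that mirrors only the reentrancy idea, not Gitlab::Gpg itself.
module TmpKeychain
  MUTEX = Mutex.new
  @dirs_created = 0
  @current_dir = nil

  class << self
    attr_reader :dirs_created

    def reset!
      @dirs_created = 0
    end

    def using_tmp_keychain
      # Nested call on the thread that already holds the lock: reuse the keychain.
      return yield(@current_dir) if MUTEX.locked? && MUTEX.owned?

      MUTEX.synchronize do
        # Dir.mktmpdir with a block removes the directory when the block exits.
        Dir.mktmpdir("keychain") do |dir|
          @dirs_created += 1
          @current_dir = dir
          begin
            yield dir
          ensure
            @current_dir = nil
          end
        end
      end
    end
  end
end

# Stand-in for signature verification: just checks the keychain dir exists.
def verify(_tag, dir)
  File.directory?(dir)
end

tags = Array.new(50) { |i| "v#{i}" }

# One keychain per tag (current behaviour).
TmpKeychain.reset!
per_tag_results = tags.map { |tag| TmpKeychain.using_tmp_keychain { |dir| verify(tag, dir) } }
per_tag_dirs = TmpKeychain.dirs_created

# One keychain for the whole batch (proposed behaviour).
TmpKeychain.reset!
batched_results = TmpKeychain.using_tmp_keychain do
  tags.map { |tag| TmpKeychain.using_tmp_keychain { |dir| verify(tag, dir) } }
end
batched_dirs = TmpKeychain.dirs_created

puts "per-tag keychains: #{per_tag_dirs}, batched keychains: #{batched_dirs}"
```

The inner calls are unchanged between the two variants; the only difference is the outer wrapper, which is why the change could be confined to the batch method.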

Recommendation

Applying Solution 1 and Solution 2 together should resolve this issue with minimal risk. Solution 1 is the primary fix: it corrects the scope of verification so that only the current page is verified. Solution 2 is a performance optimization that also benefits the paginated case.
