POC: Optimize CI Variables Collection and Item for reduced allocations

What does this MR do and why?

This change introduces performance optimizations to reduce object allocations in the CI variable collection hot paths during pipeline creation. The optimizations are behind the feature flag ci_optimize_variables_collection_and_item.

Key optimizations:

  • Skip dup in Item.fabricate when input is already an Item
  • Skip symbolize_keys when hash keys are already symbols
  • Reuse Item objects directly in Collection#concat when source is a Collection
  • Reuse Item objects in Collection#initialize when initialized with existing Items
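
The four optimizations above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: the `Item` and `Collection` classes below are hypothetical stand-ins for `Gitlab::Ci::Variables::Collection::Item` and `Gitlab::Ci::Variables::Collection`, the `symbolized` helper is an invented name, and plain `transform_keys(&:to_sym)` stands in for ActiveSupport's `symbolize_keys`.

```ruby
# Hypothetical stand-ins, NOT the GitLab classes; illustration only.
class Item
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  # Build an Item from an Item or a Hash, copying as little as possible.
  def self.fabricate(resource)
    case resource
    when Item
      resource # already an Item: reuse it instead of dup-ing
    when Hash
      new(symbolized(resource))
    else
      raise ArgumentError, "unknown resource: #{resource.class}"
    end
  end

  # Return the hash itself when every key is already a Symbol; otherwise
  # build a symbol-keyed copy (what symbolize_keys would do).
  def self.symbolized(hash)
    hash.each_key { |key| return hash.transform_keys(&:to_sym) unless key.is_a?(Symbol) }
    hash
  end
end

class Collection
  include Enumerable

  def initialize(variables = [])
    @items = []
    concat(variables)
  end

  def each(&block)
    @items.each(&block)
  end

  def append(resource)
    @items << Item.fabricate(resource)
    self
  end

  def concat(resources)
    if resources.is_a?(Collection)
      # The source already holds Items: append them without re-fabricating.
      @items.concat(resources.to_a)
    else
      resources.each { |resource| append(resource) }
    end
    self
  end
end

base = Collection.new([{ key: 'CI', value: 'true' }])
merged = Collection.new.concat(base)
puts merged.first.equal?(base.first) # prints true: the Item object is shared
```

Sharing objects this way is only safe if Items and their attribute hashes are treated as immutable after creation; the sketch assumes that and does not enforce it.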

References

Related to #512876

Screenshots or screen recordings

Production data:

curl "https://gitlab.com/api/v4/projects/278964/merge_requests/225675/pipelines?performance_bar=flamegraph" \
  -X POST \
  -o flamegraph-225675-flamegraph.json \
  -H 'accept: application/json' \
  -H 'content-type: application/json' \
  -H 'cookie: ---' \
  -H 'x-csrf-token: ---'
==================================
  Mode: wall(10100)
  Samples: 2092 (53.88% miss rate)
  GC: 377 (18.02%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
       270  (12.9%)         270  (12.9%)     (marking)
       112   (5.4%)         112   (5.4%)     PG::Connection#exec_params
       109   (5.2%)         109   (5.2%)     (sweeping)
        69   (3.3%)          69   (3.3%)     IO#wait_readable
        60   (2.9%)          60   (2.9%)     GRPC::Core::Call#run_batch
       780  (37.3%)          56   (2.7%)     Class#new
        81   (3.9%)          52   (2.5%)     Gitlab::Ci::Variables::Collection#initialize
       135   (6.5%)          40   (1.9%)     Gitlab::Ci::Pipeline::Expression::Lexer#tokenize
       209  (10.0%)          39   (1.9%)     Gitlab::Ci::Variables::Collection#append
       231  (11.0%)          36   (1.7%)     ActiveModel::EachValidator#validate
        38   (1.8%)          30   (1.4%)     Kernel#dup
        25   (1.2%)          25   (1.2%)     StringScanner#scan
        72   (3.4%)          20   (1.0%)     Enumerable#flat_map
        40   (1.9%)          20   (1.0%)     ActiveSupport::HashWithIndifferentAccess#[]=
        20   (1.0%)          20   (1.0%)     Hash#[]=
        83   (4.0%)          18   (0.9%)     Delegator#method_missing
       297  (14.2%)          17   (0.8%)     Array#map
        17   (0.8%)          17   (0.8%)     OpenSSL::KDF.pbkdf2_hmac
      1659  (79.3%)          17   (0.8%)     Array#each
       107   (5.1%)          15   (0.7%)     Gitlab::Ci::Variables::Collection::Item.fabricate
        81   (3.9%)          14   (0.7%)     Flipper::Feature#enabled?
        14   (0.7%)          14   (0.7%)     Gitlab::Ci::Variables::Collection::Item#initialize
        17   (0.8%)          14   (0.7%)     Delegator#target_respond_to?
       403  (19.3%)          14   (0.7%)     Kernel#tap
        13   (0.6%)          13   (0.6%)     Symbol#to_s
        99   (4.7%)          13   (0.6%)     Gitlab::Ci::Pipeline::Expression::Parser#tree
       548  (26.2%)          13   (0.6%)     Enumerable#find
        13   (0.6%)          13   (0.6%)     Regexp#match
       284  (13.6%)          12   (0.6%)     ActiveSupport::Callbacks::Filters::Before#call
        11   (0.5%)          11   (0.5%)     String#match?

Flamegraph profiling of pipeline creation revealed significant CPU time spent in variable handling:

  • (marking) (GC marking redundant objects)
  • (sweeping) (GC sweeping redundant objects)
  • Gitlab::Ci::Variables::Collection#append
  • Gitlab::Ci::Variables::Collection#initialize
  • Gitlab::Ci::Variables::Collection::Item.fabricate

Benchmark

Test script

# RAILS_PROFILE=true GITALY_DISABLE_REQUEST_LIMITS=true rails console

require 'benchmark'
 
ActiveRecord::Base.logger = nil
@project = Project.find_by_full_path('gitlab-org/gitlab-clone')
@user = @project.first_owner
@merge_request = MergeRequest.find(153)
@merge_request_params = { allow_duplicate: true }

@n = 10

def run
  Gitlab::SafeRequestStore.ensure_request_store do
    pipeline = ::MergeRequests::CreatePipelineService.new(
      project: @project, current_user: @user, params: @merge_request_params
    ).execute(@merge_request).payload

    raise "stop!" unless pipeline.created_successfully?
  end
end
 
# warm up
run
run
Benchmark.bm do |benchmark|
  benchmark.report("nochange") do # label per run: "nochange", "old", or "new"
    @n.times do
      run
    end
  end
end

(@n + 2).times do
  ::Ci::DestroyPipelineService.new(@project, @user).execute(Ci::Pipeline.last)
end

Result

Ran with:

  1. No change, on master
              user     system      total        real
nochange 49.882477   0.787781  50.670258 ( 67.084736)
  2. Feature.remove(:ci_optimize_variables_collection_and_item)
         user     system      total        real
old 52.515001   0.864138  53.379139 ( 67.952408)
  3. Feature.enable(:ci_optimize_variables_collection_and_item)
         user     system      total        real
new 47.385361   0.808716  48.194077 ( 62.390174)

Summary of Benchmark

Per-pipeline improvement: ~470 ms of real time saved with the flag enabled compared to master (67.08 s vs 62.39 s over 10 runs).
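
For reference, the per-pipeline figure can be reproduced from the real-time column of the results above. This is only the arithmetic, with the timings copied from the benchmark output; the constant and method names are made up for the example.

```ruby
RUNS = 10 # @n in the test script
REAL_SECONDS = { nochange: 67.084736, old: 67.952408, new: 62.390174 }.freeze

# Milliseconds of real time saved per pipeline, candidate vs baseline.
def saved_ms_per_pipeline(baseline, candidate, runs: RUNS)
  ((REAL_SECONDS.fetch(baseline) - REAL_SECONDS.fetch(candidate)) / runs * 1000).round
end

puts saved_ms_per_pipeline(:nochange, :new) # prints 469 (flag enabled vs master)
puts saved_ms_per_pipeline(:old, :new)      # prints 556 (flag enabled vs flag removed)
```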

Note: Local test data has limited jobs, variables, etc. Production pipelines with more jobs and variables should see greater improvements due to reduced object allocations and GC pressure.

Memory

Method

# RAILS_PROFILE=true GITALY_DISABLE_REQUEST_LIMITS=true rails console

ActiveRecord::Base.logger = nil
@project = Project.find_by_full_path('gitlab-org/gitlab-clone')
@user = @project.first_owner
@merge_request = MergeRequest.find(153)
@merge_request_params = { allow_duplicate: true }

require 'memory_profiler'

def run
  Gitlab::SafeRequestStore.ensure_request_store do
    pipeline = ::MergeRequests::CreatePipelineService.new(
      project: @project, current_user: @user, params: @merge_request_params
    ).execute(@merge_request).payload

    raise "stop!" unless pipeline.created_successfully?
  end
end

run

report = MemoryProfiler.report do
  run
end

2.times do
  Gitlab::SafeRequestStore.ensure_request_store do
    ::Ci::DestroyPipelineService.new(@project, @user).execute(Ci::Pipeline.last)
  end
end

io = StringIO.new
report.pretty_print(io, detailed_report: true, scale_bytes: true, normalize_paths: true)
puts io.string

Overview

| Metric | Master | FF Disabled | FF Enabled | FF Removed | Delta (FF Removed vs Master) |
|---|---|---|---|---|---|
| Total allocated | 1.45 GB (14.27M obj) | 1.50 GB (15.66M obj) | 1.39 GB (13.77M obj) | 1.31 GB (12.94M obj) | -140 MB (-9.7%) |
| Total retained | 345.69 MB (2.88M obj) | 345.69 MB (2.88M obj) | 302.58 MB (2.34M obj) | 302.58 MB (2.34M obj) | -43.11 MB (-12.5%) |

Key Findings

1. FF disabled path adds overhead

With the feature flag disabled, total allocation increases by ~50 MB and ~1.4M objects compared to master. This is due to the feature flag check overhead and the structural changes in the code that still exist but fall back to the old behavior.

2. Retained memory reduction is significant

The 43 MB retained memory reduction (12.5%) is the most impactful metric; retained memory directly affects long-term memory growth of worker processes.

3. Removing the FF unlocks the full allocation savings

The FF-enabled path saves ~60 MB allocated vs master, but removing the FF entirely saves ~140 MB. The feature flag conditional branching and dual code paths cost ~80 MB in transient allocations.
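
The ~60 MB, ~140 MB, and ~80 MB figures follow directly from the "Total allocated" row of the overview. A small sketch of that arithmetic, assuming 1 GB = 1000 MB (which is what the overview's own -140 MB delta implies); the names are illustrative only.

```ruby
# "Total allocated" from the overview, converted to MB.
ALLOCATED_MB = { master: 1450, ff_disabled: 1500, ff_enabled: 1390, ff_removed: 1310 }.freeze

# MB of allocation saved by a variant compared to master.
def saved_vs_master(variant)
  ALLOCATED_MB.fetch(:master) - ALLOCATED_MB.fetch(variant)
end

enabled_saving = saved_vs_master(:ff_enabled)    # 60
removed_saving = saved_vs_master(:ff_removed)    # 140
flag_overhead  = removed_saving - enabled_saving # 80
percent        = (removed_saving * 100.0 / ALLOCATED_MB.fetch(:master)).round(1)

puts "enabled: #{enabled_saving} MB, removed: #{removed_saving} MB (#{percent}%), flag cost: #{flag_overhead} MB"
```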

4. Collection::Item is the biggest winner

| Metric | Master | FF Removed | Delta |
|---|---|---|---|
| Retained memory (Collection::Item class) | 71.20 MB | 28.10 MB | -43.10 MB (-60%) |
| Retained objects (Collection::Item class) | 890,010 | 351,199 | -538,811 (-60%) |
| Allocated memory (item.rb file) | 277.13 MB | 199.28 MB | -77.85 MB (-28%) |

5. symbolize_keys elimination

Allocations attributed to activesupport/core_ext/hash/keys.rb dropped from 68.68 MB to 12.40 MB (-56.28 MB, -82%). This confirms that most symbolize_keys calls in the hot path were eliminated.
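
The effect of skipping the key conversion can be observed in isolation with MRI's allocation counter. The snippet below is a rough sketch, not the MR's code: `always_copy`, `copy_if_needed`, and `allocations` are hypothetical helpers, `transform_keys(&:to_sym)` stands in for `symbolize_keys`, and the exact counts depend on the Ruby version.

```ruby
# Symbol-keyed input, as the hot path would typically receive.
ATTRS = { key: 'CI_JOB_ID', value: '42', public: true }.freeze

# Always builds a new Hash, like an unconditional symbolize_keys.
def always_copy(hash)
  hash.transform_keys(&:to_sym)
end

# Copies only when a non-Symbol key is present; otherwise returns the input.
def copy_if_needed(hash)
  hash.each_key { |key| return hash.transform_keys(&:to_sym) unless key.is_a?(Symbol) }
  hash
end

# Number of objects allocated while the block runs (MRI GC counter).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

copied  = allocations { 10_000.times { always_copy(ATTRS) } }
guarded = allocations { 10_000.times { copy_if_needed(ATTRS) } }
puts "always copy: #{copied} objects, copy if needed: #{guarded} objects"
```

The guard iterates with a block instead of calling `each_key.all?` without one, because the block-less form would allocate an Enumerator on every call.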

6. Hash allocations reduced

| Metric | Master | FF Removed | Delta |
|---|---|---|---|
| Allocated Hash memory | 546.02 MB | 489.73 MB | -56.29 MB (-10%) |
| Allocated Hash objects | 2,680,978 | 2,329,225 | -351,753 (-13%) |

Allocated Memory by Top Files

| File | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| variables/collection/item.rb | 277.13 MB | 277.13 MB | 255.57 MB | 199.28 MB |
| variables/collection.rb | 174.29 MB | 230.19 MB | 193.51 MB | 174.29 MB |
| marginalia/comment.rb | 133.04 MB | 133.04 MB | 133.04 MB | 133.04 MB |
| hash/keys.rb | 68.68 MB | 68.68 MB | 12.40 MB | 12.40 MB |
| entry/configurable.rb | 57.82 MB | 57.82 MB | 57.82 MB | 57.82 MB |

Retained Memory by Top Files

| File | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| variables/collection.rb | 126.69 MB | 126.69 MB | 126.69 MB | 126.69 MB |
| variables/collection/item.rb | 127.83 MB | 127.83 MB | 84.72 MB | 84.72 MB |
| deep_dup.rb | 10.56 MB | 10.56 MB | 10.56 MB | 10.56 MB |
| deep_mergeable.rb | 10.18 MB | 10.18 MB | 10.18 MB | 10.18 MB |
| variables/helpers.rb | 8.18 MB | 8.18 MB | 8.18 MB | 8.18 MB |

Allocated Memory by Top Classes

| Class | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| Hash | 546.02 MB | 546.02 MB | 489.74 MB | 489.73 MB |
| String | 273.55 MB | 273.53 MB | 273.53 MB | 273.53 MB |
| Array | 235.43 MB | 291.34 MB | 254.64 MB | 235.42 MB |
| Collection::Item | 106.03 MB | 106.03 MB | 28.21 MB | 28.21 MB |
| Enumerator | 4.15 MB | 4.15 MB | 60.43 MB | 4.15 MB |

Retained Memory by Top Classes

| Class | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| Hash | 142.51 MB | 142.51 MB | 142.51 MB | 142.51 MB |
| Collection::Item | 71.20 MB | 71.20 MB | 28.10 MB | 28.10 MB |
| Array | 55.71 MB | 55.71 MB | 55.71 MB | 55.71 MB |
| HashWithIndifferentAccess | 38.71 MB | 38.71 MB | 38.71 MB | 38.71 MB |
| String | 18.37 MB | 18.37 MB | 18.37 MB | 18.37 MB |

Conclusion

The optimization delivers a ~10% reduction in total memory allocation and a ~12.5% reduction in retained memory when the feature flag is removed. The primary mechanism is eliminating redundant symbolize_keys calls and reducing Collection::Item object overhead. The feature flag itself introduces measurable overhead (~80 MB allocated), so removing it after validation is recommended.

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Furkan Ayhan
