POC: Optimize CI Variables Collection and Item for reduced allocations
What does this MR do and why?
This change introduces performance optimizations to reduce object
allocations in the CI variable collection hot paths during pipeline
creation. The optimizations are behind the feature flag
ci_optimize_variables_collection_and_item.
Key optimizations:
- Skip `dup` in `Item.fabricate` when input is already an `Item`
- Skip `symbolize_keys` when hash keys are already symbols
- Reuse `Item` objects directly in `Collection#concat` when source is a `Collection`
- Reuse `Item` objects in `Collection#initialize` when initialized with existing `Item`s
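The first two fast paths can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: `SketchItem`, its `optimized:` keyword (standing in for the feature flag check), and the `symbol_keys?` helper are hypothetical names, and `transform_keys(&:to_sym)` is used as a plain-Ruby stand-in for `symbolize_keys`. It assumes Ruby 3 keyword-argument semantics.

```ruby
# Hypothetical, simplified stand-in for Collection::Item.
# The real class carries more fields and validation; only the fast paths are shown.
class SketchItem
  attr_reader :key, :value

  def initialize(key:, value:)
    @key = key
    @value = value
  end

  # optimized: true mimics the feature-flag-enabled path (an assumption of this sketch).
  def self.fabricate(resource, optimized: true)
    case resource
    when SketchItem
      # Fast path: hand back the existing item instead of copying it.
      optimized ? resource : resource.dup
    when Hash
      keys_ready = optimized && symbol_keys?(resource)
      # Fast path: skip the key-converting copy when every key is already a Symbol.
      new(**(keys_ready ? resource : resource.transform_keys(&:to_sym)))
    else
      raise ArgumentError, "Unknown resource class: #{resource.class}"
    end
  end

  # Block form of each_key, so the check itself does not allocate an Enumerator.
  def self.symbol_keys?(hash)
    hash.each_key { |key| return false unless key.is_a?(Symbol) }
    true
  end
end

item = SketchItem.fabricate({ key: 'CI', value: 'true' })
puts SketchItem.fabricate(item).equal?(item)                   # true: same object reused
puts SketchItem.fabricate(item, optimized: false).equal?(item) # false: a copy was made
```

Returning the same object is only safe if callers treat items as immutable after creation; that is an assumption of the sketch, not something verified here.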
References
Related to #512876
Screenshots or screen recordings
Production data:
curl "https://gitlab.com/api/v4/projects/278964/merge_requests/225675/pipelines?performance_bar=flamegraph" \
-X POST \
-o flamegraph-225675-flamegraph.json \
-H 'accept: application/json' \
-H 'content-type: application/json' \
-H 'cookie: ---' \
-H 'x-csrf-token: ---'
==================================
Mode: wall(10100)
Samples: 2092 (53.88% miss rate)
GC: 377 (18.02%)
==================================
TOTAL (pct) SAMPLES (pct) FRAME
270 (12.9%) 270 (12.9%) (marking)
112 (5.4%) 112 (5.4%) PG::Connection#exec_params
109 (5.2%) 109 (5.2%) (sweeping)
69 (3.3%) 69 (3.3%) IO#wait_readable
60 (2.9%) 60 (2.9%) GRPC::Core::Call#run_batch
780 (37.3%) 56 (2.7%) Class#new
81 (3.9%) 52 (2.5%) Gitlab::Ci::Variables::Collection#initialize
135 (6.5%) 40 (1.9%) Gitlab::Ci::Pipeline::Expression::Lexer#tokenize
209 (10.0%) 39 (1.9%) Gitlab::Ci::Variables::Collection#append
231 (11.0%) 36 (1.7%) ActiveModel::EachValidator#validate
38 (1.8%) 30 (1.4%) Kernel#dup
25 (1.2%) 25 (1.2%) StringScanner#scan
72 (3.4%) 20 (1.0%) Enumerable#flat_map
40 (1.9%) 20 (1.0%) ActiveSupport::HashWithIndifferentAccess#[]=
20 (1.0%) 20 (1.0%) Hash#[]=
83 (4.0%) 18 (0.9%) Delegator#method_missing
297 (14.2%) 17 (0.8%) Array#map
17 (0.8%) 17 (0.8%) OpenSSL::KDF.pbkdf2_hmac
1659 (79.3%) 17 (0.8%) Array#each
107 (5.1%) 15 (0.7%) Gitlab::Ci::Variables::Collection::Item.fabricate
81 (3.9%) 14 (0.7%) Flipper::Feature#enabled?
14 (0.7%) 14 (0.7%) Gitlab::Ci::Variables::Collection::Item#initialize
17 (0.8%) 14 (0.7%) Delegator#target_respond_to?
403 (19.3%) 14 (0.7%) Kernel#tap
13 (0.6%) 13 (0.6%) Symbol#to_s
99 (4.7%) 13 (0.6%) Gitlab::Ci::Pipeline::Expression::Parser#tree
548 (26.2%) 13 (0.6%) Enumerable#find
13 (0.6%) 13 (0.6%) Regexp#match
284 (13.6%) 12 (0.6%) ActiveSupport::Callbacks::Filters::Before#call
11 (0.5%) 11 (0.5%) String#match?
Flamegraph profiling of pipeline creation revealed significant CPU time spent in variable handling:
- `(marking)`: GC marking redundant objects (12.9% of samples)
- `(sweeping)`: GC sweeping redundant objects (5.2% of samples)
- `Gitlab::Ci::Variables::Collection#append` (10.0% total)
- `Gitlab::Ci::Variables::Collection#initialize` (3.9% total)
- `Gitlab::Ci::Variables::Collection::Item.fabricate` (5.1% total)
Benchmark
Test script
# RAILS_PROFILE=true GITALY_DISABLE_REQUEST_LIMITS=true rails console
require 'benchmark'
ActiveRecord::Base.logger = nil
@project = Project.find_by_full_path('gitlab-org/gitlab-clone')
@user = @project.first_owner
@merge_request = MergeRequest.find(153)
@merge_request_params = { allow_duplicate: true }
@n = 10
def run
  Gitlab::SafeRequestStore.ensure_request_store do
    pipeline = ::MergeRequests::CreatePipelineService.new(
      project: @project, current_user: @user, params: @merge_request_params
    ).execute(@merge_request).payload
    raise "stop!" unless pipeline.created_successfully?
  end
end
# warm up
run
run
Benchmark.bm do |benchmark|
  benchmark.report("nochange") do # nochange, old, and new
    @n.times do
      run
    end
  end
end
# clean up the pipelines created by the 2 warm-up runs and the @n benchmark runs
(@n + 2).times do
  ::Ci::DestroyPipelineService.new(@project, @user).execute(Ci::Pipeline.last)
end
Result
Ran with:
- No change, on master
user system total real
nochange 49.882477 0.787781 50.670258 ( 67.084736)
- `Feature.remove(:ci_optimize_variables_collection_and_item)`
user system total real
old 52.515001 0.864138 53.379139 ( 67.952408)
- `Feature.enable(:ci_optimize_variables_collection_and_item)`
user system total real
new 47.385361 0.808716 48.194077 ( 62.390174)
Summary of Benchmark
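Per-pipeline improvement: ~470 ms real time saved versus master (67.08 s → 62.39 s over 10 runs), and ~560 ms versus the flag-disabled path (67.95 s → 62.39 s).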
Note: Local test data has limited jobs, variables, etc. Production pipelines with more jobs and variables should see greater improvements due to reduced object allocations and GC pressure.
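For reference, the per-pipeline figure comes from dividing the difference in the `real` column by the number of runs. A small sketch of that arithmetic, using only the numbers reported above (the helper name `saved_ms_per_pipeline` is made up for illustration):

```ruby
RUNS = 10 # @n in the test script

# "real" column from the three benchmark runs, in seconds.
real_seconds = { nochange: 67.084736, old: 67.952408, new: 62.390174 }

# Average wall-clock time saved per pipeline, in whole milliseconds.
def saved_ms_per_pipeline(baseline, candidate, runs)
  ((baseline - candidate) / runs * 1000).round
end

vs_master = saved_ms_per_pipeline(real_seconds[:nochange], real_seconds[:new], RUNS)
vs_flag_off = saved_ms_per_pipeline(real_seconds[:old], real_seconds[:new], RUNS)

puts "vs master:   #{vs_master} ms" # 469 ms
puts "vs flag off: #{vs_flag_off} ms" # 556 ms
```

These are single-run wall-clock numbers from a local environment, so they are best read as an indication rather than a precise measurement.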
Memory
Method
# RAILS_PROFILE=true GITALY_DISABLE_REQUEST_LIMITS=true rails console
ActiveRecord::Base.logger = nil
@project = Project.find_by_full_path('gitlab-org/gitlab-clone')
@user = @project.first_owner
@merge_request = MergeRequest.find(153)
@merge_request_params = { allow_duplicate: true }
require 'memory_profiler'
def run
  Gitlab::SafeRequestStore.ensure_request_store do
    pipeline = ::MergeRequests::CreatePipelineService.new(
      project: @project, current_user: @user, params: @merge_request_params
    ).execute(@merge_request).payload
    raise "stop!" unless pipeline.created_successfully?
  end
end
# warm up
run
report = MemoryProfiler.report do
  run
end
# clean up the pipelines created by the warm-up run and the profiled run
2.times do
  Gitlab::SafeRequestStore.ensure_request_store do
    ::Ci::DestroyPipelineService.new(@project, @user).execute(Ci::Pipeline.last)
  end
end
io = StringIO.new
report.pretty_print(io, detailed_report: true, scale_bytes: true, normalize_paths: true)
puts io.string
Overview
| Metric | Master | FF Disabled | FF Enabled | FF Removed | Delta (FF Removed vs Master) |
|---|---|---|---|---|---|
| Total allocated | 1.45 GB (14.27M obj) | 1.50 GB (15.66M obj) | 1.39 GB (13.77M obj) | 1.31 GB (12.94M obj) | -140 MB (-9.7%) |
| Total retained | 345.69 MB (2.88M obj) | 345.69 MB (2.88M obj) | 302.58 MB (2.34M obj) | 302.58 MB (2.34M obj) | -43.11 MB (-12.5%) |
Key Findings
1. FF disabled path adds overhead
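With the feature flag disabled, total allocation increases by ~50 MB and ~1.4M objects compared to master. This is due to the feature flag check overhead and the structural changes in the code that still exist but fall back to the old behavior. In the breakdown tables below, the increase shows up in variables/collection.rb (174.29 MB → 230.19 MB) and in Array allocations (235.43 MB → 291.34 MB).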
2. Retained memory reduction is significant
The 43 MB retained memory reduction (12.5%) is the most impactful metric; retained memory directly affects long-term memory growth of worker processes.
3. Removing the FF unlocks the full allocation savings
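The FF-enabled path saves ~60 MB allocated vs master, but removing the FF entirely saves ~140 MB. The feature flag conditional branching and dual code paths cost ~80 MB in transient allocations. In the per-class table below, the FF-enabled path allocates 60.43 MB of Enumerator objects, compared with 4.15 MB in every other configuration.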
4. Collection::Item is the biggest winner
| Metric | Master | FF Removed | Delta |
|---|---|---|---|
| Retained memory (Collection::Item class) | 71.20 MB | 28.10 MB | -43.10 MB (-60%) |
| Retained objects (Collection::Item class) | 890,010 | 351,199 | -538,811 (-60%) |
| Allocated memory (item.rb file) | 277.13 MB | 199.28 MB | -77.85 MB (-28%) |
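One way to picture the drop in retained `Collection::Item` objects: when a collection is built from, or concatenated with, another collection, the existing items can be shared instead of copied, so both collections keep the same objects alive. The sketch below is illustrative only. `SketchVar`, `SketchCollection`, and the `reuse:` keyword (standing in for the feature flag) are hypothetical and much simpler than the real classes.

```ruby
# Hypothetical, simplified stand-ins; not the real GitLab classes.
SketchVar = Struct.new(:key, :value, keyword_init: true)

class SketchCollection
  include Enumerable

  # reuse: true mimics the optimized path (an assumption of this sketch).
  def initialize(items = [], reuse: true)
    @reuse = reuse
    @items = []
    items.each { |item| append(item) }
  end

  def append(resource)
    @items << fabricate(resource)
    self
  end

  def concat(resources)
    resources.each { |resource| append(resource) }
    self
  end

  def each(&block)
    @items.each(&block)
  end

  private

  def fabricate(resource)
    case resource
    when SketchVar then @reuse ? resource : resource.dup # share or copy
    when Hash then SketchVar.new(**resource)
    else raise ArgumentError, "Unknown resource class: #{resource.class}"
    end
  end
end

base = SketchCollection.new([{ key: 'A', value: '1' }, { key: 'B', value: '2' }])

shared = SketchCollection.new(base)               # keeps the two existing items
copied = SketchCollection.new(base, reuse: false) # allocates two more items

puts shared.first.equal?(base.first) # true
puts copied.first.equal?(base.first) # false
```

With copying, every derived collection retains its own set of items. With sharing, the item count stays flat no matter how many collections reference them.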
5. symbolize_keys elimination
activesupport/core_ext/hash/keys.rb dropped from 68.68 MB → 12.40 MB allocated (-56.28 MB, -82%). This confirms the elimination of massive symbolize_keys calls in the hot path.
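The effect can be reproduced in isolation with a rough allocation count. The sketch below uses `GC.stat(:total_allocated_objects)` from core Ruby and `transform_keys(&:to_sym)` as a plain-Ruby stand-in for ActiveSupport's `symbolize_keys`. The helper names are hypothetical, and the exact counts will vary by Ruby version, so no output is shown.

```ruby
# Counts objects allocated while the block runs (approximate; GC.stat is process-wide).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Stand-in for symbolize_keys: always builds a new Hash.
def symbolize_always(hash)
  hash.transform_keys(&:to_sym)
end

# Returns the input untouched when every key is already a Symbol.
def symbolize_if_needed(hash)
  hash.each_key { |key| return symbolize_always(hash) unless key.is_a?(Symbol) }
  hash
end

symbol_keyed = { key: 'CI', value: 'true' }
string_keyed = { 'key' => 'CI', 'value' => 'true' }

slow = allocations { 10_000.times { symbolize_always(symbol_keyed) } }
fast = allocations { 10_000.times { symbolize_if_needed(symbol_keyed) } }

puts "always symbolize:    #{slow} objects"
puts "symbolize if needed: #{fast} objects"
puts symbolize_if_needed(string_keyed).keys.inspect # [:key, :value]
```

The guard costs one pass over the keys, which is cheap for the small hashes that describe a variable. It pays off only when most inputs already have symbol keys, which is what the numbers above suggest for this hot path.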
6. Hash allocations reduced
| Metric | Master | FF Removed | Delta |
|---|---|---|---|
| Allocated Hash memory | 546.02 MB | 489.73 MB | -56.29 MB (-10%) |
| Allocated Hash objects | 2,680,978 | 2,329,225 | -351,753 (-13%) |
Allocated Memory by Top Files
| File | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| variables/collection/item.rb | 277.13 MB | 277.13 MB | 255.57 MB | 199.28 MB |
| variables/collection.rb | 174.29 MB | 230.19 MB | 193.51 MB | 174.29 MB |
| marginalia/comment.rb | 133.04 MB | 133.04 MB | 133.04 MB | 133.04 MB |
| hash/keys.rb | 68.68 MB | 68.68 MB | 12.40 MB | 12.40 MB |
| entry/configurable.rb | 57.82 MB | 57.82 MB | 57.82 MB | 57.82 MB |
Retained Memory by Top Files
| File | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| variables/collection.rb | 126.69 MB | 126.69 MB | 126.69 MB | 126.69 MB |
| variables/collection/item.rb | 127.83 MB | 127.83 MB | 84.72 MB | 84.72 MB |
| deep_dup.rb | 10.56 MB | 10.56 MB | 10.56 MB | 10.56 MB |
| deep_mergeable.rb | 10.18 MB | 10.18 MB | 10.18 MB | 10.18 MB |
| variables/helpers.rb | 8.18 MB | 8.18 MB | 8.18 MB | 8.18 MB |
Allocated Memory by Top Classes
| Class | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| Hash | 546.02 MB | 546.02 MB | 489.74 MB | 489.73 MB |
| String | 273.55 MB | 273.53 MB | 273.53 MB | 273.53 MB |
| Array | 235.43 MB | 291.34 MB | 254.64 MB | 235.42 MB |
| Collection::Item | 106.03 MB | 106.03 MB | 28.21 MB | 28.21 MB |
| Enumerator | 4.15 MB | 4.15 MB | 60.43 MB | 4.15 MB |
Retained Memory by Top Classes
| Class | Master | FF Disabled | FF Enabled | FF Removed |
|---|---|---|---|---|
| Hash | 142.51 MB | 142.51 MB | 142.51 MB | 142.51 MB |
| Collection::Item | 71.20 MB | 71.20 MB | 28.10 MB | 28.10 MB |
| Array | 55.71 MB | 55.71 MB | 55.71 MB | 55.71 MB |
| HashWithIndifferentAccess | 38.71 MB | 38.71 MB | 38.71 MB | 38.71 MB |
| String | 18.37 MB | 18.37 MB | 18.37 MB | 18.37 MB |
Conclusion
The optimization delivers a ~10% reduction in total memory allocation and a ~12.5% reduction in retained memory when the feature flag is removed. The primary mechanism is eliminating redundant symbolize_keys calls and reducing Collection::Item object overhead. The feature flag itself introduces measurable overhead (~80 MB allocated), so removing it after validation is recommended.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.