Skip to content

Reduce memory allocations when iterating over an Enumerator

Stan Hu requested to merge sh-reduce-enum-memory-allocations into master

What does this MR do and why?

!88882 (merged) introduced a monkey patch for Enumeration#next to improve stack traces for Gitaly calls due to https://bugs.ruby-lang.org/issues/16829.

This patch relies on checking that gitlab_patch_backtrace_marker is present in the stack trace. However, for a Rust extension like prometheus-client-mmap, the Ruby interpreter omits the details of the calls made inside the extension. For example, in omnibus-gitlab#8373 (comment 1748497473), we see StopIteration is raised when the extension iterates through all strings in its WeakMap, but the backtrace does not contain the expected signature since Enumerator#next is called from the Rust code. As a result, the backtrace would always be unnecessarily appended to the StopIteration exception, resulting in millions of wasted memory allocations.

This commit reduces memory allocations in two ways:

  1. Use a constant for the regular expression. This eliminates the need to call parse_regexp in the Ruby interpreter repeatedly each time Enumerator#next is called.

  2. Don't append stack traces for StopIteration. A Enumerator raises StopIteration when the end is reached, and Kernel#loop rescues this exception. Previously a lot of care went into ensuring that nested enums preserved the full backtrace when StopIteration occurred, but this generally isn't needed for debugging. If a C or Rust extension iterates through an array repeatedly, we saw millions of temporary objects being created to build the complete stack trace for no real use.

Relates to #439920 (closed)

MR acceptance checklist

How to set up and validate locally

To ensure the complete stack trace was still present when a Gitaly timeout occurred (!88882 (comment 994128140)):

  1. I hacked Gitaly to take a long time to return the results of a diff:
diff --git a/internal/gitaly/service/diff/commit_diff.go b/internal/gitaly/service/diff/commit_diff.go
index c9d97e7eb..a2f71a3b5 100644
--- a/internal/gitaly/service/diff/commit_diff.go
+++ b/internal/gitaly/service/diff/commit_diff.go
@@ -2,6 +2,7 @@ package diff
 
 import (
 	"fmt"
+	"time"
 
 	"gitlab.com/gitlab-org/gitaly/v16/internal/git"
 	"gitlab.com/gitlab-org/gitaly/v16/internal/gitaly/diff"
@@ -13,6 +14,8 @@ import (
 func (s *server) CommitDiff(in *gitalypb.CommitDiffRequest, stream gitalypb.DiffService_CommitDiffServer) error {
 	ctx := stream.Context()
 
+	time.Sleep(120 * time.Minute)
+
 	s.logger.WithFields(log.Fields{
 		"LeftCommitId":  in.LeftCommitId,
 		"RightCommitId": in.RightCommitId,
  1. gdk restart gitaly

  2. I hacked my Gitaly client to print the full backtrace of the error:

diff --git a/lib/gitlab/gitaly_client/call.rb b/lib/gitlab/gitaly_client/call.rb
index 37d3921d6d56..b3b7dd4065ed 100644
--- a/lib/gitlab/gitaly_client/call.rb
+++ b/lib/gitlab/gitaly_client/call.rb
@@ -48,6 +48,7 @@ def instrument_stream(response)
           end
         rescue ::GRPC::BadStatus => err
           set_gitaly_error_metadata(err)
+          puts err.backtrace.join("\n")
           raise err
         ensure
           store_timings
  1. Then I ran bin/rails console and ran:
Project.first.commit.diffs.diff_files

This resulted in:

/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/grpc-1.60.0-arm64-darwin/src/ruby/lib/grpc/generic/active_call.rb:29:in `check_status'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/grpc-1.60.0-arm64-darwin/src/ruby/lib/grpc/generic/active_call.rb:186:in `attach_status_results_and_complete_call'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/grpc-1.60.0-arm64-darwin/src/ruby/lib/grpc/generic/active_call.rb:175:in `receive_and_check_status'
/Users/stanhu/gdk-ee/gitlab/config/initializers/grpc_patch.rb:21:in `receive_and_check_status'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/grpc-1.60.0-arm64-darwin/src/ruby/lib/grpc/generic/active_call.rb:338:in `each_remote_read_then_finish'
bin/rails:in `each'
/Users/stanhu/gdk-ee/gitlab/config/initializers/enumerator_next_patch.rb:11:in `block (2 levels) in <module:EnumeratorNextPatch>'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/call.rb:45:in `block (3 levels) in instrument_stream'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/call.rb:61:in `recording_request'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/call.rb:45:in `block (2 levels) in instrument_stream'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/call.rb:44:in `loop'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/call.rb:44:in `block in instrument_stream'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/diff_stitcher.rb:17:in `each'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/diff_stitcher.rb:17:in `each'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/gitaly_client/diff_stitcher.rb:17:in `each'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/git/diff_collection.rb:168:in `each_gitaly_patch'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/git/diff_collection.rb:59:in `each'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/git/diff_collection.rb:131:in `each_with_index'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/git/diff_collection.rb:131:in `decorate!'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/diff/file_collection/base.rb:41:in `block in raw_diff_files'
/Users/stanhu/gdk-ee/gitlab/gems/gitlab-utils/lib/gitlab/utils/strong_memoize.rb:34:in `strong_memoize'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/diff/file_collection/base.rb:40:in `raw_diff_files'
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/diff/file_collection/base.rb:36:in `diff_files'
(pry):1:in `__pry__'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:290:in `eval'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:290:in `evaluate_ruby'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:659:in `handle_line'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:261:in `block (2 levels) in eval'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:260:in `catch'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:260:in `block in eval'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:259:in `catch'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_instance.rb:259:in `eval'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/repl.rb:77:in `block in repl'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/repl.rb:67:in `loop'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/repl.rb:67:in `repl'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/repl.rb:38:in `block in start'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/input_lock.rb:61:in `__with_ownership'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/input_lock.rb:78:in `with_ownership'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/repl.rb:38:in `start'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/repl.rb:15:in `start'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-byebug-3.10.1/lib/pry-byebug/pry_ext.rb:15:in `start_with_pry_byebug'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-shell-0.6.4/lib/pry/shell/patches/pry_byebug.rb:67:in `start_with_pry_byebug'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/pry-0.14.2/lib/pry/pry_class.rb:194:in `start'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/railties-7.0.8/lib/rails/commands/console/console_command.rb:74:in `start'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/railties-7.0.8/lib/rails/commands/console/console_command.rb:19:in `start'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/railties-7.0.8/lib/rails/commands/console/console_command.rb:106:in `perform'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/thor-1.3.0/lib/thor/command.rb:28:in `run'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/thor-1.3.0/lib/thor/invocation.rb:127:in `invoke_command'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/thor-1.3.0/lib/thor.rb:527:in `dispatch'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/railties-7.0.8/lib/rails/command/base.rb:87:in `perform'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/railties-7.0.8/lib/rails/command.rb:48:in `invoke'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/railties-7.0.8/lib/rails/commands.rb:18:in `<main>'
<internal:/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
<internal:/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
/Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.17.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
bin/rails:4:in `<main>'
GRPC::DeadlineExceeded: 4:Deadline Exceeded. debug_error_string:{UNKNOWN:Error received from peer  {created_time:"2024-01-31T14:38:49.007493-08:00", grpc_status:4, grpc_message:"Deadline Exceeded"}}
from /Users/stanhu/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/grpc-1.60.0-arm64-darwin/src/ruby/lib/grpc/generic/active_call.rb:29:in `check_status'
Edited by Stan Hu

Merge request reports