Add support for collecting memory allocator statistics
What does this MR do and why?
The Ruby VM (as well as C extensions of gems) uses malloc to satisfy requests for more memory, such as when growing the Ruby heap to store more objects. When deployed via Omnibus or Charts, we do not use the GNU libc allocator, but jemalloc, an alternative malloc implementation that aims to optimize memory allocations in multi-threaded environments. The choice and configuration of the memory allocator can substantially affect application performance as well as long-term memory growth and fragmentation. To improve insight into how it operates, this MR adds a new Ruby interface, which can be used to:
- Collect memory allocator statistics and return them as a string.
- Write memory allocator statistics to a file.
The output format is either JSON or a human-readable tabular format. This MR only adds the basic implementation; the data is not yet collected anywhere.
We do not produce these stats ourselves; instead, we use a C function in the allocator library, malloc_stats_print. Since GitLab is a Ruby program, we need a bridge to make this call into C-land. Ruby ships with Fiddle, which in turn is based on libffi, to do exactly this. It is a two-way bridge between Ruby and C invocations.
So at a high level, what this MR does is:
- Map the C call to malloc_stats_print to a Ruby function.
- Since malloc_stats_print outputs to stderr by default, which is of limited use to us, we intercept its output buffer through a Ruby closure.
- Finally, we return the output string collected this way or write it to a file.
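The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code added in this MR: the module name JemallocStatsSketch and its method names are hypothetical, and the sketch assumes that malloc_stats_print is resolvable among the libraries already loaded into the process and that the "J" option selects JSON output, as documented for jemalloc 5.x.

```ruby
require 'fiddle'

# Hypothetical module name, used only for this sketch.
module JemallocStatsSketch
  # C signature being mapped (from the jemalloc API):
  #   void malloc_stats_print(void (*write_cb)(void *, const char *),
  #                           void *cbopaque, const char *opts);
  ARG_TYPES = [Fiddle::TYPE_VOIDP, Fiddle::TYPE_VOIDP, Fiddle::TYPE_VOIDP].freeze

  # Resolve the symbol among the libraries already loaded into this process.
  # Returns nil when it cannot be found (e.g. jemalloc is not preloaded).
  def self.lookup
    Fiddle::Handle::DEFAULT['malloc_stats_print']
  rescue Fiddle::DLError
    nil
  end

  # Returns the report as a String, or nil when the allocator does not
  # provide malloc_stats_print.
  def self.stats(json: false)
    address = lookup
    return nil unless address

    buffer = +''
    # The allocator hands us the report in fragments; each fragment arrives
    # as a pointer to a NUL-terminated C string, which we append to a buffer
    # instead of letting it go to stderr.
    write_cb = Fiddle::Closure::BlockCaller.new(
      Fiddle::TYPE_VOID, [Fiddle::TYPE_VOIDP, Fiddle::TYPE_VOIDP]
    ) { |_opaque, fragment| buffer << fragment.to_s }

    function = Fiddle::Function.new(address, ARG_TYPES, Fiddle::TYPE_VOID)
    # Assumption: "J" requests JSON; nil (NULL opts) requests the text table.
    function.call(write_cb, nil, json ? 'J' : nil)
    buffer
  end
end

report = JemallocStatsSketch.stats(json: true)
puts(report.nil? ? 'malloc_stats_print not available' : "collected #{report.bytesize} bytes")
```

Keeping a reference to the closure in a local variable for the duration of the call matters here, since the native function invokes it while it runs.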
Risk & performance
These reports are not yet collected automatically or even available from outside the application. One must invoke these functions directly, e.g. via rbtrace, or integrate them with e.g. an API endpoint or a signal handler to produce them. This means there is no immediate risk in deploying this change. In &8105 we are looking for ways to make this available in a safe manner.
Other considerations:
- What if jemalloc is not used? Whenever libjemalloc.so is not on LD_PRELOAD (i.e. GitLab is not using it), these functions are no-ops and return nil.
- How do these calls affect performance? Fiddle is a libffi wrapper. For the libffi function call, Ruby releases the GVL. This means we won't be blocking other Ruby threads for the duration of the native call into malloc_stats_print. Any time spent at the Ruby VM level will require the GVL, however.
As far as runtime goes, it is more interesting to look at the JSON report, since it is much larger. On my ThinkPad X1, it takes about 750 ms to dump it to a file, though this is against an idle development system:
git@b23df5e55262:~/gitlab$ bundle exec rbtrace -p $(pgrep -f 'worker 0') -e 'Benchmark.bmbm { |x| x.report { Gitlab::Memory::Jemalloc.dump_stats(path: "/tmp", format: :json) } }'
*** run `sudo sysctl kernel.msgmnb=1048576` to prevent losing events (currently: 16384 bytes)
*** attached to process 191
>> Benchmark.bmbm { |x| x.report { Gitlab::Memory::Jemalloc.dump_stats(path: "/tmp", format: :json) } }
=> [#<Benchmark::Tms:0x00007f88990a5450 @label="", @real=0.7750348010013113, @cstime=0.0, @cutime=0.0, @stime=0.025372000000000172, @utime=0.7467410000000001, @total=0.7721130000000003>]
*** detached from process 191
Logs:
web_1 | Rehearsal ------------------------------------
web_1 | 0.754279 0.023787 0.778066 ( 0.779778)
web_1 | --------------------------- total: 0.778066sec
web_1 |
web_1 | user system total real
web_1 | 0.750821 0.012818 0.763639 ( 0.766589)
Only in production will we be able to get meaningful data for this, but I think the ballpark here is "similar to a slow endpoint request".
Screenshots or screen recordings
Sample output (JSON): jemalloc_stats.json
To test this, libjemalloc must be on LD_PRELOAD.
Produce via:
[39] pry(main)> Gitlab::Memory::Jemalloc.dump_stats(path: '/tmp')
or via rbtrace for a Puma worker:
$ bundle exec rbtrace -p $(pgrep -f 'worker 0') -e 'pp Gitlab::Memory::Jemalloc.dump_stats(path: "/tmp")'
$ bundle exec rbtrace -p $(pgrep -f 'worker 0') -e 'pp Gitlab::Memory::Jemalloc.dump_stats(path: "/tmp", format: :text)'
-rw-r--r--. 1 git git 1.1M Jun 13 07:32 jemalloc_stats.191.1655105543.json
-rw-r--r--. 1 git git 372K Jun 13 07:33 jemalloc_stats.191.1655105601.txt
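The file names in the listing above appear to follow the pattern jemalloc_stats.<pid>.<unix timestamp>.<extension>. A minimal sketch of writing a report under that naming scheme might look like this; the helper name write_stats_report is hypothetical and the pattern is inferred from the two sample files only, so the actual implementation may differ.

```ruby
require 'tmpdir'

# Hypothetical helper, shown only to illustrate the naming scheme seen above.
# Writes `report` into `dir` and returns the full path of the new file.
def write_stats_report(report, dir:, extension:)
  # Process ID and Unix timestamp keep dumps from different workers and
  # different points in time apart.
  name = format('jemalloc_stats.%d.%d.%s', Process.pid, Time.now.to_i, extension)
  path = File.join(dir, name)
  File.write(path, report)
  path
end

# Usage: write a placeholder report into a throwaway directory.
Dir.mktmpdir do |dir|
  path = write_stats_report('{}', dir: dir, extension: 'json')
  puts File.basename(path)
end
```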
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #364346 (closed)