Measure per-request/per-worker memory allocation statistics
Problem
Currently, we scrape the number of memory allocations for all work being executed. This does not provide the granularity needed to understand per-context memory allocations (per-request or per-worker execution).
This is because globally sampled stats are affected by every thread that is executing.
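The effect can be reproduced on stock Ruby. The following is a minimal sketch, not the measurement code we use: it samples the process-wide `GC.stat(:total_allocated_objects)` counter around a block that only sleeps, first alone and then while an unrelated thread allocates in the background. The helper name `allocated_objects_during` is made up for this illustration.

```ruby
# Count objects allocated process-wide while the block runs.
# GC.stat is global, so allocations from every thread are included.
def allocated_objects_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Baseline: the measured block only sleeps, so it allocates next to nothing.
quiet_delta = allocated_objects_during { sleep 0.05 }

# Same block, but an unrelated thread allocates in the background.
stop = false
noise = Thread.new { Object.new until stop }
noisy_delta = allocated_objects_during { sleep 0.05 }
stop = true
noise.join

puts "quiet: #{quiet_delta}, with background thread: #{noisy_delta}"
```

The second number is attributed to the measured block even though the block itself did no extra work, which is why a global counter cannot answer per-request questions in a multi-threaded server.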
Idea
We should have a predictable way to see how much memory is allocated by each request/worker during its execution, in order to understand:
- GC pressure from heap slot allocations
- the number of malloc calls (like for Strings)
- the size of malloc calls (like processing large blobs of data)
- allocations per second of execution
- allocation size per second of execution
- (maybe) histogram of memory allocations
Requirements
- be thread-safe and measure only within the context of a given execution
- log all counters so that they can be easily scraped using ELK
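As an illustration of the logging requirement, here is a small sketch of how raw counters and per-second rates could be flattened into a single structured log entry. The field names (`mem_objects`, `mem_mallocs`, `mem_bytes`, `feature_category`, `path`), the helper `allocation_log_entry`, and all values are hypothetical and chosen only for this example.

```ruby
require "json"

# Build one flat log entry from per-execution counters.
# counters:   Hash of counter name => value collected for one request/worker
# duration_s: execution time in seconds
# context:    Hash of identifying fields (hypothetical names)
def allocation_log_entry(counters, duration_s, context)
  rates = counters.to_h do |name, value|
    rate = duration_s.positive? ? (value / duration_s).round(2) : nil
    [:"#{name}_per_s", rate]
  end

  context.merge(counters).merge(rates).merge(duration_s: duration_s)
end

entry = allocation_log_entry(
  { mem_objects: 12_000, mem_mallocs: 300, mem_bytes: 2_400_000 },
  0.25,
  { path: "/api/v4/projects", feature_category: "projects" }
)

# One JSON document per line is straightforward to ingest into ELK.
puts JSON.generate(entry)
```

Keeping the entry flat (no nested hashes) means each counter becomes its own field, which keeps aggregation per request or per feature category simple.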
Solution
- Extend the Ruby VM, and try to upstream the change, to provide the ability to measure allocations done in a given thread.
- Expose these counters as part of our logs so that they can be scraped.
- Analyze the data.
- Provide dashboards that take into account per-request and per-feature_category information about allocations per unit of execution or per unit of time.
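The first two steps could look roughly like the sketch below. It assumes that the patched Ruby exposes `Thread.trace_memory_allocations=` to enable tracing and `Thread#memory_allocations` returning a Hash of per-thread counters. This text does not spell out the API, so treat both names, and the module `ThreadAllocations`, as assumptions. On an unpatched Ruby the sketch falls back to returning no measurement.

```ruby
# Assumption: a patched Ruby provides Thread.trace_memory_allocations= and
# Thread#memory_allocations (a Hash of counters for the current thread).
# Stock Ruby has neither, so we detect them before use.
module ThreadAllocations
  def self.available?
    Thread.respond_to?(:trace_memory_allocations=) &&
      Thread.current.respond_to?(:memory_allocations)
  end

  # Runs the block and returns [block_result, counter_delta].
  # counter_delta is nil when per-thread counters are not available.
  def self.measure
    return [yield, nil] unless available?

    Thread.trace_memory_allocations = true
    before = Thread.current.memory_allocations
    result = yield
    after = Thread.current.memory_allocations
    return [result, nil] unless before && after

    delta = after.to_h { |key, value| [key, value - before.fetch(key, 0)] }
    [result, delta]
  end
end

result, delta = ThreadAllocations.measure { Array.new(3) { |i| "item-#{i}" } }
puts "result=#{result.inspect} allocations=#{delta.inspect}"
```

Because the counters are read from the current thread only, work done by other threads in the same process does not leak into the measurement, which is the property the global stats lack.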
What needs to be done?
1. Harder and longer
- Get the Ruby VM patch merged upstream and wait for us to update to Ruby 3.1 or 3.2
- Update GitLab Rails to use the patch and log data
2. Likely easier, but requires updating our components (chosen path)
- Backport the upstream patch to Ruby 2.7: Done here https://github.com/ayufan-research/ruby/tree/thread-memory-stat-2.7
- Patch Ruby shipped with Omnibus: omnibus-gitlab!4948 (merged)
- Patch Ruby shipped with CNG: gitlab-org/build/CNG!591 (merged)
- Patch Ruby used by GitLab CI for testing: gitlab-build-images!355 (merged)
- Patch Ruby used by GCK: gitlab-compose-kit!149 (merged) (maybe this is not needed now if we have Ruby for tests updated, and Ruby changes validated by CI)
- Patch Ruby used by GDK: gitlab-development-kit!1812 (merged) (maybe this is not needed now if we have Ruby for tests updated, and Ruby changes validated by CI)
- Update GitLab CI testing of GitLab to use the patched Ruby VM: !53226 (merged)
- Update GitLab Rails to use the patch and log data: !52306 (merged)
Summary
I decided to take the road of manually patching our stack for the time being, while hopefully getting this merged upstream. Assuming that it gets merged and we update to Ruby 3.1, we would have this supported out of the box and could just drop the patch.
Edited by Kamil Trzciński