Vendor cloud_profiler_agent gem
What does this MR do and why?
Vendors https://github.com/remind101/ruby-cloud-profiler
I am following suggestions from https://docs.gitlab.com/ee/development/gemfile.html#gitlab-created-gems
Related to #393407 (closed)
Our goals with this MR and further:
- enable Google Cloud Profiling for gstg by merging this MR and setting ENV vars: MR
- observe the data it will produce: is it readable? does it look trustworthy?
- observe if there are any potential issues, check the telemetry
- analysis of the gstg-web profiles, discussion with the team
- from this point, we should decide if we want similar for gstg-sidekiq (will need SRE support - see John's comment)
- ..or enabling for gprd-web
- ..or refactoring the code (we mostly vendored it as-is, with some in-place fixes), and potentially extracting it (MR to upstream? our own gem? inject into labkit? keep vendored?)
Notes
Why pipeline:skip-undercoverage: !111142 (comment 1297511692)
TODOs
- OK: We need to audit the gem's dependencies for Ruby 3 compatibility
- OK: Security review: https://gitlab.com/gitlab-com/gl-security/appsec/appsec-reviews/-/issues/191
- OK (we will go with web for now): SRE request to obtain service accounts: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/17492
- (After merging this MR) k8s-workloads MR to enable ENV vars on gstg: gitlab-com/gl-infra/k8s-workloads/gitlab-com!2597 (merged)
What is changed from the original source
- The gem required 'googleauth', '~> 0.14' (source), while we have 1.3 in GitLab. I updated the gemspec in the vendored gem; it seems to work fine with the updated version.
- I added some simple logging in a separate commit. It could be useful to explore during our testing on gstg/cny. You can follow it with tail -f log/application_json.log. In dev, it could look like this:
{"severity":"INFO","time":"2023-02-07T07:56:50.220Z","correlation_id":null,"gcp_ruby_status":"profile resource created","duration_s":34.08065799999895,"cpu_s":0.028377624999999997,"message":"Google Cloud Profiler Ruby","pid":89847,"worker_id":"puma_1"}
{"severity":"INFO","time":"2023-02-07T07:57:00.223Z","correlation_id":null,"gcp_ruby_status":"stackprof run finished","duration_s":10.002543000002333,"cpu_s":0.0006010000000000043,"message":"Google Cloud Profiler Ruby","pid":89847,"worker_id":"puma_1"}
{"severity":"INFO","time":"2023-02-07T07:57:00.329Z","correlation_id":null,"gcp_ruby_status":"stackprof to pprof converted","duration_s":0.10562799999752315,"cpu_s":0.10315308399999999,"message":"Google Cloud Profiler Ruby","pid":89847,"worker_id":"puma_1"}
{"severity":"INFO","time":"2023-02-07T07:57:00.960Z","correlation_id":null,"gcp_ruby_status":"profile resource updated","duration_s":0.630488000002515,"cpu_s":0.014009458000000002,"message":"Google Cloud Profiler Ruby","pid":89847,"worker_id":"puma_1"}
{"severity":"INFO","time":"2023-02-07T07:57:12.380Z","correlation_id":null,"gcp_ruby_status":"profile resource created","duration_s":56.24131099999795,"cpu_s":0.033467250000000004,"message":"Google Cloud Profiler Ruby","pid":89846,"worker_id":"puma_0"}
{"severity":"INFO","time":"2023-02-07T07:57:22.387Z","correlation_id":null,"gcp_ruby_status":"stackprof run finished","duration_s":10.006690000001981,"cpu_s":0.0015724170000000065,"message":"Google Cloud Profiler Ruby","pid":89846,"worker_id":"puma_0"}
{"severity":"INFO","time":"2023-02-07T07:57:22.489Z","correlation_id":null,"gcp_ruby_status":"stackprof to pprof converted","duration_s":0.10262900000088848,"cpu_s":0.10158741699999999,"message":"Google Cloud Profiler Ruby","pid":89846,"worker_id":"puma_0"}
{"severity":"INFO","time":"2023-02-07T07:57:23.057Z","correlation_id":null,"gcp_ruby_status":"profile resource updated","duration_s":0.5671220000003814,"cpu_s":0.011035624999999993,"message":"Google Cloud Profiler Ruby","pid":89846,"worker_id":"puma_0"}
- Default profiler frequency: comment - done: commit
- Some methods and variables were renamed to improve readability
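The log lines shown above can be hard to read one by one during testing. As a minimal sketch (not part of this MR), the snippet below groups them by gcp_ruby_status and sums duration_s and cpu_s. The field names and the message value are taken from the sample output above; the method name summarize_profiler_log is hypothetical.

```ruby
require 'json'

# Value of the "message" field in the sample log lines above.
PROFILER_MESSAGE = 'Google Cloud Profiler Ruby'

# Hypothetical helper: takes an enumerable of JSON log lines and returns
# a hash keyed by gcp_ruby_status with a count and summed timings.
def summarize_profiler_log(lines)
  stats = Hash.new { |hash, key| hash[key] = { count: 0, duration_s: 0.0, cpu_s: 0.0 } }

  lines.each do |line|
    entry = begin
      JSON.parse(line)
    rescue JSON::ParserError
      next # skip lines that are not valid JSON
    end
    # Ignore log entries that do not come from the profiler.
    next unless entry.is_a?(Hash) && entry['message'] == PROFILER_MESSAGE

    bucket = stats[entry['gcp_ruby_status']]
    bucket[:count] += 1
    bucket[:duration_s] += entry['duration_s'].to_f
    bucket[:cpu_s] += entry['cpu_s'].to_f
  end

  stats
end
```

For example, summarize_profiler_log(File.foreach('log/application_json.log')) would process the local log file line by line without loading it all into memory.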
How to set up and validate locally
- Make sure you have a project in Sandbox Cloud Realm
- Follow https://github.com/googleapis/google-cloud-ruby/tree/main/google-cloud-profiler-v2#before-you-begin; you need to enable the API
- Follow the auth instructions in https://github.com/googleapis/google-cloud-ruby/blob/main/google-cloud-profiler-v2/AUTHENTICATION.md; you'll need a JSON key, and the PROFILER_CREDENTIALS ENV var pointing to this file
- Set GITLAB_GOOGLE_CLOUD_PROFILER_ENABLED to true
- Set the GCP_PROFILER_PROJECT_ID ENV var to the name of your Google Cloud Platform project
- Restart the server: gdk restart rails-web
- After some time (you could give it 10 minutes to be sure), search for "Profiler" in the search bar of your GCP console. It should suggest the Profiler service page; open it
- Make sure you see the flame graphs ingested from your local instance there
- If you need to repeat the experiment, it is suggested that you replace the service: key with something different, so you can easily verify that you are receiving profiles from your most recent run (otherwise you need to remember the number of profiles sent and make sure it increased). You can switch between "services" profiles via the GCP Profiler UI dropdown. You may need to refresh the page or press the "NOW" button in the time filter to make the new service appear.
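Before restarting the server, the ENV setup from the steps above can be sanity-checked. The following is a small sketch, not code from this MR: it only verifies that the three variables named above are set and that PROFILER_CREDENTIALS points to a readable JSON file. The method name profiler_env_problems is hypothetical, and the check does not validate the key against GCP.

```ruby
require 'json'

# ENV vars named in the setup steps above.
REQUIRED_VARS = %w[
  GITLAB_GOOGLE_CLOUD_PROFILER_ENABLED
  GCP_PROFILER_PROJECT_ID
  PROFILER_CREDENTIALS
].freeze

# Hypothetical helper: returns a list of human-readable problems,
# or an empty array when the local setup looks complete.
def profiler_env_problems(env = ENV)
  problems = REQUIRED_VARS
    .select { |name| env[name].to_s.strip.empty? }
    .map { |name| "#{name} is not set" }

  path = env['PROFILER_CREDENTIALS'].to_s
  unless path.empty?
    if !File.file?(path)
      problems << "PROFILER_CREDENTIALS points to a missing file: #{path}"
    else
      begin
        # Only checks that the key file parses as JSON, not that it is valid for GCP.
        JSON.parse(File.read(path))
      rescue JSON::ParserError
        problems << "PROFILER_CREDENTIALS file is not valid JSON: #{path}"
      end
    end
  end

  problems
end
```

Calling puts profiler_env_problems in a Rails console or a plain Ruby session prints nothing when all three variables are in place.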
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Edited by Aleksei Lipniagov