Fix flamegraph capture tools for GKE nodes
The flamegraph generation step now stalls in the perf profiling scripts documented in https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/kube/k8s-adhoc-observability.md. This makes the tool harder to use, because the user must run the last step manually. Find the cause and fix it.
Example of the problem:
msmiley@gke-gprd-us-east1-c-generic-1-d5dc3c5d-fqb2 ~/runbooks/scripts/gke $ bash perf_flamegraph_for_container_of_pid.sh 2872357
...
Container msmiley-gcr.iocos-cloudtoolbox-v20220722 exited successfully.
Target PID 2872357 belongs to cgroup: /kubepods/burstable/pod430eb98a-8da4-4a68-8aa1-9376d13fd3f0/670bf353570dbb92be7191812985ec32f8709a9e5445ce2cd6662befb1ef8462
Starting capture for 60 seconds.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.037 MB perf.data (13494 samples) ]
Spawning container msmiley-gcr.iocos-cloudtoolbox-v20220722 on /var/lib/toolbox/msmiley-gcr.io_cos-cloud_toolbox-v20220722.
Press ^] three times within 1s to kill container.
^C
Container msmiley-gcr.iocos-cloudtoolbox-v20220722 terminated by signal KILL.
The profiling completes, but the flamegraph generation does not. The perf-script output is written, yet the resulting SVG file is only 1 byte:
msmiley@gke-gprd-us-east1-c-generic-1-d5dc3c5d-fqb2 ~/runbooks/scripts/gke $ ls -l /tmp/perf-record-results.sq7U5opK/
total 2544
-rw-r--r-- 1 msmiley msmiley 1 Nov 30 22:20 gke-gprd-us-east1-c-generic-1-d5dc3c5d-fqb2.20221130_221218_UTC.container_of_pid_2872357.flamegraph.svg
-rw-r--r-- 1 msmiley msmiley 439979 Nov 30 22:13 gke-gprd-us-east1-c-generic-1-d5dc3c5d-fqb2.20221130_221218_UTC.container_of_pid_2872357.perf-script.txt.gz
-rw------- 1 root root 2157768 Nov 30 22:13 perf.data
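Until the script is fixed, the listing above suggests a simple check: a flamegraph SVG of only a few bytes was not rendered. The following is a minimal sketch of that check plus a manual rebuild, not the script's actual logic. It assumes `stackcollapse-perf.pl` and `flamegraph.pl` from the FlameGraph repository are on `PATH`. The function names and the 100-byte threshold are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the size threshold are illustrative assumptions.

# Succeeds (exit 0) if the SVG is missing or smaller than min_bytes.
svg_looks_truncated() {
  local svg="$1" min_bytes="${2:-100}"
  [[ ! -f "$svg" ]] && return 0
  local size
  size=$(wc -c < "$svg")
  (( size < min_bytes ))
}

# Rebuilds the SVG from the gzipped perf-script output.
# Assumes stackcollapse-perf.pl and flamegraph.pl (FlameGraph repo) are on PATH.
regenerate_flamegraph() {
  local perf_script_gz="$1" svg="$2"
  gunzip -c "$perf_script_gz" | stackcollapse-perf.pl | flamegraph.pl > "$svg"
}
```

With the results directory shown above, one would call `svg_looks_truncated` on the `.flamegraph.svg` file and, if it succeeds, pass the matching `.perf-script.txt.gz` and the SVG path to `regenerate_flamegraph`.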
Edited by Matt Smiley