This patch adds support for --timeout for sampling. See https://linaro.atlassian.net/browse/WPERF-266.
This simple solution adds that support while keeping an interval of around 1 second between successive pmu_device::get_sample() calls.
Note that in some corner cases (where the timeout duration is not a whole number of seconds) this algorithm may fetch samples slightly more often, as listed below, but never more than 2 samples per second. This should not choke sample fetching from the kernel driver.
Please also note that 2 samples/sec may be fast enough to capture samples that might otherwise be dropped.
The list below shows each --timeout value against how many times we will fetch samples from the driver. We want to keep it around 1 sample / second for now (the setting before this patch).
0.5 sec == get_sample() counts = 1, this gives 2.0 [samples/sec]
0.9 sec == get_sample() counts = 1, this gives 1.1 [samples/sec]
1.0 sec == get_sample() counts = 1, this gives 1.0 [samples/sec]
1.1 sec == get_sample() counts = 2, this gives 1.8 [samples/sec]
1.9 sec == get_sample() counts = 2, this gives 1.1 [samples/sec]
2.0 sec == get_sample() counts = 2, this gives 1.0 [samples/sec]
2.1 sec == get_sample() counts = 3, this gives 1.4 [samples/sec]
2.5 sec == get_sample() counts = 3, this gives 1.2 [samples/sec]
2.9 sec == get_sample() counts = 3, this gives 1.0 [samples/sec]
3.0 sec == get_sample() counts = 3, this gives 1.0 [samples/sec]
3.1 sec == get_sample() counts = 4, this gives 1.3 [samples/sec]
3.7 sec == get_sample() counts = 4, this gives 1.1 [samples/sec]
3.9 sec == get_sample() counts = 4, this gives 1.0 [samples/sec]
4.0 sec == get_sample() counts = 4, this gives 1.0 [samples/sec]
4.1 sec == get_sample() counts = 5, this gives 1.2 [samples/sec]
4.3 sec == get_sample() counts = 5, this gives 1.2 [samples/sec]
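The rows above are consistent with one fetch per started second, i.e. the fetch count equals the timeout rounded up. The following is a minimal sketch of that arithmetic only, not the code from this patch; fetch_count() and fetch_rate() are hypothetical helper names introduced for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not taken from the patch): number of
// pmu_device::get_sample() fetches for a given --timeout value,
// ASSUMING one fetch per started second (timeout rounded up).
int fetch_count(double timeout_sec)
{
    assert(timeout_sec > 0.0);  // a non-positive timeout makes no sense here
    return static_cast<int>(std::ceil(timeout_sec));
}

// Average fetch rate in fetches per second under the same assumption.
double fetch_rate(double timeout_sec)
{
    return fetch_count(timeout_sec) / timeout_sec;
}
```

Under this assumption, fetch_count(3.5) is 4 for the test run below, and the rate stays at or below 2 per second for every timeout of 0.5 sec and above, which is the range listed.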
Tested by sampling CPython while it runs the >>> 10**10**100 calculation:
>wperf sample -pe_file python_d.exe -e ld_spec:10000 -c 1 --timeout 3.5
base address of 'python_d.exe': 0x7ff6ba7b1270, runtime delta: 0x7ff57a7b0000
sampling ....e.. done!
======================== sample source: ld_spec, top 50 hot functions ========================
overhead count symbol
======== ===== ======
67.45 259 x_mul:python312_d.dll
14.84 57 v_isub:python312_d.dll
4.17 16 x_add:python312_d.dll
4.17 16 _Py_atomic_load_32bit_impl:python312_d.dll
3.12 12 v_iadd:python312_d.dll
2.60 10 PyErr_CheckSignals:python312_d.dll
0.78 3 unknown
0.52 2 _Py_atomic_load_64bit_impl:python312_d.dll
0.52 2 PyGILState_Check:python312_d.dll
0.52 2 _PyMem_DebugCheckAddress:python312_d.dll
0.26 1 write_size_t:python312_d.dll
0.26 1 _PyErr_CheckSignalsTstate:python312_d.dll
0.26 1 k_mul:python312_d.dll
0.26 1 read_size_t:python312_d.dll
0.26 1 _Py_NewReference:python312_d.dll
100.00% 384 top 15 in total