
wperf: support --timeout in sampling

This patch adds support for --timeout for sampling. See https://linaro.atlassian.net/browse/WPERF-266.

This simple solution adds --timeout support while keeping an interval of about 1 second between pmu_device::get_sample() calls. In some corner cases (where the timeout is not a whole number of seconds) the algorithm fetches samples more often than once per second, but never more than 2 samples per second. This should not choke sample fetching from the kernel driver. Note also that 2 samples/sec may be fast enough to capture samples that might otherwise be dropped.
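The interval splitting described above can be modelled roughly as follows. This is only an illustrative Python sketch, not the wperf C++ code: the helper name split_intervals and its max_interval_sec parameter are hypothetical, and the assumption that one get_sample() fetch follows each wait is inferred from this description.

```python
def split_intervals(timeout_sec, max_interval_sec=1.0):
    """Split a timeout into waits of at most max_interval_sec each.

    Hypothetical model: one get_sample() fetch is assumed to follow
    each wait, so len(result) is the number of fetches.
    """
    if timeout_sec <= 0:
        raise ValueError("timeout must be positive")
    intervals = []
    remaining = timeout_sec
    while remaining > 0:
        # Wait a full interval, or whatever is left of the timeout.
        wait = min(max_interval_sec, remaining)
        intervals.append(wait)
        remaining -= wait
    return intervals


if __name__ == "__main__":
    for timeout in (0.5, 1.0, 3.5):
        waits = split_intervals(timeout)
        print(f"--timeout {timeout}: {len(waits)} fetches, waits = {waits}")
```

Under this model a fractional timeout ends with one shorter wait, which is why the fetch rate rises above 1 per second in those cases.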

Sampling time with timeout enabled:

The list below maps each --timeout value to the number of get_sample() fetches from the driver and the resulting fetch rate. We want to stay at around 1 sample / second for now (the setting before this patch).

0.5 sec == get_sample() counts = 1, this gives 2.0 [samples/sec]
0.9 sec == get_sample() counts = 1, this gives 1.1 [samples/sec]
1.0 sec == get_sample() counts = 1, this gives 1.0 [samples/sec]
1.1 sec == get_sample() counts = 2, this gives 1.8 [samples/sec]
1.9 sec == get_sample() counts = 2, this gives 1.1 [samples/sec]
2.0 sec == get_sample() counts = 2, this gives 1.0 [samples/sec]
2.1 sec == get_sample() counts = 3, this gives 1.4 [samples/sec]
2.5 sec == get_sample() counts = 3, this gives 1.2 [samples/sec]
2.9 sec == get_sample() counts = 3, this gives 1.0 [samples/sec]
3.0 sec == get_sample() counts = 3, this gives 1.0 [samples/sec]
3.1 sec == get_sample() counts = 4, this gives 1.3 [samples/sec]
3.7 sec == get_sample() counts = 4, this gives 1.1 [samples/sec]
3.9 sec == get_sample() counts = 4, this gives 1.0 [samples/sec]
4.0 sec == get_sample() counts = 4, this gives 1.0 [samples/sec]
4.1 sec == get_sample() counts = 5, this gives 1.2 [samples/sec]
4.3 sec == get_sample() counts = 5, this gives 1.2 [samples/sec]
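The numbers above are consistent with a fetch count equal to the timeout rounded up to whole seconds, and a rate of count / timeout. The short Python check below reproduces them. The ceil rule is inferred from the listed values, not taken from the patch, and the function names are hypothetical.

```python
import math

# (timeout in seconds, get_sample() count, samples/sec) copied from the list above.
TABLE = [
    (0.5, 1, 2.0), (0.9, 1, 1.1), (1.0, 1, 1.0), (1.1, 2, 1.8),
    (1.9, 2, 1.1), (2.0, 2, 1.0), (2.1, 3, 1.4), (2.5, 3, 1.2),
    (2.9, 3, 1.0), (3.0, 3, 1.0), (3.1, 4, 1.3), (3.7, 4, 1.1),
    (3.9, 4, 1.0), (4.0, 4, 1.0), (4.1, 5, 1.2), (4.3, 5, 1.2),
]


def fetch_count(timeout_sec):
    """Assumed rule: one fetch per started second of the timeout."""
    return math.ceil(timeout_sec)


def fetch_rate(timeout_sec):
    """Fetches per second over the whole timeout."""
    return fetch_count(timeout_sec) / timeout_sec


if __name__ == "__main__":
    for timeout, count, rate in TABLE:
        ok = fetch_count(timeout) == count and round(fetch_rate(timeout), 1) == rate
        print(f"{timeout:.1f} sec: count={fetch_count(timeout)}, "
              f"rate={fetch_rate(timeout):.1f} [{'match' if ok else 'MISMATCH'}]")
```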

Testing

Tested by sampling CPython while it evaluated 10**10**100 at the >>> prompt:

>wperf sample -pe_file python_d.exe -e ld_spec:10000 -c 1 --timeout 3.5
base address of 'python_d.exe': 0x7ff6ba7b1270, runtime delta: 0x7ff57a7b0000
sampling ....e.. done!
======================== sample source: ld_spec, top 50 hot functions ========================
        overhead  count  symbol
        ========  =====  ======
           67.45    259  x_mul:python312_d.dll
           14.84     57  v_isub:python312_d.dll
            4.17     16  x_add:python312_d.dll
            4.17     16  _Py_atomic_load_32bit_impl:python312_d.dll
            3.12     12  v_iadd:python312_d.dll
            2.60     10  PyErr_CheckSignals:python312_d.dll
            0.78      3  unknown
            0.52      2  _Py_atomic_load_64bit_impl:python312_d.dll
            0.52      2  PyGILState_Check:python312_d.dll
            0.52      2  _PyMem_DebugCheckAddress:python312_d.dll
            0.26      1  write_size_t:python312_d.dll
            0.26      1  _PyErr_CheckSignalsTstate:python312_d.dll
            0.26      1  k_mul:python312_d.dll
            0.26      1  read_size_t:python312_d.dll
            0.26      1  _Py_NewReference:python312_d.dll
100.00%       384  top 15 in total
Edited by Przemyslaw Wirkus
