Statistics (StdDev, CfVar) present incorrect values
Describe the bug
Some statistics generated by hpcprof -M stats, specifically StdDev and CfVar, have incorrect values reported in the Viewer. This is after applying fixes on the hpcprof side to correct certain bad calculations that leaked through.
To Reproduce
Open the following database: hpctoolkit-x-database.zip
This data is collected from a specially crafted program (x.c) with known performance properties across 3 threads (main, body1, and body2). The database above was processed with the WIP fixes from hpctoolkit!1365 (merged).
Looking at the columns for standard deviation (StdDev) and coefficient of variation (CfVar), you will find six values, reproduced below:
| Scope | CPUTIME (s): StdDev (I) | CPUTIME (s): CfVar (I) |
|---|---|---|
| Experiment Aggregate Metrics | 2.47e+00 | 2.24e+00 |
| <thread root> | 2.82e+00 | 1.99e+00 |
| body2 | 3.02e+00 | 1.60e+00 |
Expected behavior
First, both columns should be zero/empty for body2. This function is called from a single thread, so the StdDev and CfVar across threads are exactly zero.
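As a minimal sketch of why a single-thread context should report zero, the population statistics of a one-element sample can be checked directly. This assumes the statistic is taken only over threads that actually reached the context; the sample value is a placeholder, since any single value gives the same result.

```python
import statistics

# Hypothetical single-thread sample for body2; the exact value is irrelevant,
# because one data point has no spread.
body2_samples = [1.8881890000000014]

stddev = statistics.pstdev(body2_samples)        # population standard deviation
cfvar = stddev / statistics.mean(body2_samples)  # coefficient of variation

print(stddev, cfvar)
```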
Second, the values for the aggregate/root and <thread root> are incorrect. Calculating the values using Python's standard library gives values several times smaller (roughly 4x to 6x) than those listed above:
>>> import statistics
>>> cputime_main = 0.4720730000000003
>>> cputime_body1 = 0.9445140000000003
>>> cputime_body2 = 1.8881890000000014
>>> cputime_thread_root = [cputime_body1, cputime_body2]
>>> cputime_aggregate = cputime_thread_root + [cputime_main]
>>> statistics.pstdev(cputime_aggregate) # == Aggregate StdDev
0.5886998414172262
>>> statistics.pstdev(cputime_aggregate) / statistics.mean(cputime_aggregate) # == Aggregate CfVar
0.5344082395453361
>>> statistics.pstdev(cputime_thread_root) # == <thread root> StdDev
0.4718375000000006
>>> statistics.pstdev(cputime_thread_root) / statistics.mean(cputime_thread_root) # == <thread root> CfVar
0.33313587764054353
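The size of the gap can be checked by dividing each value shown in the Viewer by the corresponding value computed above. This is only a sketch for comparison: the dictionary names are illustrative, and the reported values are the rounded figures from the table.

```python
# Values as displayed in the Viewer (rounded, taken from the table).
reported = {
    "aggregate StdDev": 2.47,
    "aggregate CfVar": 2.24,
    "thread root StdDev": 2.82,
    "thread root CfVar": 1.99,
}

# Values computed with the statistics module in the session above.
expected = {
    "aggregate StdDev": 0.5886998414172262,
    "aggregate CfVar": 0.5344082395453361,
    "thread root StdDev": 0.4718375000000006,
    "thread root CfVar": 0.33313587764054353,
}

ratios = {name: reported[name] / expected[name] for name in reported}
for name, ratio in ratios.items():
    print(f"{name}: reported is {ratio:.2f}x the expected value")
```

Both aggregate ratios come out close to each other, as do both <thread root> ratios, which may be a useful hint when looking for the root cause.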
Note that the summary values recorded in the profile.db are accurate for these contexts. The relevant details are below:
The raw values in the profile.db can be extracted using hpctesttool yaml from hpctoolkit/hpctoolkit>:
profile: !profile.db/v4
  profile_infos: !profile.db/v4/ProfileInfos
    profiles:
      - !profile.db/v4/Profile # {/} [is_summary]
        id_tuple:
        flags: !profile.db/v4/Profile.Flags [is_summary]
        values:
          0: # for <root>
            0: 3.0 # for sum / '1' / point #0
            1: 3.0 # for sum / '1' / function #1
            2: 3.0 # for sum / '1' / lex_aware #2
            3: 3.0 # for sum / '1' / execution #3
            7: 3.304776000000002 # for sum / '$$' / execution #7
            11: 4.680217313246006 # for sum / '($$^2)' / execution #11
            15: 0.4720730000000003 # for min / '$$' / execution #15
            19: 1.8881890000000014 # for max / '$$' / execution #19
          # ...snip...
          30: # for application thread (= application_thread) #30
            0: 2.0 # for sum / '1' / point #0
            1: 2.0 # for sum / '1' / function #1
            2: 2.0 # for sum / '1' / lex_aware #2
            3: 2.0 # for sum / '1' / execution #3
            7: 2.8327030000000017 # for sum / '$$' / execution #7
            11: 4.457364395917006 # for sum / '($$^2)' / execution #11
            15: 0.9445140000000003 # for min / '$$' / execution #15
            19: 1.8881890000000014 # for max / '$$' / execution #19
The accuracy of these values can be confirmed in Python:
>>> sum(cputime_aggregate) - 3.304776000000002 # for <root> for sum / '$$' / execution
0.0
>>> sum(cputime_thread_root) - 2.8327030000000017 # for thread for sum / '$$' / execution
0.0
>>> sum(x**2 for x in cputime_aggregate) - 4.680217313246006 # for <root> for sum / '($$^2)' / execution
8.881784197001252e-16
>>> sum(x**2 for x in cputime_thread_root) - 4.457364395917006 # for thread for sum / '($$^2)' / execution
0.0
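For reference, the expected StdDev and CfVar can also be recovered from the recorded accumulators alone (count, sum, and sum of squares), using the standard population-variance identity Var = E[x^2] - E[x]^2. This is a sketch under that assumption; it is not a claim about which formula hpcprof or the Viewer actually applies, and `stats_from_sums` is a hypothetical helper name.

```python
import math

def stats_from_sums(count, total, total_sq):
    """Return (StdDev, CfVar) from the accumulators sum('1'), sum('$$'), sum('($$^2)')."""
    mean = total / count
    # Clamp at zero to guard against tiny negative results from rounding.
    variance = max(total_sq / count - mean * mean, 0.0)
    stddev = math.sqrt(variance)
    return stddev, stddev / mean

# Accumulator values copied from the profile.db dump above.
root_stddev, root_cfvar = stats_from_sums(3.0, 3.304776000000002, 4.680217313246006)
thread_stddev, thread_cfvar = stats_from_sums(2.0, 2.8327030000000017, 4.457364395917006)

print(root_stddev, root_cfvar)
print(thread_stddev, thread_cfvar)
```

Up to floating-point rounding, these agree with the `statistics.pstdev` results computed earlier, which suggests the data needed to derive the correct values is present in the database.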
Platform (please complete the following information):
- OS: Linux
- Architecture: x86-64
- Version: main (hpcviewer@1d4c4313)
Additional Information
I'm not sure if the root cause is in the Viewer or hpctoolkit/database>; feel free to move this issue as needed.