mdrun writes broken energy group values to .edr file - Redmine #1822
With 2 energy groups, mdrun -nb cpu and mdrun -nb gpu write .edr files such that gmxcheck -e cpu-run -e2 gpu-run gives
There are 39 terms to compare in the energy files
Coulomb (SR) step 0: -11637.2, step 0: -15106.6
Potential step 0: -6509.07, step 0: -9978.4
Total Energy step 0: -6501.01, step 0: -9970.33
Coul-SR:URE-URE step 0: 9361.08, step 0: -15106.6
LJ-SR:URE-URE step 0: 336.99, step 0: 3585.22
Coul-SR:URE-SOL step 0: -3929.27, step 0: 0
LJ-SR:URE-SOL step 0: -168.613, step 0: 0
Coul-SR:SOL-SOL step 0: -17069, step 0: 0
LJ-SR:SOL-SOL step 0: 3416.84, step 0: 0
Coulomb (SR) step 1: -11657.9, step 1: -15127.2
Even with a single energy group, I get
Coulomb (SR) step 0: -11637.2, step 0: -15106.6
Potential step 0: -6509.07, step 0: -9978.41
Total Energy step 0: -6501.01, step 0: -9970.34
Coul-SR:URE-URE step 0: 9361.08, step 0: -15106.6
LJ-SR:URE-URE step 0: 336.99, step 0: 3585.22
Coul-SR:URE-rest step 0: -3929.27, step 0: 0
LJ-SR:URE-rest step 0: -168.613, step 0: 0
Coul-SR:rest-rest step 0: -17069, step 0: 0
LJ-SR:rest-rest step 0: 3416.84, step 0: 0
Coulomb (SR) step 1: -11657.9, step 1: -15127.2
Tarball with repro materials and output attached.
The GPU .log file does say “NOTE: With GPUs, reporting energy group contributions is not supported”. (In #1293 it was suggested we move/add such a comment near the end of the .log file. The reporter of #1727 also misunderstood how to use the code.)
Since energy groups are not supported on GPUs, we should not write an .edr file with energy groups, so that users cannot erroneously use the incorrect data they contain. If we’re unwilling to do that, then perhaps energy-analysis tools should have a check for “all the fields zero except the first”.
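A minimal sketch of what such a check could look like, assuming the tool already has the step-0 terms as (name, value) pairs. The function name `suspicious_group_terms` and the list-of-pairs input are hypothetical and are not part of any GROMACS tool; the numbers are copied from the gmxcheck output above, where the GPU file has non-zero values only for the first pair (URE-URE) of each interaction type.

```python
from collections import defaultdict


def suspicious_group_terms(terms, tol=0.0):
    """Return interaction types whose group-pair terms are all zero except the first.

    `terms` is an ordered list of (name, value) pairs with names such as
    "Coul-SR:URE-SOL"; the part before the colon is the interaction type.
    This helper is hypothetical, not an existing GROMACS function.
    """
    by_kind = defaultdict(list)
    for name, value in terms:
        kind, sep, _pair = name.partition(":")
        if sep:  # skip totals such as "Potential" that carry no group pair
            by_kind[kind].append(value)

    flagged = []
    for kind, values in by_kind.items():
        first_nonzero = abs(values[0]) > tol
        rest_zero = all(abs(v) <= tol for v in values[1:])
        if len(values) > 1 and first_nonzero and rest_zero:
            flagged.append(kind)
    return flagged


# Step-0 values from the two-group comparison above.
cpu_step0 = [
    ("Coulomb (SR)", -11637.2),
    ("Coul-SR:URE-URE", 9361.08),
    ("LJ-SR:URE-URE", 336.99),
    ("Coul-SR:URE-SOL", -3929.27),
    ("LJ-SR:URE-SOL", -168.613),
    ("Coul-SR:SOL-SOL", -17069.0),
    ("LJ-SR:SOL-SOL", 3416.84),
]
gpu_step0 = [
    ("Coulomb (SR)", -15106.6),
    ("Coul-SR:URE-URE", -15106.6),
    ("LJ-SR:URE-URE", 3585.22),
    ("Coul-SR:URE-SOL", 0.0),
    ("LJ-SR:URE-SOL", 0.0),
    ("Coul-SR:SOL-SOL", 0.0),
    ("LJ-SR:SOL-SOL", 0.0),
]

print(suspicious_group_terms(cpu_step0))
print(suspicious_group_terms(gpu_step0))
```

With the data above, the CPU file is not flagged and the GPU file is flagged for both Coul-SR and LJ-SR. A real check would probably need to look at more than one frame, since a legitimately non-interacting group pair could also give zeros.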
Frankly, there’s something to be said for only writing group-wise contributions during a rerun. (Our code is likely not agile enough to be able to call the energy-group kernels only on energy-output steps, so even on the CPU the small overhead of energy groups is being paid every MD step.) This would be slightly easier to do once we’ve removed the group scheme.
(from redmine: issue id 1822, created on 2015-09-15 by mark.j.abraham, closed on 2018-01-03)
- Relations:
- relates #1293 (closed)
- relates #1727 (closed)
- Changesets:
- Revision 4a4dc78e by Erik Lindahl on 2018-01-03T12:26:05Z:
Don't allow multiple energy groups for GPU runs
Exit with a fatal error instead of only warning, since the
latter leads to writing data for energy groups that
is incorrect to the energy file.
Fixes #1822.
Change-Id: I34ccb10bba6d6e1350283e34ebc908c6f830baab