mdrun writes broken energy group values to .edr file - Redmine #1822
With 2 energy groups, `mdrun -nb cpu` and `mdrun -nb gpu` write .edr files such that `gmxcheck -e cpu-run -e2 gpu-run` gives

    There are 39 terms to compare in the energy files
    Coulomb (SR) step 0: -11637.2, step 0: -15106.6
    Potential step 0: -6509.07, step 0: -9978.4
    Total Energy step 0: -6501.01, step 0: -9970.33
    Coul-SR:URE-URE step 0: 9361.08, step 0: -15106.6
    LJ-SR:URE-URE step 0: 336.99, step 0: 3585.22
    Coul-SR:URE-SOL step 0: -3929.27, step 0: 0
    LJ-SR:URE-SOL step 0: -168.613, step 0: 0
    Coul-SR:SOL-SOL step 0: -17069, step 0: 0
    LJ-SR:SOL-SOL step 0: 3416.84, step 0: 0
    Coulomb (SR) step 1: -11657.9, step 1: -15127.2

Even with a single energy group, I get

    Coulomb (SR) step 0: -11637.2, step 0: -15106.6
    Potential step 0: -6509.07, step 0: -9978.41
    Total Energy step 0: -6501.01, step 0: -9970.34
    Coul-SR:URE-URE step 0: 9361.08, step 0: -15106.6
    LJ-SR:URE-URE step 0: 336.99, step 0: 3585.22
    Coul-SR:URE-rest step 0: -3929.27, step 0: 0
    LJ-SR:URE-rest step 0: -168.613, step 0: 0
    Coul-SR:rest-rest step 0: -17069, step 0: 0
    LJ-SR:rest-rest step 0: 3416.84, step 0: 0
    Coulomb (SR) step 1: -11657.9, step 1: -15127.2

Tarball with repro materials and output attached.

The GPU .log file does say “NOTE: With GPUs, reporting energy group contributions is not supported”. (In #1293 it was suggested we move/add such a comment near the end of the .log file. #1727 also misunderstood how to use the code.)

Since energy groups are not supported on GPUs, we should not write an .edr file with energy groups, so that users cannot erroneously use the incorrect data they contain. If we’re unwilling to do that, then perhaps energy-analysis tools should have a check for “all the fields zero except the first”. Frankly, there’s something to be said for only writing group-wise contributions during a rerun.
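For illustration, the “all the fields zero except the first” check could look roughly like the sketch below. It is not GROMACS code: the function name `suspicious_group_terms` is hypothetical, and it assumes the group-wise terms of one frame have already been extracted into a name-to-value mapping in energy-file order. The values are the step 0 numbers quoted above.

```python
from collections import defaultdict


def suspicious_group_terms(terms):
    """Return interaction types whose group-pair terms are all zero except the first.

    `terms` (an assumed input format) maps energy-term names such as
    "Coul-SR:URE-SOL" to the value from one frame, in energy-file order.
    """
    by_type = defaultdict(list)
    for name, value in terms.items():
        # Group-pair terms look like "<type>:<groupA>-<groupB>"; totals such
        # as "Potential" or "Coulomb (SR)" contain no colon and are skipped.
        if ":" in name:
            interaction, _pair = name.split(":", 1)
            by_type[interaction].append(value)

    flagged = []
    for interaction, values in by_type.items():
        first, rest = values[0], values[1:]
        # Flag only when there is more than one pair term, the first is
        # nonzero, and every remaining term is exactly zero.
        if rest and first != 0.0 and all(v == 0.0 for v in rest):
            flagged.append(interaction)
    return flagged


# Step 0 values from the GPU run reported in this issue.
gpu_frame = {
    "Coul-SR:URE-URE": -15106.6,
    "LJ-SR:URE-URE": 3585.22,
    "Coul-SR:URE-SOL": 0.0,
    "LJ-SR:URE-SOL": 0.0,
    "Coul-SR:SOL-SOL": 0.0,
    "LJ-SR:SOL-SOL": 0.0,
}

# Step 0 values from the CPU run reported in this issue.
cpu_frame = {
    "Coul-SR:URE-URE": 9361.08,
    "LJ-SR:URE-URE": 336.99,
    "Coul-SR:URE-SOL": -3929.27,
    "LJ-SR:URE-SOL": -168.613,
    "Coul-SR:SOL-SOL": -17069.0,
    "LJ-SR:SOL-SOL": 3416.84,
}

print(suspicious_group_terms(gpu_frame))  # ['Coul-SR', 'LJ-SR']
print(suspicious_group_terms(cpu_frame))  # []
```

A single frame can legitimately contain zeros (for example, groups that never come within the cut-off), so a real tool would probably want to see the pattern in every frame before warning.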
(Our code is likely not agile enough to be able to call the energy-group kernels only on energy-output steps, so even on the CPU the small overhead of energy groups is being paid every MD step.) This would be slightly easier to do once we’ve removed the group scheme.

*(from redmine: issue id 1822, created on 2015-09-15 by mark.j.abraham, closed on 2018-01-03)*

* Relations:
  * relates #1293
  * relates #1727
* Changesets:
  * Revision 4a4dc78e0c059ed662ae29331fd4a6c2ad6278a2 by Erik Lindahl on 2018-01-03T12:26:05Z:
```
Don't allow multiple energy groups for GPU runs

Exit with a fatal error instead of only warning, since the latter
leads to writing data for energy groups that is incorrect
to the energy file.

Fixes #1822.

Change-Id: I34ccb10bba6d6e1350283e34ebc908c6f830baab
```
* Uploads:
  * [energy-groups-issue.tgz](/uploads/b17c711f7aea052d3f4903d7436db80a/energy-groups-issue.tgz)