shell code issues (broken with DD+grid; broken with Verlet scheme) - Redmine #1429
Removing particle decomposition requires that the regression test perl script stops always triggering -pd when simple grid searching is being used. Currently, the complex/sw test uses the shell-model polarization code, with simple, thus -pd (and vsites and “flexible” constraints). This would have to change to grid+DD when we remove PD.
Using installs of either master or release-4-6 HEAD with
gmxtest.pl complex -only sw\\b
by changing the grompmdp and/or gmxtest.pl files, I observed:
-pd and simple passes the test with 1-8 tMPI ranks
-pd and grid passes the test with 1-8 tMPI ranks
-dd and simple passes the test with 1 rank, and gives the expected fatal
error with DD+simple unsupported (perl script hack required to remove
the hard-coded -pd flag)
-dd and grid passes the test with 1 rank, and fails with all higher
numbers
The fails are pretty subtle:
comparing energy file reference_s.edr and ener.edr
There are 31 terms in the energy files
There are 5 terms to compare in the energy files
Angle step 19: 71.2642, step 19: 71.1485
Angle step 20: 70.2909, step 20: 70.1638
Files read successfully
So, if there’s a problem, then I think it is already present in the code and release-4-6 should get a fatal error for shells + DD and ranks >
- It could have the same cause as the problem that was swept under the carpet in #993 (the forces may not be very reproducible in single precision), but when I tested release-4-6 in double precision I got the same error with DD, grid, and -ntmpi >1.
To see if OpenMP was useful as parallelism, I tried to test the shell code with the Verlet scheme (using verlet-buffer-drift=–1 because the test case has a box that’s only just large enough for the unmodified rlist), and it segfaulted with both grid and simple, even with just 1 rank and 1 thread. So I believe that needs a fix or a fatal error, also.
The above poses a problem for the proposed removal of PD - the existence of what might be an historical bug reduces the support in 5.0 for shells from “MPI with group scheme and PD” to “serial with group scheme.” We have a few open bugs for / suspicions of the shell code (#879, #993, #713). I suggest we proceed with removing PD, and require the complex/sw test to run with 1 rank, unless/until someone wants to find out what the real problem is. Hopefully, with the interest in polarizable models expressed in #1292 that will be sooner rather than later!
(from redmine: issue id 1429, created on 2014-01-31 by mark.j.abraham, closed on 2014-06-13)
- Relations:
- parent #1292 (closed)
- Changesets:
- Revision a4c5fb33 by Mark Abraham on 2014-02-08T20:45:25Z:
Issue fatal errors rather than use broken shell code
Refs #1429
Change-Id: I18a17f1e232a86a13f4e3b591bd992702af3017b
- Revision e9e67f538cb76b9bed579643919b823b74bbd055 by Mark Abraham on 2014-02-19T19:27:02Z:
Fix sw test case
The former version had trouble reproducing the serial results in some
cases in parallel. This change replaces the input conf.gro file with
the confout.gro file from the former version of the test run in
Reference mode, because that starting configuration seems more
stable. The former input .gro file had velocities for particles named
DW, but it is not known whether this is related to the problem.
Also switched it to use ns-type=grid in line with future changes in
this area.
Refs #1429
Change-Id: Ib0e47fc2162f1e986e6d9d70d74c4aeb8fc72569
- Revision c949a5ac by Mark Abraham on 2014-02-21T00:44:46Z:
Reinstate shell code with DD
Further work on the complex/sw test case in the 5.0 regressiontests
branch reveals that the initial conditions may have been the reason
for the problems observed with DD and more than one node, rather than
the implementation.
Refs #1429
Change-Id: I26ff6d9f8c79605afa794cae4761b5643b712124
- Revision debd482d by Berk Hess on 2014-06-13T06:32:25Z:
Made shells work with the Verlet scheme
The issue was that atoms iso charge group should be put in the box.
Additionally water_pol only worked within a charge group, now fixed.
Note that with shells working across charge groups, which is always
the case with the Verlet scheme, no shell prediction is done.
We should implement prediction, since it improves performance.
Fixes #1429.
Change-Id: I2ebfc2d91fc161167f8f2573b61e1f519cf11fd8
- Revision 0b396d5f by Berk Hess on 2014-06-20T07:17:03Z:
Fixed shells with particle decomposition
In the spirit of version 5.0, the recent Verlet scheme fix for shells,
commit debd482d, removed the support for PD. Now it's back.
Refs #1429.
Change-Id: I686d4287ec8946e418aa98e739a1a81a0b7f7055