Thread-MPI error in GROMACS-2018 - Redmine #2540
Archive from user Siva Dasetty:

Hello, I have come across an error that causes GROMACS (2018/2018.1) to crash. The message is:

```
tMPI error: Receive buffer size too small for transmission (in valid comm)
Aborted
```

The error seems to occur only immediately after a LINCS or SETTLE warning, and it is reproducible across different systems. A simple example is an energy minimization of a box of 1000 rigid TIP4P/Ice water molecules generated with gmx solvate. When SETTLE is used as the constraint algorithm, there are several SETTLE warnings in the early steps of the minimization, and GROMACS crashes with the above error message. If I replace SETTLE with LINCS, GROMACS crashes with the same message following a LINCS warning. Other systems that have produced this error are -OH terminated self-assembled monolayer surfaces (h-bonds constrained by LINCS) and mica surfaces (h-bonds constrained by LINCS). Naturally, reducing -ntmpi to 1 eliminates the error in all cases.

The problem does appear to be hardware dependent. Specifically, the tested cluster node(s) contain K20/K40 GPUs with Intel Xeon E5-2680v3 processors (20/24 cores). I built GROMACS with GCC 5.4.0 and CUDA 8.0.44. An installation on my desktop machine with very similar options does not show the thread-MPI error.

Example procedure that causes the error, on a node with 24 cores and 2 K40 GPUs:

```
gmx solvate -cs tip4p -o box.gro -box 3.2 3.2 3.2 -maxsol 1000
gmx grompp -f em.mdp -c box.gro -p tip4pice.top -o em
export OMP_NUM_THREADS=6
gmx mdrun -v -deffnm em -ntmpi 4 -ntomp 6 -pin on
```

Attached are the relevant topology (tip4pice.top), mdp (em.mdp), tpr (em.tpr), and log (em.log) files. In addition, the tip4p.gro and box.gro files are included.
Thanks in advance for any ideas as to what might be causing this problem,
Siva Dasetty

*(from redmine: issue id 2540, created on 2018-06-01 by gmxdefault, closed on 2018-06-12)*

* Changesets:
  * Revision dce23f771ac909e36815aeb76fe99f9a615bead3 by Berk Hess on 2018-06-06T22:33:48Z:
    ```
    Fix MPI inconsistency in EM after constraint failure

    Fixes issue #2540

    Change-Id: Id18c17af82f80917388c11fc776b79bf4966a4ac
    ```
* Uploads:
  * [tip4p.gro](/uploads/0c3409a4b4e37e61f5411120656d1e55/tip4p.gro): input .gro file used with gmx solvate.
  * [box.gro](/uploads/9f819c8b396b28a120a754b45ddb5483/box.gro): .gro file obtained with gmx solvate.
  * [em.log](/uploads/4171e2c92074b9f827adcd3f42bff385/em.log): .log file from the energy minimization.
  * [tip4pice.top](/uploads/53cebdc706c1fc58e27bf189732df918/tip4pice.top): TIP4P/Ice topology file.
  * [em.mdp](/uploads/da7e15035c40b21b905daea7c8d87e54/em.mdp): energy minimization parameter file.
  * [em.tpr](/uploads/6890d4df04fac597bcc578a8480fb12f/em.tpr): .tpr file (energy minimization of TIP4P/Ice water).
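The fix's commit message points at an MPI inconsistency in energy minimization after a constraint failure: ranks that take divergent code paths can post messages of sizes their peers did not allocate for. The following is a toy Python sketch of that failure mode only (it is not GROMACS or tMPI code; `recv` and the buffer sizes are purely illustrative):

```python
# Toy illustration of the "Receive buffer size too small" symptom:
# a receiver with a fixed-size buffer rejects a message that is larger
# than what it expected, as happens when sender and receiver disagree
# about the code path taken after a SETTLE/LINCS warning.

def recv(buffer_size, message):
    """Mimic a fixed-size receive: reject messages larger than the buffer."""
    if len(message) > buffer_size:
        raise RuntimeError(
            "tMPI error: Receive buffer size too small for transmission")
    return list(message)

# Normal step: both sides agree the message holds 3 values.
print(recv(buffer_size=3, message=[1.0, 2.0, 3.0]))

# Inconsistent step: the sender transmits extra state after a constraint
# failure, but the receiver still expects only 3 values -> the crash.
try:
    recv(buffer_size=3, message=[1.0, 2.0, 3.0, 4.0])
except RuntimeError as err:
    print(err)
```

This mirrors why `-ntmpi 1` hides the problem: with a single thread-MPI rank there is no peer to disagree with, so no mismatched transmission can occur.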