Energy minimisation does not reach the minimum when the GPU is used.
Summary
When I energy-minimise a system on the GPU, the maximum force ends at 1.2266455e+04. When the same tpr file is run with the -nb cpu
flag, the maximum force drops to 8.8434900e+02.
The simulation is just a simple compound being decoupled in water, which should be quite an easy system.
GROMACS version
:-) GROMACS - gmx_mpi, 2022.2 (-:
Executable: /opt/MD-software/gmx-2022.2/bin/gmx_mpi
Data prefix: /opt/MD-software/gmx-2022.2
Working dir: /home/ec2-user/Minimisation/lambda_19
Command line:
gmx_mpi -quiet --version
GROMACS version: 2022.2
Precision: mixed
Memory model: 64 bit
MPI library: MPI (CUDA-aware)
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support: CUDA
SIMD instructions: AVX2_256
CPU FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
GPU FFT library: cuFFT
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /opt/amazon/openmpi/bin/mpicc GNU 7.3.1
C compiler flags: -mavx2 -mfma -Wno-missing-field-initializers -fexcess-precision=fast -funroll-all-loops -O3 -DNDEBUG
C++ compiler: /opt/amazon/openmpi/bin/mpicxx GNU 7.3.1
C++ compiler flags: -mavx2 -mfma -Wno-missing-field-initializers -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA compiler: /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2021 NVIDIA Corporation;Built on Mon_Oct_11_21:27:02_PDT_2021;Cuda compilation tools, release 11.4, V11.4.152;Build cuda_11.4.r11.4/compiler.30521435_0
CUDA compiler flags:-std=c++14;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-Wno-deprecated-gpu-targets;-gencode;arch=compute_53,code=sm_53;-gencode;arch=compute_80,code=sm_80;-use_fast_math;-D_FORCE_INLINES;-mavx2 -mfma -Wno-missing-field-initializers -fexcess-precision=fast -funroll-all-loops -fopenmp -O3 -DNDEBUG
CUDA driver: 11.60
CUDA runtime: 11.40
Steps to reproduce
>>> gmx_mpi grompp -f gromacs.mdp -c gromacs.gro -p gromacs.top -o gromacs.tpr
>>> gmx_mpi mdrun -deffnm gromacs
Steepest Descents converged to machine precision in 74 steps,
but did not reach the requested Fmax < 1000.
Potential Energy = -1.0135666e+05
Maximum force = 1.2266455e+04 on atom 17
Norm of force = 2.1724870e+02
>>> gmx_mpi mdrun -deffnm gromacs -nb cpu
Steepest Descents converged to Fmax < 1000 in 228 steps
Potential Energy = -1.0515849e+05
Maximum force = 8.8434900e+02 on atom 17
Norm of force = 4.4004402e+01
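To put numbers on the gap between the two runs, here is a minimal Python sketch that parses the two summaries above and compares them. The function name `parse_em_summary` and the dictionary keys are illustrative and not part of GROMACS. The regular expressions assume the summary layout shown in this report.

```python
import re

# A float in the scientific notation that mdrun prints, e.g. -1.0135666e+05.
_FLOAT = r"([-+]?\d+\.\d+(?:[eE][-+]?\d+)?)"

# One pattern per summary line; the layout is assumed from the output above.
_PATTERNS = {
    "potential_energy": re.compile(r"Potential Energy\s*=\s*" + _FLOAT),
    "max_force": re.compile(r"Maximum force\s*=\s*" + _FLOAT),
    "force_norm": re.compile(r"Norm of force\s*=\s*" + _FLOAT),
}


def parse_em_summary(text):
    """Return the three summary values of one minimisation run as floats."""
    summary = {}
    for name, pattern in _PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            raise ValueError(f"missing field in mdrun summary: {name}")
        summary[name] = float(match.group(1))
    return summary


# Summaries copied from the two runs in this report.
GPU_OUTPUT = """
Potential Energy = -1.0135666e+05
Maximum force = 1.2266455e+04 on atom 17
Norm of force = 2.1724870e+02
"""

CPU_OUTPUT = """
Potential Energy = -1.0515849e+05
Maximum force = 8.8434900e+02 on atom 17
Norm of force = 4.4004402e+01
"""

gpu = parse_em_summary(GPU_OUTPUT)
cpu = parse_em_summary(CPU_OUTPUT)

# How much larger the remaining maximum force is on the GPU run.
fmax_ratio = gpu["max_force"] / cpu["max_force"]
# How far above the CPU result the GPU run stopped (GROMACS reports kJ/mol).
energy_gap = gpu["potential_energy"] - cpu["potential_energy"]

print(f"Fmax ratio (GPU/CPU): {fmax_ratio:.1f}")
print(f"Potential energy gap (GPU - CPU): {energy_gap:.1f}")
```

With these values, the GPU run stops with a maximum force more than ten times larger than the CPU run and at a higher potential energy, on the same atom (17) in both cases.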
What is the current bug behavior?
With the GPU, the minimisation stops at a higher-energy state with a much larger maximum force, which makes the system less stable.
What did you expect the correct behavior to be?
The GPU version and CPU version should give comparable results.
Edited by Zhiyi Wu