"Assertion failed" error in GROMACS 2023.1 when the CUDA Graphs feature and the gmx mdrun "-bonded gpu" argument are enabled at the same time
**Summary**

Hi,

An "Assertion failed" error occurs in GROMACS 2023.1 when I enable the CUDA Graphs feature and pass the gmx mdrun "-bonded gpu" argument at the same time.

Here are my environment settings for GROMACS:

```shell
#Gromacs
export GMX_GPU_DD_COMMS=true
export GMX_CUDA_GRAPH=true
export GMX_GPU_PME_DECOMPOSITION=true
export GMX_GPU_PME_PP_COMMS=true
export GMX_FORCE_UPDATE_DEFAULT_GPU=true
```

And here is the error message:

```shell
Program: gmx mdrun, version 2023.1
Source file: src/gromacs/gpu_utils/device_stream.cu (line 100)
Function: DeviceStream::synchronize() const::<lambda()>

Assertion failed:
Condition: stat == cudaSuccess
cudaStreamSynchronize failed. CUDA error #400
(cudaErrorInvalidResourceHandle): invalid resource handle.

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
```

**GROMACS version**

```shell
GROMACS version: 2023.1
Precision: mixed
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 128)
GPU support: CUDA
NB cluster size: 8
SIMD instructions: AVX2_256
CPU FFT library: fftw-3.3.10-sse2-avx-avx2-avx2_128
GPU FFT library: cuFFT
Multi-GPU FFT: none
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/gcc-11 GNU 11.3.0
C compiler flags: -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -O3 -DNDEBUG
C++ compiler: /usr/bin/g++-11 GNU 11.3.0
C++ compiler flags: -fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-cast-function-type-strict -fopenmp -O3 -DNDEBUG
BLAS library: External - detected on the system
LAPACK library: External - detected on the system
CUDA compiler: /opt/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2023 NVIDIA Corporation;Built on Mon_Apr__3_17:16:06_PDT_2023;Cuda compilation tools, release 12.1, V12.1.105;Build cuda_12.1.r12.1/compiler.32688072_0
CUDA compiler flags:-std=c++17;--generate-code=arch=compute_50,code=sm_50;--generate-code=arch=compute_52,code=sm_52;--generate-code=arch=compute_60,code=sm_60;--generate-code=arch=compute_61,code=sm_61;--generate-code=arch=compute_70,code=sm_70;--generate-code=arch=compute_75,code=sm_75;--generate-code=arch=compute_80,code=sm_80;--generate-code=arch=compute_86,code=sm_86;--generate-code=arch=compute_89,code=sm_89;--generate-code=arch=compute_90,code=sm_90;-Wno-deprecated-gpu-targets;--generate-code=arch=compute_53,code=sm_53;--generate-code=arch=compute_80,code=sm_80;-use_fast_math;-Xptxas;-warn-double-usage;-Xptxas;-Werror;-D_FORCE_INLINES;-fexcess-precision=fast -funroll-all-loops -mavx2 -mfma -Wno-missing-field-initializers -Wno-cast-function-type-strict -fopenmp -O3 -DNDEBUG
CUDA driver: 12.10
CUDA runtime: 12.10
```

**Steps to reproduce**

First, enable GMX_CUDA_GRAPH with the shell command:

```shell
export GMX_CUDA_GRAPH=true
```

Then run the simulation via gmx mdrun. Here is my command:

```shell
gmx mdrun -v -deffnm Produtcion -s Production.tpr -ntomp 12 -pin on -ntmpi 1 -update gpu -bonded gpu
```

An "Assertion failed" error occurs at step 400:

```shell
:-) GROMACS - gmx mdrun, 2023.1 (-:

Executable: /home/yangzichen/Software/GMX-2023.1/bin/gmx
Data prefix: /home/yangzichen/Software/GMX-2023.1
Working dir: /home/yangzichen/Documents/ZZH/MD
Command line:
gmx mdrun -v -deffnm Produtcion -s Production.tpr -ntomp 12 -pin on -ntmpi 1 -update gpu -bonded gpu

Back Off! I just backed up Produtcion.log to ./#Produtcion.log.1#
Reading file Production.tpr, VERSION 2023.1 (single precision)
GMX_CUDA_GRAPH environment variable is detected. The experimental CUDA Graphs feature will be used if run conditions allow.
Update groups can not be used for this system because atoms that are (in)directly constrained together are interdispersed with other atoms

1 GPU selected for this run.
Mapping of GPU IDs to the 2 GPU tasks in the 1 rank on this node:
PP:0,PME:0
PP tasks will do (non-perturbed) short-ranged and most bonded interactions on the GPU
PP task will update and constrain coordinates on the GPU
PME tasks will do all aspects on the GPU
CUDA Graphs will be used, provided there are no CPU force computations.
Using 1 MPI thread
Using 12 OpenMP threads

Back Off! I just backed up Produtcion.xtc to ./#Produtcion.xtc.1#
Back Off! I just backed up Produtcion.edr to ./#Produtcion.edr.1#
starting mdrun '01-JUL-22 C3G and BSA in 0.154M NaCl running 1ns in water'
1000000 steps, 2000.0 ps.
step 400
-------------------------------------------------------
Program: gmx mdrun, version 2023.1
Source file: src/gromacs/gpu_utils/device_stream.cu (line 100)
Function: DeviceStream::synchronize() const::<lambda()>

Assertion failed:
Condition: stat == cudaSuccess
cudaStreamSynchronize failed. CUDA error #400
(cudaErrorInvalidResourceHandle): invalid resource handle.

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
```

I also ran the simulation with the same environment settings and mdrun command on another computer, using GROMACS 2023.1 compiled with CUDA 11.7 and GCC 11.2. The assertion failure still occurs there.

I then commented out "GMX_CUDA_GRAPH=true" and ran the same command:

```shell
gmx mdrun -v -deffnm Produtcion -s Production.tpr -ntomp 12 -pin on -ntmpi 1 -update gpu -bonded gpu
```

It worked, which confirms that GMX_CUDA_GRAPH and "-bonded gpu" conflict.

**Possible fixes**

Do NOT enable the CUDA Graphs feature and the gmx mdrun "-bonded gpu" argument at the same time.
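
Until this is fixed, the workaround can be scripted. Below is a minimal sketch, not a GROMACS feature: `guard_cuda_graph` is a hypothetical helper name I made up. It only inspects the arguments intended for gmx mdrun and unsets GMX_CUDA_GRAPH when "-bonded gpu" is among them. It assumes that unsetting the variable is enough to disable CUDA Graphs, which matches what I observed above.

```shell
# Hypothetical helper (not part of GROMACS): remove GMX_CUDA_GRAPH from the
# environment when the mdrun arguments request bonded forces on the GPU.
guard_cuda_graph() {
    prev=""
    for arg in "$@"; do
        # Match the two-word option "-bonded gpu" while the variable is set.
        if [ "$prev" = "-bonded" ] && [ "$arg" = "gpu" ] && [ -n "${GMX_CUDA_GRAPH:-}" ]; then
            echo "Unsetting GMX_CUDA_GRAPH because -bonded gpu was requested" >&2
            unset GMX_CUDA_GRAPH
        fi
        prev="$arg"
    done
}

# Usage (commented out so the snippet can be sourced without GROMACS installed):
# set -- -v -deffnm Produtcion -s Production.tpr -ntomp 12 -pin on -ntmpi 1 -update gpu -bonded gpu
# guard_cuda_graph "$@" && gmx mdrun "$@"
```

The helper changes the environment of the current shell only, so it has to be called in the same shell that launches gmx mdrun. Other GPU settings such as GMX_FORCE_UPDATE_DEFAULT_GPU are left untouched.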