nvfortran -O3 caused illegal memory access in CUDA build
- QE version: current develop, 9d15f717
- Main hardware and compilation information: nvhpc 21.3 on Skylake + V100
cmake -DCMAKE_C_COMPILER=nvc -DCMAKE_Fortran_COMPILER=nvfortran -DQE_ENABLE_CUDA=ON -DQE_ENABLE_MPI=OFF -DQE_FFTW_VENDOR=Internal ..
- Input data and pseudopotential files, or (better) links to them from test-suite:
ctest --output-on-failure -L 'pw' -LE epw
- ALL information needed to reproduce the problem.
The CMake Release build adds -fast -O3 to the compile line, and -O3 causes
cudaMemcpy returned status 700: an illegal memory access was encountered
cuda-gdb shows the failure coming from stres_us_gpu.f90, but more files may be affected.
The following tests are affected:
system--pw_atom--atom-pbe
system--pw_atom--atom-sigmapbe
system--pw_b3lyp--b3lyp-h2o
system--pw_uspp--uspp1-coulomb
system--pw_workflow_vc-relax_dos--vc-relax-dos-1
system--pw_workflow_vc-relax_dos--vc-relax-dos-2
system--pw_workflow_vc-relax_scf--vc-relax-scf-1
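To rerun only these tests, something like the following may help. It is a sketch: it assumes the names listed above are the registered ctest test names, and the variable names affected_tests and regex are only illustrative. The script prints the ctest command instead of running it, so it can be checked before use.

```shell
# Names copied from the list of affected tests above.
affected_tests="system--pw_atom--atom-pbe
system--pw_atom--atom-sigmapbe
system--pw_b3lyp--b3lyp-h2o
system--pw_uspp--uspp1-coulomb
system--pw_workflow_vc-relax_dos--vc-relax-dos-1
system--pw_workflow_vc-relax_dos--vc-relax-dos-2
system--pw_workflow_vc-relax_scf--vc-relax-scf-1"

# Join the names with '|' to form one alternation for ctest -R.
regex=$(printf '%s\n' "$affected_tests" | paste -sd'|' -)

# Print the command; the anchors keep similarly named tests from matching.
echo "ctest --output-on-failure -R '^(${regex})\$'"
```

Run the printed command from the build directory used for the cmake configuration above.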
If you build with configure instead of CMake, -O3 is not added and the build is not affected by this bug.
I will provide a CMake workaround shortly.
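In the meantime, one possible stopgap is to override the Release flags so that -O3 is dropped. This has not been verified here and is not necessarily the planned workaround. CMAKE_Fortran_FLAGS_RELEASE is the standard CMake variable for per-configuration Fortran flags; keeping only -fast is an assumption about what the configure-based build effectively uses.

```shell
# Same configuration as in the report, with the Release Fortran flags
# overridden so that only -fast is passed (assumption: -O3 is the sole trigger).
cmake -DCMAKE_C_COMPILER=nvc -DCMAKE_Fortran_COMPILER=nvfortran \
      -DQE_ENABLE_CUDA=ON -DQE_ENABLE_MPI=OFF -DQE_FFTW_VENDOR=Internal \
      -DCMAKE_Fortran_FLAGS_RELEASE="-fast" ..
```

If the affected tests pass with this configuration, that would support -O3 as the trigger, though it would not distinguish between the two possible root causes.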
I think the possible root causes are:
1. an nvfortran compiler bug
2. a bug in the source code that stays hidden when -O3 is not used.
@fspiga as this is related to nvfortran
Edited by Ye Luo