CUDA CC 2.0 issue - Redmine #2273
Using either master HEAD, release-2016 HEAD, or tag v2016: If I target compute and sm 20 (with `-DGMX_CUDA_TARGET_SM=20 -DGMX_CUDA_TARGET_COMPUTE=20`) then by default we get a single CUDA compilation unit (since that’s the only thing that can work). The regressiontests pass, but we have an issue, e.g.

    $ bin/mdrun-test --gtest_filter=\*Swap\*
    ...
    Running on 1 node with total 4 cores, 8 logical cores, 2 compatible GPUs
    Hardware detected:
      CPU info:
        Vendor: Intel
        Brand: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
        SIMD instructions most likely to fit this hardware: AVX_256
        SIMD instructions selected at GROMACS compile time: AVX_256
      Hardware topology: Full, with devices
      GPU info:
        Number of GPUs detected: 2
        #0: NVIDIA GeForce GTX 960, compute cap.: 5.2, ECC: no, stat: compatible
        #1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC: no, stat: compatible
    Reading file /home/marklocal/git/r2016/build-cmake-gcc-gpu-cc20-debug/src/programs/mdrun/tests/Testing/Temporary/CompelTest_SwapCanRun.tpr, VERSION 2016.5-dev-20170923-d36730ca3 (single precision)
    Using 1 MPI thread
    Using 1 OpenMP thread
    1 GPU user-selected for this run.
    Mapping of GPU ID to the 1 PP rank in this node: 0
    NOTE: Thread affinity setting failed. This can cause performance degradation.
          If you think your settings are correct, ask on the gmx-users list.
    SWAP: Determining initial numbers of ions per compartment.
    SWAP: Setting pointers for checkpoint writing
    SWAP: Channel 0 flux history for ion type NA+ (charge 1): 0 molecules
    SWAP: Channel 1 flux history for ion type NA+ (charge 1): 0 molecules
    SWAP: Channel 0 flux history for ion type CL- (charge -1): 0 molecules
    SWAP: Channel 1 flux history for ion type CL- (charge -1): 0 molecules
    starting mdrun 'Channel_coco in octane membrane'
    2 steps, 0.0 ps.
    -------------------------------------------------------
    Program: mdrun-test, version 2016.5-dev-20170923-d36730ca3
    Source file: src/gromacs/mdlib/nbnxn_cuda/nbnxn_cuda.cu (line 633)
    Fatal error:
    cudaStreamSynchronize failed in cu_blockwait_nb: an illegal memory access was encountered
    For more information and tips for troubleshooting, please check the GROMACS
    website at http://www.gromacs.org/Documentation/Errors
    -------------------------------------------------------

If I target compute and sm 30 then, by default, I get multiple CUDA compilation units and there is no issue. If I target compute and sm 30 and set `-DGMX_CUDA_NB_SINGLE_COMPILATION_UNIT=on`, then there is no issue.

So it looks to me like something in the CC 2.0 support is broken, or at least not properly used by the mdrun-test code. I’ll try to bisect a bit more and see what I learn.

The absence of a reported bug does suggest that there is not much use of release-2016 on CC 2.0, and we should consider removing support for CC 2.0 for GROMACS 2017. This would simplify our texture and CMake code, and remove the question of whether someone should try to cover this case in Jenkins. Clearly nobody has prioritized doing or automating testing on this old setup. Note that NVIDIA has already deprecated those compilation targets in nvcc (and we suppress the warning).

If we go down this path, then I suggest we don’t bother trying to fix release-2016, and if someone later has an issue, suggest they use an even earlier version.

*(from redmine: issue id 2273, created on 2017-10-13 by mark.j.abraham, closed on 2017-11-28)*

* Changesets:
  * Revision 29ba77b8483f803766806b4f6987aeeef00747e5 by Szilárd Páll on 2017-10-31T19:19:09Z:
```
Check CUDA available/compiled code compatibility

Added an early check to detect when the gmx binary does not embed code
compatible with the GPU device it tries to use nor does it have PTX that
could have been JIT-ed.

Additionally, if the user manually sets GMX_CUDA_TARGET_COMPUTE=20 and
no later SM or COMPUTE but runs on >2.0 hardware, we'd be executing
JIT-ed Fermi kernels with incorrect host-side code assumptions (e.g
amount of shared memory allocated or texture type). This change also
prevents such cases.

Fixes #2273

Change-Id: I5472b1a33e584a75f451e21e9fd25992633fbea9
```
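The compatibility question raised in the commit message can be illustrated with a small host-only sketch. This is not the GROMACS implementation: the types and function names (`ComputeCapability`, `EmbeddedCode`, `classify`, `isSafeToRun`) are hypothetical, and the code only models the general CUDA rules that a binary built for `sm_XY` runs on devices of compute capability X.Z with Z ≥ Y, while PTX built for `compute_XY` can be JIT-compiled for any device of capability X.Y or higher. The extra rejection of Fermi PTX on newer hardware is an assumed policy, drawn from the second paragraph of the commit message.

```cpp
#include <cassert>
#include <vector>

// Compute capability of a device or a compilation target, e.g. {5, 2}.
struct ComputeCapability {
    int majorVersion;
    int minorVersion;
};

// Hypothetical summary of what a binary embeds (not a GROMACS type).
struct EmbeddedCode {
    std::vector<ComputeCapability> sass; // native code, built for sm_XY
    std::vector<ComputeCapability> ptx;  // intermediate code, built for compute_XY
};

enum class CodeSupport { NativeBinary, JitFromPtx, None };

// True when capability a is less than or equal to capability b.
static bool atMost(const ComputeCapability& a, const ComputeCapability& b) {
    return a.majorVersion < b.majorVersion
           || (a.majorVersion == b.majorVersion && a.minorVersion <= b.minorVersion);
}

// Decide how (if at all) the embedded code could run on the device.
CodeSupport classify(const EmbeddedCode& build, const ComputeCapability& device) {
    assert(device.majorVersion >= 1);
    // Native code needs the same major version and a minor version
    // that does not exceed the device's.
    for (const auto& s : build.sass) {
        if (s.majorVersion == device.majorVersion
            && s.minorVersion <= device.minorVersion) {
            return CodeSupport::NativeBinary;
        }
    }
    // PTX can be JIT-compiled for any device at least as new as its target.
    for (const auto& p : build.ptx) {
        if (atMost(p, device)) {
            return CodeSupport::JitFromPtx;
        }
    }
    return CodeSupport::None;
}

// Assumed policy: refuse to run when nothing is usable, and also when the
// newest usable PTX targets CC 2.x but the device is newer than 2.x.
bool isSafeToRun(const EmbeddedCode& build, const ComputeCapability& device) {
    switch (classify(build, device)) {
        case CodeSupport::NativeBinary: return true;
        case CodeSupport::None: return false;
        case CodeSupport::JitFromPtx: break;
    }
    ComputeCapability best{0, 0};
    for (const auto& p : build.ptx) {
        if (atMost(p, device) && atMost(best, p)) {
            best = p;
        }
    }
    return !(best.majorVersion == 2 && device.majorVersion > 2);
}
```

For a build that embeds only `sm_20` and `compute_20` code, as in this report, the sketch classifies both the CC 5.2 and the CC 3.0 device as `JitFromPtx` and `isSafeToRun` rejects them, whereas a CC 2.0 device would be accepted as `NativeBinary`.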