Incorrectly getting "Enabling single compilation unit for the CUDA non-bonded module." when using Cuda 9.0 and gromacs 2018.1 - Redmine #2561

changed milestone to %2018.3

added Bug build system + 1 deleted label

I can’t reproduce the issue. Can you please provide full command line invocation?

(from redmine: written on 2018-07-17 by pszilard)

cmd line used:
cmake ../gromacs-2018.2 -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=Release -DGMX_EXTERNAL_BLAS=ON -DGMX_EXTERNAL_LAPACK=ON -DGMX_X11=OFF -DGMX_OPENMP=ON -DGMX_MPI=OFF -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/CUDA/9.2.88 -DGMX_BLAS_USER=“/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a” -DGMX_LAPACK_USER=“/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a”

cmake.out 2>&1 (closed)

grep ‘Enabling single compilation’ cmake.out
— Enabling single compilation unit for the CUDA non-bonded module. Multiple compilation units are not compatible with CC 2.x devices, to enable the feature specify only CC >=3.0 target architectures in GMX_CUDA_TARGET_SM/GMX_CUDA_TARGET_COMPUTE.

(from redmine: written on 2018-07-17 by akesandgren)

Åke Sandgren wrote:

cmd line used:
cmake ../gromacs-2018.2 -DCMAKE_INSTALL_PREFIX=/scratch/ake/gromacs/inst -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=Release -DGMX_EXTERNAL_BLAS=ON -DGMX_EXTERNAL_LAPACK=ON -DGMX_X11=OFF -DGMX_OPENMP=ON -DGMX_MPI=OFF -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/CUDA/9.2.88 -DGMX_BLAS_USER=“/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a” -DGMX_LAPACK_USER=“/hpc2n/eb/software/Compiler/GCC/7.3.0-2.30/OpenBLAS/0.3.1/lib/libopenblas.a” cmake.out 2>&1 (closed)

This does not contain either of the GMX_CUDA_TARGER_SM or GMX_CUDA_TARGER_COMPUTE passed to cmake to restrict the targetted architectures.

(from redmine: written on 2018-07-17 by pszilard)

Exactly, it doesn’t contain them, thus the code should take the default of using what the CUDA supports.
And there is code in there to figure this out based on CUDA version, but it isn’t used. The result is just ignored.

With the code shown in the original post, everything works in both cases, with and without GMX_CUDA_TARGET_SM or GMX_CUDA_TARGET_COMPUTE

(from redmine: written on 2018-07-17 by akesandgren)

With

export CC=gcc-6
export CXX=g++-6

cmake .. -DCMAKE_INSTALL_PREFIX=/opt/gromacs/$version -G Ninja

I see

...
-- Enabling single compilation unit for the CUDA non-bonded module. Multiple compilation units are not compatible with CC 2.x devices, to enable the feature specify only CC >=3.0 target architectures in GMX_CUDA_TARGET_SM/GMX_CUDA_TARGET_COMPUTE.
...

which is the expected result (supporting 2.x devices with CUDA < 9 requires a single compilation unit in 2018; CC 2.x is now not supported at all in master) if perhaps a little tricky for users to understand whether they should act. (If they know about compute capabilities, it’s not clear that by default GROMACS builds for all of them, and if they don’t know about compute capabilities then nothing makes sense.)

cmake .. -DCMAKE_INSTALL_PREFIX=/opt/gromacs/$version -G Ninja -DGMX_CUDA_TARGET_SM=30 -DGMX_CUDA_TARGET_COMPUTE=30

enables multiple compilation units, as expected since that is now possible.

With the reform proposed in #2390, then we can avoid some aspects of such problems in future, but for now Ake’s proposal looks good to me. Szilard can you prepare a patch, please?

(from redmine: written on 2018-07-18 by mark.j.abraham)

Åke Sandgren wrote:

Exactly, it doesn’t contain them, thus the code should take the default of using what the CUDA supports.

Sorry, forgot that CUDA 9.0 skips generating CC 2.0 code.

IIRC the reason why the current code does not just test CUDA_NVCC_FLAGS is that the compiler can generate code for CC 2.0 despite a command line that does not contain "_2[01]". For the same reason, the proposed change could also lead to not triggering single compilation unit for some CUDA version <9.0 for a CUDA_NVCC_FLAGS string that implicitly triggers CC 2.0 code generation. I’d rather make the code explicit and add a CUDA version check to the conditional than take the risk (or trawl through the nvcc docs).

(from redmine: written on 2018-07-18 by pszilard)

If we test for the features supported by nvcc then we will have a good way to guess whether it is at all capable of e.g. generating code for CC 2.0 (which Szilard identifies as a risk). If the user chooses particular CC, then we check in cmake that this will work. If the user chooses multiple compilation units but CC 2.0 is supported, then cmake should fail (inconsistent input). If the user chooses nothing and CC 2.0 is supported, then we default to single compilation unit. If the user chooses nothing and CC 2.0 is not supported, then we default to multiple compilation units.

I don’t see any extra value delivered by any CUDA version checks. Different OS first support newer CC in different CUDA versions, so such checks are brittle. They can’t help the build work with unknown future CUDA versions. This is the same thinking behind taking advantage of compiler and glibc feature-reporting gear, and testing functionality at configure time, rather than hard coding specific version checks.

(from redmine: written on 2018-07-18 by mark.j.abraham)

Archive from user: Gerrit Code Review Bot

Gerrit received a related patchset ‘1’ for Issue #2561.
Uploader: Szilárd Páll (pall.szilard@gmail.com)
Change-Id: gromacs~~release-2018~~I741081da39539f211fad32bda5c1f1dccc4378f9
Gerrit URL: https://gerrit.gromacs.org/8106

(from redmine: written on 2018-07-19 by gmxdefault)

Mark Abraham wrote:

If we test for the features supported by nvcc then we will have a good way to guess whether it is at all capable of e.g. generating code for CC 2.0 (which Szilard identifies as a risk).

How do you suggest testing individual features of nvcc? (Writing a bunch of execute_process calls with manual regex matching is fragile and tedious IMO).

If the user chooses particular CC, then we check in cmake that this will work. If the user chooses multiple compilation units but CC 2.0 is supported, then cmake should fail (inconsistent input). If the user chooses nothing and CC 2.0 is supported, then we default to single compilation unit. If the user chooses nothing and CC 2.0 is not supported, then we default to multiple compilation units.

What we don’t check for is default behavior of nvcc, e.g. when -gencode flags are cleared. I’ve just checked and we don’t accept empty GMX_CUDA_TARGET_* to clear the list of targeted arch, so only the fully manual override would be able to trigger that case (and that can be triggered regardless of how this fix is implemented). Admittedly, this case would however be caught by a compile-time error, so there’s no risk of incorrect results.

I don’t see any extra value delivered by any CUDA version checks.

Check. a Single one would be needed for this specific fix:

if((CUDA_VERSION VERSION_LESS "9.0") AND
   ((NOT GMX_CUDA_TARGET_SM AND NOT GMX_CUDA_TARGET_COMPUTE) OR
    (GMX_CUDA_TARGET_SM MATCHES "2[01]" OR GMX_CUDA_TARGET_COMPUTE MATCHES "2[01]")))

and this doesn’t need to change later as CC 2.0 support won’t be re-introduced later, I think.

Different OS first support newer CC in different CUDA versions, so such checks are brittle. They can’t help the build work with unknown future CUDA versions. This is the same thinking behind taking advantage of compiler and glibc feature-reporting gear, and testing functionality at configure time, rather than hard coding specific version checks.

Not sure what you mean — hardware support is purely CUDA version-dependent, operating systems / distros have no say in it. I don’t think this is similar at all to glibc-related checks, CUDA has a single implementation with clear deprecation version cutoffs.

Back to the main point: I think for this specific fix the change proposed by Åke has the advatage of slight simplification, but the slight drawback that it does gmx_check_if_changed(_gmx_cuda_target_changed GMX_CUDA_NVCC_FLAGS) which is a slight overkill in that any change in any of the C or nvcc flags will trigger a status message.

(from redmine: written on 2018-07-19 by pszilard)

Szilárd Páll wrote:

Mark Abraham wrote:

If we test for the features supported by nvcc then we will have a good way to guess whether it is at all capable of e.g. generating code for CC 2.0 (which Szilard identifies as a risk).

How do you suggest testing individual features of nvcc? (Writing a bunch of execute_process calls with manual regex matching is fragile and tedious IMO).

We wrap the execute_process() call of nvcc in a function that takes the name of a cache variable to set with the result of the call to nvcc, passing an optional extra flag. The current check for basic functionality of nvcc is just the special case of no extra flag. See e.g. https://redmine.gromacs.org/projects/gromacs/repository/revisions/master/entry/cmake/CheckCXXCompilerFlag.cmake and the way that cmake’s Modules/CheckCXXSourceCompiles.cmake wraps a try_compile(). So we end up with a collection of cache variables named like NVCC_SUPPORTS_CC_2_0 etc.

If the user chooses particular CC, then we check in cmake that this will work. If the user chooses multiple compilation units but CC 2.0 is supported, then cmake should fail (inconsistent input). If the user chooses nothing and CC 2.0 is supported, then we default to single compilation unit. If the user chooses nothing and CC 2.0 is not supported, then we default to multiple compilation units.

What we don’t check for is default behavior of nvcc, e.g. when -gencode flags are cleared. I’ve just checked and we don’t accept empty GMX_CUDA_TARGET_* to clear the list of targeted arch, so only the fully manual override would be able to trigger that case (and that can be triggered regardless of how this fix is implemented). Admittedly, this case would however be caught by a compile-time error, so there’s no risk of incorrect results.

I don’t see any extra value delivered by any CUDA version checks.

Check. a Single one would be needed for this specific fix:
[…]
and this doesn’t need to change later as CC 2.0 support won’t be re-introduced later, I think.

Yeah something like that works, but it does not address the same problem that we will have in the future with CC 3.0 being deprecated and removed.

Different OS first support newer CC in different CUDA versions, so such checks are brittle. They can’t help the build work with unknown future CUDA versions. This is the same thinking behind taking advantage of compiler and glibc feature-reporting gear, and testing functionality at configure time, rather than hard coding specific version checks.

Not sure what you mean — hardware support is purely CUDA version-dependent, operating systems / distros have no say in it.

Different CUDA versions support different compute capabilities on different OS. See #2390. We should test the compiler for feature capabilities.

I don’t think this is similar at all to glibc-related checks, CUDA has a single implementation with clear deprecation version cutoffs.

They’re not clear in advance, so we are currently shipping code that will not compile on some future version of CUDA. AFAIK there is no published deprecation/removal plan for CC 3.0, for example. Using feature checks on the compiler is more future proof

——

Back to the main point: I think for this specific fix the change proposed by Åke has the advatage of slight simplification, but the slight drawback that it does gmx_check_if_changed(_gmx_cuda_target_changed GMX_CUDA_NVCC_FLAGS) which is a slight overkill in that any change in any of the C or nvcc flags will trigger a status message.

(from redmine: written on 2018-07-30 by mark.j.abraham)

Hello, can this be closed now, or are there still issue open for discussion here? If so, I’ll move the target to 2019.

(from redmine: written on 2018-08-15 by acmnpv)

Szilard’s patch resolves the issue in the short term, but IMO there is still a long term problem that has not been resolved. Let’s close this, and I will note in #2390 that there is more consideration required, either for 2018.x or 2019.

(from redmine: written on 2018-08-20 by mark.j.abraham)

Applied in changeset commit:1982bad5.

(from redmine: written on 2018-08-20 by pszilard)

closed

mentioned in commit 1982bad5

Incorrectly getting "Enabling single compilation unit for the CUDA non-bonded module." when using Cuda 9.0 and gromacs 2018.1 - Redmine #2561

Designs

Child items ...

Activity