OpenMP is disabled for bonded list update in Clang-CUDA builds

Summary

In Clang-CUDA builds, we pass the -fno-openmp flag to all GPU-related source files to work around an old Clang bug, on the assumption that no CUDA file uses OpenMP. That assumption no longer holds.

Our CMake code: https://gitlab.com/gromacs/gromacs/blob/bbe83e52dda31475f634993fb5c587753ac24154/cmake/gmxManageClangCudaConfig.cmake#L123-124
LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=45533
MR introducing the OpenMP region in a GPU source file: !3899 (merged)
Similar issue: #4747 / !3540 (merged)

Exact steps to reproduce

$ cmake ../.. -DCMAKE_C_COMPILER=clang-17 -DCMAKE_CXX_COMPILER=clang++-17 -DGMX_GPU=CUDA -DGMX_CLANG_CUDA=ON -DGMX_CUDA_TARGET_SM=86
[...]
$ make -j$(nproc)
[...]
$ touch ../../src/gromacs/listed_forces/listed_forces_gpu_impl_gpu.cpp
$ VERBOSE=1 make
[ 16%] Building CXX object src/gromacs/CMakeFiles/libgromacs.dir/listed_forces/listed_forces_gpu_impl_gpu.cpp.o
[...] /usr/bin/clang++-17 [...] -Wno-source-uses-openmp [...] -fopenmp=libomp [...] -fno-openmp [...]

In this case, -fno-openmp overrides the earlier -fopenmp=libomp and disables the use of OpenMP on the host, which can be seen if we remove -Wno-source-uses-openmp:

/home/aland/gromacs/src/gromacs/listed_forces/listed_forces_gpu_impl_gpu.cpp:192:9: warning: unexpected '#pragma omp ...' in program [-Wsource-uses-openmp]
  192 | #pragma omp parallel for num_threads(gmx_omp_nthreads_get(ModuleMultiThread::Bonded)) schedule(static)
      |         ^
1 warning generated when compiling for sm_86.
/home/aland/gromacs/src/gromacs/listed_forces/listed_forces_gpu_impl_gpu.cpp:192:9: warning: unexpected '#pragma omp ...' in program [-Wsource-uses-openmp]
  192 | #pragma omp parallel for num_threads(gmx_omp_nthreads_get(ModuleMultiThread::Bonded)) schedule(static)
      |         ^
1 warning generated when compiling for host.

The first warning, for the sm_86 pass, is expected and is the reason to keep -Wno-source-uses-openmp. The second indicates that OpenMP is also disabled for the host pass.

For developers: Why is this important?

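The #pragma omp parallel for in listed_forces_gpu_impl_gpu.cpp is silently ignored on the host, so the bonded list update runs serially even though the code intends it to be parallel. OpenMP should be active wherever we intend to use it.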

Possible fixes

The easiest solution is to narrow the condition under which CMake adds the flag. Per https://bugs.llvm.org/show_bug.cgi?id=45533, the bug is fixed in Clang 11, so there is no need to add -fno-openmp with newer versions.
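
A minimal sketch of such a version guard, not the actual patch: _CUDA_CLANG_FLAGS is a placeholder for whichever variable gmxManageClangCudaConfig.cmake uses to collect flags for GPU source files, and the surrounding conditions in that file are not reproduced here.

```cmake
# Sketch only. _CUDA_CLANG_FLAGS is a hypothetical name standing in for the
# list of per-file flags applied to GPU sources in gmxManageClangCudaConfig.cmake.
# LLVM bug 45533 is reported fixed in Clang 11, so only older compilers
# need OpenMP switched off for GPU source files.
if(CMAKE_CXX_COMPILER_ID MATCHES "Clang" AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 11)
    list(APPEND _CUDA_CLANG_FLAGS "-fno-openmp")
endif()
# -Wno-source-uses-openmp is left untouched: the device pass still warns
# about '#pragma omp' and that warning is expected.
```

With a guard of this kind, the clang++-17 command line from the reproduction steps would no longer contain -fno-openmp, so the earlier -fopenmp=libomp would stay in effect for the host pass.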

Clang-CUDA is not widely used and the performance benefit of !3899 (merged) is not huge, so with Clang 9-10 we can keep the current behavior (no parallelization of this region).