CUDA-enabled configure script
This merge request makes it possible to generate the `make.inc` file for CUDA compilation with autotools and provides an automatic build system for the external GPU libraries.
Changes:
- Minimum required version of autotools increased from 2.60 to 2.64.
New features:
- A modified configure script that features three new parameters, `--with-cuda`, `--with-cuda-cc`, and `--with-cuda-runtime`, described by `./configure --help`. The configure script checks that:
  - PGI compilers are used.
  - `nvcc` is available.
  - The libraries currently needed for the envisaged porting are present.
- The configure script generates a new `make.inc` that defines the flag `__CUDA` and embeds all the details about the CUDA configuration, i.e.:
  - `CUDA_RUNTIME`: CUDA runtime currently in use.
  - `CUDA_CC`: compute capabilities to be used in code generation.
  - `CUDA_EXTLIBS`: external libraries to be configured and generated. Currently only the eigensolver.
  - `CUDA_LIBS`: libraries to be used at linking time (PGI libraries and external packages).
- A modified `extlibs_makefile` that automatically downloads and generates this eigensolver.
- A small C code and a Python wrapper that can be used to simplify the building process for the user and/or the debugging phase. The C code (`device_propc.c`) dumps the details of the GPUs in a YAML file. The Python script collects this information and generates a draft of the configure line.
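To illustrate the idea behind the Python wrapper, here is a minimal sketch of how a draft configure line could be assembled from a device dump. It is not the script shipped in this merge request: the key names (`cuda_home`, `cc_major`, `cc_minor`, `runtime_version`), the flat file layout, and the value formats passed to the three options are all assumptions made for illustration.

```python
# Hypothetical sketch: the YAML keys and the option value formats below are
# assumptions for illustration, not the actual output of device_propc.c.


def parse_flat_yaml(text):
    """Parse simple 'key: value' lines (no nesting) into a dict of strings."""
    props = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        props[key.strip()] = value.strip()
    return props


def draft_configure_line(props):
    """Build a draft ./configure command from the parsed device properties."""
    # Compute capability 7.0 is written as "70" here (assumed format).
    cc = "{}{}".format(props["cc_major"], props["cc_minor"])
    return " ".join([
        "./configure",
        "--with-cuda=" + props["cuda_home"],
        "--with-cuda-cc=" + cc,
        "--with-cuda-runtime=" + props["runtime_version"],
    ])


# Assumed layout of the device dump (hypothetical values).
example = """
cuda_home: /usr/local/cuda
cc_major: 7
cc_minor: 0
runtime_version: 10.1
"""

print(draft_configure_line(parse_flat_yaml(example)))
```

The result is only a draft: the user is expected to review it and add compiler and library options before running it.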
To be done/discussed before merging:
- If/where/how to document the new configure options: for the developers? For the users?
- Align .gitlab-ci.yaml to the standard QEF CI.
Known problems:
What follows is a general problem that becomes particularly relevant when building with the PGI compilers. When the flag `-Mlarge_arrays` is set in `make.inc`, it is not propagated to the libraries generated by `extlibs_makefile`. This is especially problematic for the FoX library, which fails while reading the pseudopotentials for this reason. As a consequence, `-Mlarge_arrays` is not currently set by the configure script, but it will be needed in the future (for large allocations). A way to propagate flags to the build systems of the external libraries must be found.
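Until such a mechanism exists, the mismatch can at least be detected. The following is a minimal diagnostic sketch, not part of this merge request: it reads `NAME = value` assignments from `make.inc`-style text and reports which variables lack a given flag. The variable name `EXTLIB_FFLAGS` is hypothetical and stands for whatever flags end up being handed to an external library build.

```python
import re


def read_make_variables(text):
    """Collect 'NAME = value' assignments from make.inc-style text."""
    variables = {}
    pattern = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*[:?]?=\s*(.*)$")
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            # Strip trailing comments and surrounding whitespace.
            variables[match.group(1)] = match.group(2).split("#", 1)[0].strip()
    return variables


def variables_missing_flag(variables, flag, names):
    """Return the names in `names` whose value does not contain `flag`."""
    return [n for n in names if flag not in variables.get(n, "").split()]


# Hypothetical make.inc fragment; EXTLIB_FFLAGS is an illustrative name.
make_inc = """
FFLAGS = -O1 -Mlarge_arrays
EXTLIB_FFLAGS = -O1
"""

variables = read_make_variables(make_inc)
print(variables_missing_flag(variables, "-Mlarge_arrays",
                             names=("FFLAGS", "EXTLIB_FFLAGS")))
```

A check of this kind only reports the inconsistency; forwarding the flags to the external build systems still requires changes to `extlibs_makefile`.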