Draft: XClib and V on cuda

Fabrizio Ferrari requested to merge fabrizio22/q-e:XClib-Vxc-cuda into develop

The XClib library for exchange-correlation (XC) functionals has been ported to GPU (CUDA Fortran). The XC wrapper routines called in QE have been replaced by interfaces that call the CPU or GPU versions of the driver routines, depending on the attributes of the input variables. For the moment the functional routines have been duplicated so that the CPU and GPU drivers can be used at the same time in QE. If and when the charge density is ported to the device and/or OpenACC is adopted, the GPU duplicates will no longer be needed.
#281 (closed) The main routines related to the potential and energy calculation have been ported too (starting from v_of_rho in PW down to all its dependencies). The dependency tree of the ported routines is shown below.
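
A minimal sketch of the dispatch mechanism described above, assuming a CUDA Fortran compiler (e.g. `nvfortran -cuda`). All names here (`xc_sketch`, `xc_demo`, `xc_demo_cpu`, `xc_demo_gpu`) are hypothetical and are not the actual XClib routines; the formula in the loop body is a placeholder. It only illustrates how a generic interface can resolve to a CPU or GPU specific routine based on the `device` attribute of the actual arguments:

```fortran
! Illustrative only: hypothetical names, placeholder physics.
module xc_sketch
  use cudafor
  implicit none
  integer, parameter :: dp = selected_real_kind(14, 200)

  ! One generic name. The compiler picks the specific routine according to
  ! whether the actual array arguments carry the device attribute.
  interface xc_demo
    module procedure xc_demo_cpu, xc_demo_gpu
  end interface xc_demo

contains

  ! Host version: plain arrays, plain loop.
  subroutine xc_demo_cpu(n, rho, ex)
    integer,  intent(in)  :: n
    real(dp), intent(in)  :: rho(n)
    real(dp), intent(out) :: ex(n)
    integer :: i
    do i = 1, n
      ex(i) = -rho(i)**(1.0_dp/3.0_dp)   ! placeholder formula
    end do
  end subroutine xc_demo_cpu

  ! Device version: same signature except for the device attribute.
  subroutine xc_demo_gpu(n, rho, ex)
    integer,  intent(in)          :: n
    real(dp), device, intent(in)  :: rho(n)
    real(dp), device, intent(out) :: ex(n)
    integer :: i
    !$cuf kernel do(1) <<<*,*>>>
    do i = 1, n
      ex(i) = -rho(i)**(1.0_dp/3.0_dp)   ! same placeholder formula
    end do
  end subroutine xc_demo_gpu

end module xc_sketch

program demo
  use xc_sketch
  implicit none
  real(dp)         :: rho_h(4), ex_h(4)
  real(dp), device :: rho_d(4), ex_d(4)

  rho_h = 1.0_dp
  rho_d = rho_h                  ! host-to-device copy
  call xc_demo(4, rho_h, ex_h)   ! resolves to xc_demo_cpu
  call xc_demo(4, rho_d, ex_d)   ! resolves to xc_demo_gpu
  ex_h = ex_d                    ! device-to-host copy
end program demo
```

With this pattern the calling code keeps a single routine name, and host and device data paths can coexist in the same build.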

In v_of_rho:

  • v_xc_gpu [LDA and GGA energy and potential];
  • v_xc_meta_gpu [meta-GGA energy and potential];
  • v_h_gpu [Hartree term].

In v_xc:

  • xc [from XClib, LDA functionals wrapper];
  • gradcorr_gpu [GGA energy and potential terms]:
    • compute_rho_gpu [noncollinear case, diagonalizes spin density];
    • rho_r2g_gpu [from Modules/fft_rho.f90, charge density in G-space];
      • fftx_threed2oned_gpu [from FFTXlib, 3d to 1d array in Fourier space];
    • fft_gradient_g2r [from Modules/gradutils.f90, gradient calculator];
    • xc_gcx [from XClib, GGA functionals wrapper].

In v_xc_meta:

  • fft_gradient_g2r;
  • xc_metagcx [from XClib, meta-GGA functionals wrapper].

In XClib, each functional routine is defined on the device through the 'device' attribute and is called inside the main CUF kernel loop of the corresponding family driver (qe_lda_driver.f90, qe_gga_driver.f90, ...).
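
As a rough illustration of that layout, here is a sketch of a device routine called from a CUF kernel loop in a driver. It assumes a CUDA Fortran compiler; the names `lda_sketch`, `slater_demo_d` and `lda_driver_demo_gpu` are hypothetical, and the body uses the textbook LDA (Slater) exchange in Hartree units rather than the actual XClib code:

```fortran
! Illustrative only: hypothetical names, not the XClib drivers.
module lda_sketch
  use cudafor
  implicit none
  integer,  parameter :: dp   = selected_real_kind(14, 200)
  real(dp), parameter :: pi34 = 0.6203504908994_dp      ! (3/(4*pi))**(1/3)
  real(dp), parameter :: cx   = 0.4581652932831429_dp   ! (3/4)*(3/(2*pi))**(2/3)

contains

  ! Functional routine compiled for the device: one grid point per call.
  attributes(device) subroutine slater_demo_d(rs, ex, vx)
    real(dp), intent(in)  :: rs
    real(dp), intent(out) :: ex, vx
    ex = -cx / rs                    ! exchange energy per electron
    vx = 4.0_dp / 3.0_dp * ex        ! exchange potential
  end subroutine slater_demo_d

  ! Family driver: a host routine whose loop is turned into a kernel.
  subroutine lda_driver_demo_gpu(n, rho, ex, vx)
    integer,  intent(in)          :: n
    real(dp), device, intent(in)  :: rho(n)
    real(dp), device, intent(out) :: ex(n), vx(n)
    real(dp) :: rs
    integer  :: i
    !$cuf kernel do(1) <<<*,*>>>
    do i = 1, n
      rs = pi34 / rho(i)**(1.0_dp/3.0_dp)   ! Wigner-Seitz radius
      call slater_demo_d(rs, ex(i), vx(i))
    end do
  end subroutine lda_driver_demo_gpu

end module lda_sketch
```

The sketch assumes strictly positive densities; a real driver would also need to guard against small or negative values of `rho` before taking the cube root.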

Some file splitting or reorganization may still follow.

Edited by Fabrizio Ferrari