More SOC on GPU
Description
Move the operations for SOC to GPU to avoid unnecessary CPU-GPU data transfers.
- adapt bra and ket kernels for complex values
- add new kernel for complex mixing
- adapt projector_bra_force_complex kernel to complex matrices
- add a profiling region for the mixing part of the code
- use a single exit point in the X(hamiltonian_elec_base_nlocal_start) and X(hamiltonian_elec_base_nlocal_finish) subroutines
- adapt the SOC test (linking to ScaLAPACK and the AllowCPUonly option are no longer required)
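For reviewers unfamiliar with the mixing step, the following is a minimal host-side sketch of what a complex mixing operation could look like. It assumes that "mixing" means applying a small complex matrix to a vector of complex projections. The function name `mix_projections`, its arguments, and the row-major layout are hypothetical and are not taken from the Octopus sources or from the kernel added in this request.

```c
#include <complex.h>
#include <stddef.h>

/*
 * Hypothetical CPU reference for a complex mixing step (not the Octopus kernel).
 * Computes out[i] = sum_j mix[i*n + j] * proj[j], with mix stored row-major.
 * On a GPU, each output element i could be handled by one work-item.
 */
void mix_projections(size_t n,
                     const double complex *mix,
                     const double complex *proj,
                     double complex *out)
{
    for (size_t i = 0; i < n; ++i) {
        double complex acc = 0.0;
        for (size_t j = 0; j < n; ++j) {
            /* Full complex multiply-accumulate: real and imaginary parts couple. */
            acc += mix[i * n + j] * proj[j];
        }
        out[i] = acc;
    }
}
```

A real-valued version of this loop would drop the imaginary cross terms, which is presumably why a separate complex kernel is needed once the matrices involved are complex.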
Checklist
- I have checked that my code follows the Octopus coding standards
- I have added tests for all the new features added in this request.
Edited by Meisam Tabriz