Performance improvement for DFT+U on GPUs
Description
Performance improvement for DFT+U on GPUs:
- Batch the calculation of the occupation matrices.
- Improve the performance of the DFT+U bra kernel for the complex mesh case.
All the DFT+U tests now run on GPUs thanks to the recent changes (overlapping spheres, SOC).
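For reviewers unfamiliar with the batching idea, here is a minimal NumPy sketch of what batching the occupation matrices means. It is an illustration only, not the Octopus implementation: the function names, the array layout (`proj[i, m]` standing for the projection of state `i` on localized orbital `m`) and the `batch_size` parameter are all assumptions made for this example. It compares a state-by-state accumulation with one that processes a block of states in a single matrix product, which is the kind of operation that maps well to GPUs.

```python
import numpy as np


def occupation_per_state(proj, occ):
    """Reference version: accumulate the occupation matrix one state at a time.

    proj[i, m] is assumed to hold <phi_m | psi_i> (hypothetical layout),
    occ[i] is the occupation of state i.
    """
    n_orb = proj.shape[1]
    n = np.zeros((n_orb, n_orb), dtype=complex)
    for i in range(proj.shape[0]):
        # n[m, m'] += f_i <phi_m|psi_i> <psi_i|phi_m'>
        n += occ[i] * np.outer(proj[i], proj[i].conj())
    return n


def occupation_batched(proj, occ, batch_size=4):
    """Batched version: handle batch_size states with one matrix product."""
    n_orb = proj.shape[1]
    n = np.zeros((n_orb, n_orb), dtype=complex)
    for start in range(0, proj.shape[0], batch_size):
        p = proj[start:start + batch_size]   # (batch, n_orb)
        f = occ[start:start + batch_size]    # (batch,)
        # Same sum over states as above, done for the whole batch at once.
        n += (p * f[:, None]).T @ p.conj()
    return n
```

Both functions return the same matrix; the batched form simply replaces many small rank-one updates with fewer, larger matrix products.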
News snippet
Performance improvement for DFT+U on GPUs.
Checklist
- I have checked that my code follows the Octopus coding standards.
- I have added tests for all the new features added in this request.
Edited by Nicolas Tancogne-Dejean