Performance improvement for DFT+U on GPUs

Nicolas Tancogne-Dejean requested to merge improved_gpu_support_dftu into main

Description

Performance improvement for DFT+U on GPUs:

  • Batch the calculation of the occupation matrices.
  • Improve the performance of the DFT+U bra kernel for the complex mesh case.
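
To illustrate what batching the occupation-matrix calculation can look like, here is a minimal NumPy sketch. It is not the Octopus implementation: the function names, array layout, and `batch_size` parameter are hypothetical, and it assumes the occupation matrix is n_{mm'} = Σ_n f_n ⟨φ_m|ψ_n⟩⟨ψ_n|φ_m'⟩ on a uniform mesh with volume element `dv`. The batched version replaces one projection per state with one matrix product per block of states, which is the kind of operation that maps well onto a GPU.

```python
import numpy as np


def occupation_matrix_per_state(orbitals, states, occupations, dv):
    """Reference version: accumulate one state at a time.

    orbitals:    (norb, npoints) localized orbitals phi_m on the mesh
    states:      (nst, npoints) Kohn-Sham states psi_n on the mesh
    occupations: (nst,) occupation numbers f_n
    dv:          mesh volume element (uniform mesh assumed)
    """
    norb = orbitals.shape[0]
    occ = np.zeros((norb, norb), dtype=complex)
    for psi, f in zip(states, occupations):
        proj = dv * (orbitals.conj() @ psi)  # <phi_m|psi_n> for all m
        occ += f * np.outer(proj, proj.conj())
    return occ


def occupation_matrix_batched(orbitals, states, occupations, dv, batch_size=4):
    """Batched version: project a whole block of states at once."""
    norb = orbitals.shape[0]
    occ = np.zeros((norb, norb), dtype=complex)
    for start in range(0, states.shape[0], batch_size):
        block = states[start:start + batch_size]    # (nb, npoints)
        f = occupations[start:start + batch_size]   # (nb,)
        # One matrix product gives <phi_m|psi_n> for every state in the block.
        proj = dv * (orbitals.conj() @ block.T)     # (norb, nb)
        # Sum over the states of the block, weighted by their occupations.
        occ += (proj * f) @ proj.conj().T
    return occ
```

Both functions return the same Hermitian matrix; only the number and size of the linear-algebra calls differ.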

All DFT+U tests now run on GPUs, thanks to the recent changes (overlapping spheres, spin-orbit coupling).

News snippet

Performance improvement for DFT+U on GPUs.

Checklist

  • I have checked that my code follows the Octopus coding standards.
  • I have added tests for all the new features added in this request.
Edited by Nicolas Tancogne-Dejean