Remove MPI bottlenecks in grid initialization

Alberto Garcia requested to merge garalb/siesta:bsc-bottlenecks into dev

This is based on the work of Rogeli Grima (@rgrima) from BSC:

  • Use an asynchronous scheme for the communications involved in handling the different grid distributions (routines in 'moremeshsubs'). Extra buffers are needed (see the first sketch after this list).

  • In addition, the pre-computation of the communication pattern for the redistribution of orbital-based matrices from block-cyclic to grid-based (code in 'm_dscfcomm') has been changed. It no longer uses graph coloring, which did not scale well for large numbers of processors and orbitals (see the second sketch below).
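
The following is a minimal sketch of the asynchronous exchange idea, written with the MPI C bindings rather than the actual Fortran routines in 'moremeshsubs'. The function name, argument layout, and the assumption of one contiguous block per peer are illustrative only.

```c
/* Hedged sketch of one asynchronous redistribution step.  Names and the
 * per-rank contiguous-block layout are assumptions, not SIESTA's API. */
#include <mpi.h>
#include <stdlib.h>

void redistribute_async(const double *send_buf, const int *send_counts,
                        const int *send_displs, double *recv_buf,
                        const int *recv_counts, const int *recv_displs,
                        MPI_Comm comm)
{
    int nproc;
    MPI_Comm_size(comm, &nproc);

    MPI_Request *reqs = malloc(2 * (size_t)nproc * sizeof(*reqs));
    int nreq = 0;

    /* Post all receives first so incoming messages land directly in their
     * dedicated slices of the receive buffer; these dedicated slices are
     * the "extra buffers" the asynchronous scheme requires. */
    for (int p = 0; p < nproc; ++p)
        if (recv_counts[p] > 0)
            MPI_Irecv(recv_buf + recv_displs[p], recv_counts[p], MPI_DOUBLE,
                      p, 0, comm, &reqs[nreq++]);

    /* Post all sends without waiting for any individual exchange. */
    for (int p = 0; p < nproc; ++p)
        if (send_counts[p] > 0)
            MPI_Isend(send_buf + send_displs[p], send_counts[p], MPI_DOUBLE,
                      p, 0, comm, &reqs[nreq++]);

    /* A single wait on all outstanding requests replaces the paired,
     * blocking exchanges of a synchronous scheme. */
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);
    free(reqs);
}
```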
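A correspondingly hedged sketch of setting up the communication pattern without graph coloring: each rank tallies, for its own block-cyclic rows, how much data goes to each destination under the grid-based distribution, and a single collective exchange of the counts replaces round-by-round pairwise scheduling. The owner() map and all other names are hypothetical and do not reflect the actual 'm_dscfcomm' interfaces.

```c
/* Hedged sketch of precomputing send/receive counts per rank.  The
 * owner() map from global orbital index to grid-distribution rank is a
 * stand-in for whatever mapping the real code uses. */
#include <mpi.h>
#include <stdlib.h>

void build_pattern(int n_local, const int *global_index,
                   int (*owner)(int global_idx),
                   int **send_counts_out, int **recv_counts_out,
                   MPI_Comm comm)
{
    int nproc;
    MPI_Comm_size(comm, &nproc);

    int *send_counts = calloc((size_t)nproc, sizeof(int));
    int *recv_counts = malloc((size_t)nproc * sizeof(int));

    /* Each rank walks only its own block-cyclic rows and tallies how many
     * go to each destination; no global scheduling rounds are needed. */
    for (int i = 0; i < n_local; ++i)
        send_counts[owner(global_index[i])]++;

    /* One collective exchange of counts tells every rank how much it will
     * receive from each peer; the data itself can then move asynchronously,
     * as in the sketch above. */
    MPI_Alltoall(send_counts, 1, MPI_INT, recv_counts, 1, MPI_INT, comm);

    *send_counts_out = send_counts;
    *recv_counts_out = recv_counts;
}
```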

This code has been updated to incorporate the functionality of distributions with variable 'nsm', work by Federico Pedron, who also provided extra documentation.

ToDo:

  • Fix conflicts with master (see commit 1aea8c50)
  • Make sure that the new scheme is applied to all relevant routines (in particular, to MatrixMtoOC)
  • Clarify changes in m_timer.F90 ("ORIG" way of reporting vs new one)
  • Testing
