
Fix band group parallelism in DFPT+U

Jae-Mo Lihm requested to merge jmlihm/q-e:dfpt_bands_fix into develop

DFPT+U with band group parallelization was giving incorrect results. This MR fixes the three causes:

  • mp_sum should be called over intra_bgrp_comm instead of intra_pool_comm when the quantity is not band-distributed
  • use_bgrp_in_hpsi needs to be set to .false. in lr_orthoUwfc (as done in PW/orthoUwfc) because it calls h_psi and s_psi. (Alternatively, lr_orthoUwfc could handle band parallelization itself when computing bec.)
  • There was a wrong normalization factor in zstar_eu_us.f90. The same quantity is computed on all cores and later mp_summed, so one divides by the number of cores. Since the quantity is distributed among bands, the factor should include only n_PW * n_pool, not n_bgrp.
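
The normalization argument in the last bullet can be illustrated with a toy model. This is only a sketch, not QE code: MPI ranks are simulated with plain loops, the function name `summed_over_all_ranks`, the per-band values, and the process grid are all made up, and it assumes that each band group's partial result is replicated on the n_PW * n_pool ranks of that group.

```python
# Toy model only: MPI ranks are simulated with plain loops, nothing here is QE code.

def summed_over_all_ranks(band_values, n_pw, n_pool, n_bgrp):
    """Mimic an mp_sum over every rank of a band-distributed quantity."""
    total = 0.0
    for bgrp in range(n_bgrp):
        # Each band group only accumulates its own slice of the bands.
        partial = sum(band_values[bgrp::n_bgrp])
        # Assumption: that partial result is replicated on the
        # n_pw * n_pool ranks belonging to this band group.
        total += partial * n_pw * n_pool
    return total


band_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up per-band contributions
n_pw, n_pool, n_bgrp = 2, 1, 3                # made-up process grid

exact = sum(band_values)
raw = summed_over_all_ranks(band_values, n_pw, n_pool, n_bgrp)

# Dividing by the replication count only recovers the exact value.
correct = raw / (n_pw * n_pool)
# Dividing by the total number of ranks also divides by n_bgrp, which is wrong
# because the band groups hold different pieces, not copies.
wrong = raw / (n_pw * n_pool * n_bgrp)
```

In this model `correct` equals `exact`, while `wrong` is too small by a factor of n_bgrp, which is the kind of error the bullet describes.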

Benchmark

Tested using test_suite/ph_U_insulator_PAW, BN.phG.in

develop, -np 1 -nb 1


 freq (    1) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    2) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    3) =       1.826946 [THz] =      60.940368 [cm-1]
 freq (    4) =      25.447537 [THz] =     848.838470 [cm-1]
 freq (    5) =      40.996606 [THz] =    1367.499592 [cm-1]
 freq (    6) =      40.996606 [THz] =    1367.499592 [cm-1]

This MR, -np 1 -nb 1


 freq (    1) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    2) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    3) =       1.826946 [THz] =      60.940368 [cm-1]
 freq (    4) =      25.447537 [THz] =     848.838470 [cm-1]
 freq (    5) =      40.996606 [THz] =    1367.499592 [cm-1]
 freq (    6) =      40.996606 [THz] =    1367.499592 [cm-1]

develop, -np 3 -nb 3 (INCORRECT)


 freq (    1) =      14.810254 [THz] =     494.016888 [cm-1]
 freq (    2) =      14.810254 [THz] =     494.016888 [cm-1]
 freq (    3) =      20.489036 [THz] =     683.440683 [cm-1]
 freq (    4) =      50.867129 [THz] =    1696.744792 [cm-1]
 freq (    5) =      54.340698 [THz] =    1812.610592 [cm-1]
 freq (    6) =      54.340698 [THz] =    1812.610592 [cm-1]

This MR, -np 3 -nb 3


 freq (    1) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    2) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    3) =       1.826946 [THz] =      60.940368 [cm-1]
 freq (    4) =      25.447537 [THz] =     848.838470 [cm-1]
 freq (    5) =      40.996606 [THz] =    1367.499592 [cm-1]
 freq (    6) =      40.996606 [THz] =    1367.499592 [cm-1]
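
A comparison like the one above can also be scripted. The following is a minimal sketch, assuming the frequency lines keep the format shown in this benchmark; the helper names `parse_freqs` and `max_abs_diff` are hypothetical and not part of the test suite. The two embedded samples are copied from the "-np 3 -nb 3" runs above.

```python
import re

# Matches lines such as: " freq (    1) =       1.503409 [THz] =      50.148314 [cm-1]"
FREQ_RE = re.compile(
    r"freq\s*\(\s*(\d+)\)\s*=\s*([-\d.]+)\s*\[THz\]\s*=\s*([-\d.]+)\s*\[cm-1\]"
)


def parse_freqs(text):
    """Return {mode index: frequency in cm-1} for every freq line in text."""
    return {int(m.group(1)): float(m.group(3)) for m in FREQ_RE.finditer(text)}


def max_abs_diff(a, b):
    """Largest absolute difference in cm-1 over the modes present in a."""
    return max(abs(a[i] - b[i]) for i in a)


develop_np3_nb3 = """
 freq (    1) =      14.810254 [THz] =     494.016888 [cm-1]
 freq (    2) =      14.810254 [THz] =     494.016888 [cm-1]
 freq (    3) =      20.489036 [THz] =     683.440683 [cm-1]
 freq (    4) =      50.867129 [THz] =    1696.744792 [cm-1]
 freq (    5) =      54.340698 [THz] =    1812.610592 [cm-1]
 freq (    6) =      54.340698 [THz] =    1812.610592 [cm-1]
"""

this_mr_np3_nb3 = """
 freq (    1) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    2) =       1.503409 [THz] =      50.148314 [cm-1]
 freq (    3) =       1.826946 [THz] =      60.940368 [cm-1]
 freq (    4) =      25.447537 [THz] =     848.838470 [cm-1]
 freq (    5) =      40.996606 [THz] =    1367.499592 [cm-1]
 freq (    6) =      40.996606 [THz] =    1367.499592 [cm-1]
"""

broken = parse_freqs(develop_np3_nb3)
fixed = parse_freqs(this_mr_np3_nb3)
worst = max_abs_diff(fixed, broken)  # largest per-mode deviation in cm-1
```

Comparing in cm-1 and reporting the largest per-mode deviation keeps the check to a single number; a tolerance for "equal" would have to be chosen by whoever runs it.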

Edited by Jae-Mo Lihm
