Properly set sub group size for AMD RDNA(2) devices in SYCL kernels
In src/gromacs/nbnxm/sycl/nbnxm_sycl_kernel.cpp, we always set subGroupSize to 64 in hipSYCL builds for AMD, but newer RDNA and RDNA2 devices support 32-wide execution.
The following discussion from !1221 (merged) should be addressed:
- @al42and started a discussion: HIP docs say they always return warpSize = 64 for AMD devices, but the code seems to indicate that the global warpSize constant is properly set for newer hardware. I would expect using warpSize here to behave sanely with explicit multipass in hipSYCL, but that's an open question. That said, hipSYCL currently ignores the reqd_sub_group_size attribute, so we don't have to care much about the value of this variable.
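
One possible direction, as a rough sketch only: choose the width from the sizes the device reports instead of hard-coding 64 for every AMD device. The helper name chooseSubGroupSize and its parameters are hypothetical and not part of GROMACS. The list of supported sizes is assumed to come from a device query such as sycl::info::device::sub_group_sizes, and whether hipSYCL reports useful values there for RDNA hardware is itself an assumption that would need checking.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not GROMACS code): pick a sub-group size from the sizes
// a device reports, rather than assuming 64 for all AMD devices.
// `supported` is assumed to come from a device query (for example
// sycl::info::device::sub_group_sizes); `preferred` is the width the kernel
// would like to use; `fallback` is used when the device reports nothing.
inline std::size_t chooseSubGroupSize(const std::vector<std::size_t>& supported,
                                      std::size_t                     preferred,
                                      std::size_t                     fallback)
{
    assert(fallback > 0);
    // Nothing reported: keep the previous hard-coded behaviour.
    if (supported.empty())
    {
        return fallback;
    }
    // Use the preferred width when the device offers it.
    if (std::find(supported.begin(), supported.end(), preferred) != supported.end())
    {
        return preferred;
    }
    // Otherwise fall back to the widest size the device reports.
    return *std::max_element(supported.begin(), supported.end());
}
```

With this sketch, a device reporting {32, 64} and a preferred width of 32 would get 32, while a device reporting only {64} would keep 64. Since reqd_sub_group_size takes a compile-time constant, a runtime result like this could only select between kernel instantiations compiled for each width; and as noted above, it has no practical effect while hipSYCL ignores the attribute.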
Parent: #3934
Edited by Andrey Alekseenko