SIMD version of the free-energy kernel - Redmine #2875
The slow, serial free-energy non-bonded kernel is a serious bottleneck in free-energy simulations. Using SIMD should give a significant speed-up of this compute-intensive kernel. Since the nbnxm non-bonded scheme used for the normal non-bonded interactions is not very beneficial here, the plan is to use simple vectorization over the j-particles. This leads to many SIMD gather loads and a few scattered force writes, but hopefully most of that cost can be hidden behind arithmetic. The kernel function should be templated on the real/SIMD type, so that a single code path serves both plain-C and SIMD.
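As a rough illustration of the plan, here is a minimal sketch of a kernel templated on lane width, where `Width == 1` is the scalar path and larger widths mimic vectorization over the j-particles. Everything in it is an assumption made for illustration: `Lanes`, `gather`, `scatterDecrement` and `toyKernel` are hypothetical names, the "SIMD" lanes are emulated with `std::array` rather than the GROMACS SIMD layer, and the 1D softened Coulomb-like pair term stands in for the real free-energy interaction.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy stand-in for a SIMD register: Width lanes in a std::array.
// Width == 1 gives the plain scalar path. (Hypothetical, not gmx::SimdReal.)
template<std::size_t Width>
using Lanes = std::array<double, Width>;

// Gather: load one value per lane from arbitrary, non-contiguous indices.
template<std::size_t Width>
Lanes<Width> gather(const std::vector<double>& src, const int* idx)
{
    Lanes<Width> out{};
    for (std::size_t lane = 0; lane < Width; lane++)
    {
        out[lane] = src[idx[lane]];
    }
    return out;
}

// Scatter: subtract one value per lane at arbitrary indices.
template<std::size_t Width>
void scatterDecrement(std::vector<double>& dst, const int* idx, const Lanes<Width>& val)
{
    for (std::size_t lane = 0; lane < Width; lane++)
    {
        dst[idx[lane]] -= val[lane];
    }
}

// One i-particle against its j-list, processing Width j-particles per iteration.
template<std::size_t Width>
void toyKernel(int                        i,
               const std::vector<int>&    jList,
               const std::vector<double>& x,
               const std::vector<double>& q,
               double                     softening,
               std::vector<double>&       f)
{
    // Assumption of this sketch: the j-list is already padded to a multiple of Width.
    assert(jList.size() % Width == 0);

    Lanes<Width> fi{}; // per-lane accumulator for the force on particle i
    for (std::size_t k = 0; k < jList.size(); k += Width)
    {
        const int*         idx = jList.data() + k;
        const Lanes<Width> xj  = gather<Width>(x, idx);
        const Lanes<Width> qj  = gather<Width>(q, idx);
        Lanes<Width>       fij{};
        for (std::size_t lane = 0; lane < Width; lane++)
        {
            const double dx = x[i] - xj[lane];
            const double r2 = dx * dx + softening; // softened squared distance
            fij[lane]       = q[i] * qj[lane] * dx / (r2 * std::sqrt(r2));
            fi[lane] += fij[lane];
        }
        scatterDecrement<Width>(f, idx, fij); // Newton's third law on the j-particles
    }
    for (double lane : fi)
    {
        f[i] += lane; // reduce the lanes into the i-particle force
    }
}
```

Calling `toyKernel<1>` and `toyKernel<4>` on the same inputs should give the same forces up to rounding, which is the property a single templated code path is meant to provide. A real implementation would also have to handle j-lists that are not a multiple of the SIMD width, for example by masking, which this sketch sidesteps with the padding assumption.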
(from redmine: issue id 2875, created on 2019-03-02 by berkhess)
- Relations:
  - parent #742 (closed)
Remaining (post-merge) TODO:
- evaluate/test on multiple platforms (including x86 not covered in testing)
- kernel code cleanup: rename the for-loop variable "s" to "i" in the simdRealWidth loops (and possibly rename variables in other for loops) to avoid names that are neither clearly just a loop variable nor descriptive.
- [ ] kernel code cleanup: possibly change from gmx_unused to [[maybe_unused]].
- evaluate whether we should force loop unrolling for N_STATES