Backport missing synchronization calls in CUDA version of LINCS
The following discussion from !2009 (merged) should be addressed:
- @artemzhmurov started a discussion: Revealed by !1990 (merged). Affects previous releases as well as master. Observed deviations caused by this bug are not significant, which is why the code still passes all the previous tests.
This should clearly be fixed in release-2021.
If garbage could end up in the virial, we should fix release-2020 also.