Index the boundary links for the smearing process
Summary
The boundary links needed for the smearing process need to be identified and indexed, similarly to how OpenQCD treats its boundary links. It makes sense to mimic how this is done in the `uidx()` function, which computes the offsets of the positive links.
Implementation details
Smearing has a slightly different ordering scheme for type 2 links: we need the $\pm$ type 2 links from the neighbouring site of every odd point on the boundary, instead of the positive links from both even and odd points on the boundary. Because of this difference I have decided to pad the gauge field with two times the boundary storage to keep these two sets independent. This is somewhat wasteful, but the extra communication would be needed regardless, so communicating more information is not necessarily a big cost. The additional padding is organised as follows:
- `BNDRY / 4`: type 1 links on faces +0, +1, +2, +3
- `BNDRY / 4`: type 1 links on faces -0, -1, -2, -3
- `3 * BNDRY / 2`: type 2 links on faces +0, +1, +2, +3
- `3 * BNDRY / 4`: type 2 links on faces -0, -1, -2, -3
The type 2 boundary links are organised as follows:
For every odd point $x$ on the $\nu$ boundary, $U(x+\nu, \mu)$ is stored at position `6*ix + 2*k`, where `k` is a counter over $[0,...,3] \setminus \nu$. For the same odd point $x$ on the $\nu$ boundary, $U(x+\nu, -\mu)$ is stored at position `6*ix + 2*k + 1`.
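The position rule above can be sketched in C as follows. This is only an illustration and not the final `smeared_idx()`: the helper names `dir_counter` and `type2_offset` are hypothetical. It assumes that `k` counts the directions other than $\nu$ in increasing order, and that `ix` numbers the odd points within one face's type 2 block starting from zero.

```c
#include <assert.h>

/* Hypothetical helper: rank of mu among the directions {0,1,2,3} with nu
   removed. Assumption: the counter k runs over them in increasing order. */
int dir_counter(int nu, int mu)
{
    assert(nu >= 0 && nu < 4);
    assert(mu >= 0 && mu < 4);
    assert(mu != nu);

    return (mu < nu) ? mu : mu - 1;
}

/* Hypothetical helper: position of U(x+nu, +mu) (sign == +1) or
   U(x+nu, -mu) (sign == -1) for the odd boundary point with index ix.
   Assumption: the result is relative to the start of the type 2 block
   of the nu face, so the block's base offset still has to be added. */
int type2_offset(int ix, int nu, int mu, int sign)
{
    assert(ix >= 0);
    assert(sign == 1 || sign == -1);

    /* six slots per odd point: three directions, each as a +/- pair */
    return 6 * ix + 2 * dir_counter(nu, mu) + ((sign < 0) ? 1 : 0);
}
```

Under these assumptions, the point with `ix = 2` on the $\nu = 1$ boundary occupies positions 12 to 17, holding the links for $\mu = 0, 2, 3$ as consecutive $+\mu$, $-\mu$ pairs.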
Remaining tasks
- implement `smeared_idx()`, which returns a pointer to a `uidx_t[8]` array
- test the implementation
Alternate solutions
Instead of communicating links on both the positive and negative faces, it is possible to compute every plaquette's contribution to the $\Omega_{\mu}$ matrix and communicate these contributions afterwards, similar to how $F_{\mu\nu}$ is computed by OpenQCD.
Pros
- do not need an additional communication layer
- could possibly make the smearing loop more optimised by computing them similarly to how $F_{\mu\nu}$ is computed
Cons
- the $\Omega$ matrices aren't actually needed anywhere else, so in contrast to $F_{\mu\nu}$ the memory will work as a very large additional workspace
- it will complicate the unsmearing of the fermionic force, as this also requires the boundary terms
- with a padded boundary we will possibly not require any additional communications for the unsmearing routine