Fix vectorized Jacobi Rotation
What does this implement/fix?
There seems to be a bug in the apply_rotation_in_the_plane_selector, so the packet math vectorized version is never used. (Modern clang and gcc seem to vectorize the default version pretty well FWIW.)
This also makes some fixes to get the "fixed-size" code path to pass the test suite.
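For context, the default (non-packet) path is essentially a scalar loop of the following shape. This is a minimal sketch for real scalars only, assuming the convention x ← c·x + s·y, y ← −s·x + c·y; the helper name apply_plane_rotation and the use of std::vector are illustrative assumptions, not the library's API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not Eigen's API): rotates each pair (x[i], y[i])
// in place by the plane rotation with cosine c and sine s.
inline void apply_plane_rotation(std::vector<double>& x,
                                 std::vector<double>& y,
                                 double c, double s) {
  // Only the common prefix of the two vectors is rotated.
  const std::size_t n = x.size() < y.size() ? x.size() : y.size();
  for (std::size_t i = 0; i < n; ++i) {
    // Read both inputs before writing, since x[i] feeds into y[i].
    const double xi = x[i];
    const double yi = y[i];
    x[i] = c * xi + s * yi;
    y[i] = -s * xi + c * yi;
  }
}
```

A loop like this has no cross-iteration dependencies, which is likely why recent compilers can auto-vectorize it without explicit packet code.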
Additional information
Just for reference, this seems to be the reason that the packet-math version isn't being used atm:
const bool Vectorizable = (int(VectorX::Flags) & int(VectorY::Flags) & PacketAccessBit) ...
Vectorizable is always false because VectorX and VectorY are block expressions, which seem to not set the PacketAccessBit at all. Doing something like checking the Flags of the block evaluator instead seems to work better:
const bool Vectorizable = (int(evaluator<VectorX>::Flags) & int(evaluator<VectorY>::Flags) & PacketAccessBit) ...
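The difference between the two checks can be illustrated with a toy model. This is only a sketch: everything in the toy namespace (DenseVector, BlockExpr, the simplified evaluator, the helper functions, and the numeric value of the flag bit) is a hypothetical stand-in and not the library's actual definitions. It only mirrors the situation described above, where the block expression type leaves the bit unset while its evaluator reports it.

```cpp
namespace toy {

// Hypothetical flag bit; the numeric value is arbitrary in this sketch.
constexpr unsigned PacketAccessBit = 0x8;

// Stand-in for a plain dense vector: the expression advertises packet access.
struct DenseVector {
  static constexpr unsigned Flags = PacketAccessBit;
};

// Stand-in for a block (view) expression: the expression type itself
// does not set the packet-access bit.
template <typename Xpr>
struct BlockExpr {
  static constexpr unsigned Flags = 0;
};

// Generic evaluator: forwards the flags of the expression it wraps.
template <typename Xpr>
struct evaluator {
  static constexpr unsigned Flags = Xpr::Flags;
};

// Evaluator for blocks: packet access is derived from the nested expression.
template <typename Xpr>
struct evaluator<BlockExpr<Xpr>> {
  static constexpr unsigned Flags = evaluator<Xpr>::Flags & PacketAccessBit;
};

// Check in the style of the original condition (expression flags).
template <typename X, typename Y>
constexpr bool vectorizable_from_expression() {
  return (int(X::Flags) & int(Y::Flags) & int(PacketAccessBit)) != 0;
}

// Check in the style of the proposed condition (evaluator flags).
template <typename X, typename Y>
constexpr bool vectorizable_from_evaluator() {
  return (int(evaluator<X>::Flags) & int(evaluator<Y>::Flags) &
          int(PacketAccessBit)) != 0;
}

using Row = BlockExpr<DenseVector>;

// With block arguments, only the evaluator-based check selects the packet path.
static_assert(!vectorizable_from_expression<Row, Row>(),
              "expression flags never enable the packet path for blocks");
static_assert(vectorizable_from_evaluator<Row, Row>(),
              "evaluator flags enable the packet path for blocks");

}  // namespace toy
```

Under these assumptions, the expression-based check is false for every pair of blocks, so a selector keyed on it would always fall back to the scalar path.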
Edited by Arthur