Vectorization of Matrix*Vector with aligned matrix and unaligned vector
@chhtz
Submitted by Christoph Hertzberg
Assigned to Nobody
Link to original bugzilla bug (#359)
Version: 3.0
Platform: x86 - SSE
Description
The following code vectorizes very well for even DIM but not for odd DIM, even though the vector is only accessed via movddup, which does not require alignment.
#include <Eigen/Core>

enum { DIM = 4 }; // does not vectorize for odd values

void mult_test(const Eigen::Matrix<double, 2, DIM> &a, Eigen::Matrix<double, DIM, 1> &v)
{
    Eigen::Vector2d temp;
    EIGEN_ASM_COMMENT("mult_test");
    temp = a * v;                // (2 x DIM) * (DIM x 1)
    EIGEN_ASM_COMMENT("mult-transposed_test");
    v = a.transpose() * temp;    // (DIM x 2) * (2 x 1)
    EIGEN_ASM_COMMENT("end of mult_test");
}
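To illustrate why the alignment of v should not matter here, the following is a minimal sketch of the kind of kernel one would expect for the first product. It is not Eigen's actual implementation; the function name mat2xN_times_vec is hypothetical, and it assumes a column-major 2 x dim matrix of doubles (Eigen's default storage order), so that every column is exactly one 16-byte packet regardless of whether dim is even or odd. The vector is only read one scalar at a time and broadcast, which compilers typically lower to movddup when SSE3 is enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper (not part of Eigen): out = A * v for a column-major
// 2 x dim matrix A. `a` must be 16-byte aligned; `v` and `out` may be unaligned.
inline void mat2xN_times_vec(const double* a, const double* v,
                             std::size_t dim, double* out)
{
    assert(reinterpret_cast<std::uintptr_t>(a) % 16 == 0);
#if defined(__SSE2__)
    __m128d acc = _mm_setzero_pd();
    for (std::size_t j = 0; j < dim; ++j) {
        __m128d col = _mm_load_pd(a + 2 * j); // aligned load of column j
        __m128d vj  = _mm_set1_pd(v[j]);      // scalar broadcast, no alignment needed
        acc = _mm_add_pd(acc, _mm_mul_pd(col, vj));
    }
    _mm_storeu_pd(out, acc);
#else
    // Scalar fallback for targets without SSE2.
    out[0] = 0.0;
    out[1] = 0.0;
    for (std::size_t j = 0; j < dim; ++j) {
        out[0] += a[2 * j]     * v[j];
        out[1] += a[2 * j + 1] * v[j];
    }
#endif
}
```

Nothing in the loop depends on dim being even, which is why one would expect odd DIM to vectorize just as well.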
Furthermore, in the transposed product, the following lines:
haddpd %xmm1, %xmm1
haddpd %xmm0, %xmm0
movlpd %xmm1, -24(%ebp)
movlpd %xmm0, -16(%ebp)
(some independent code)
movapd -24(%ebp), %xmm0
could be replaced by a single instruction:
haddpd %xmm1, %xmm0
or possibly with the operands swapped; I sometimes mix up the operand order. The point is that haddpd can compute two horizontal sums at the same time and store them packed in a single register:
http://www.rz.uni-karlsruhe.de/rz/docs/VTune/reference/HADDPD--Packed_Double-FP_Horizontal_Add.htm
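On the operand order: haddpd puts the horizontal sum of its destination operand in the low half of the result and the sum of its source operand in the high half, and in AT&T syntax the destination is written last. The listing above stores the sum of xmm1 at the lower address, so the equivalent single instruction would, as far as I can tell, be haddpd %xmm0, %xmm1 with the result left in xmm1. The sketch below shows the same thing with the _mm_hadd_pd intrinsic. The helper name hsum_pair is hypothetical, and a scalar fallback is used when SSE3 is not enabled.

```cpp
#include <cassert>
#if defined(__SSE3__)
#include <pmmintrin.h>
#endif

// Hypothetical helper: writes {a[0] + a[1], b[0] + b[1]} to out[0..1].
// With SSE3 this is a single haddpd instead of two haddpd plus two movlpd.
inline void hsum_pair(const double* a, const double* b, double* out)
{
    assert(a && b && out);
#if defined(__SSE3__)
    __m128d va = _mm_loadu_pd(a);
    __m128d vb = _mm_loadu_pd(b);
    // Low lane: sum of the first argument; high lane: sum of the second.
    _mm_storeu_pd(out, _mm_hadd_pd(va, vb));
#else
    // Scalar fallback with the same lane layout.
    out[0] = a[0] + a[1];
    out[1] = b[0] + b[1];
#endif
}
```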