Infer dimensions of Dynamic-sized temporaries from the entire expression (if possible)
Submitted by Emily
Assigned to Nobody
Link to original bugzilla bug (#1001)
Version: 3.4 (development)
Description
I'm calculating "scalar = vector1.transpose() * matrix * vector2". This is the innermost, time-critical part of my code, and the expression is evaluated using a temporary matrix on the heap. Vectors 1 and 2 have compile-time fixed sizes, and the matrix is a block of a dynamically allocated matrix. There should be no need to allocate a temporary on the heap in this case. The heap allocation accounts for over 50% of my total runtime.
I would very much like this to be optimized so that no temporary is allocated on the heap.
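To make the claim about the temporary concrete, the same computation can be written by hand without any library. This is only a sketch: the helper name `bilinear4` and the column-major `std::vector` storage are assumptions made for illustration, not part of Eigen or of the code in this report. It shows that the intermediate result of matrix * vector2 is just four doubles, which fit on the stack.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, independent of Eigen: computes v1^T * M * v2, where M is
// the 4x4 window with top-left corner (row, col) of a column-major matrix that
// has `rows` rows and is stored contiguously in `data`.
inline double bilinear4(const std::vector<double>& data, std::size_t rows,
                        std::size_t row, std::size_t col,
                        const std::array<double, 4>& v1,
                        const std::array<double, 4>& v2)
{
    assert(rows > 0 && data.size() % rows == 0);
    assert(row + 4 <= rows && col + 4 <= data.size() / rows);

    // Step 1: tmp = M * v2. The temporary is four doubles on the stack.
    std::array<double, 4> tmp{};
    for (std::size_t j = 0; j < 4; ++j)
        for (std::size_t i = 0; i < 4; ++i)
            tmp[i] += data[(col + j) * rows + (row + i)] * v2[j];

    // Step 2: scalar = v1 . tmp
    double result = 0.0;
    for (std::size_t i = 0; i < 4; ++i)
        result += v1[i] * tmp[i];
    return result;
}
```

Because both vector lengths are known at compile time, the size of `tmp` does not depend on the runtime dimensions of the window.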
The following code illustrates my use case:
// set_is_malloc_allowed() requires EIGEN_RUNTIME_NO_MALLOC to be defined before including Eigen.
Eigen::internal::set_is_malloc_allowed(true);
Eigen::ArrayXXd data(1000, 1000);
data.setRandom();
Eigen::internal::set_is_malloc_allowed(false);
Eigen::Ref<const Eigen::Array<double, Eigen::Dynamic, Eigen::Dynamic>> m_coeff(data.block(30,30,4, 4));
double x0 = 0.5;
double x1 = 0.5;
Eigen::Vector4d X0;
X0[0] = x0*(4 - 3 * x0) - 1;
X0[1] = x0*(9 * x0 - 10);
X0[2] = x0*(8 - 9 * x0) + 1;
X0[3] = x0*(3 * x0 - 2);
Eigen::Vector4d X1;
X1[0] = x1*((2 - x1)*x1 - 1);
X1[1] = x1*x1*(3 * x1 - 5) + 2;
X1[2] = x1*((4 - 3 * x1)*x1 + 1);
X1[3] = x1*x1*(x1 - 1);
// Will trigger assert as transpose() is evaluated to a temporary on the heap.
double ans = double(X0.transpose() * (m_coeff.matrix() * X1)) / 4.0;
Edited by Eigen Bugzilla