Performance hit in 3.3rc1
Submitted by Avi Ginsburg
Assigned to Nobody
Link to original bugzilla bug (#1317)
Version: 3.3 (current stable)
Operating system: Windows
Description
The MCVE below runs about 2.7x slower with 3.3rc1 than with 3.2.9 (733 ms vs. 275 ms).
Tested setup: MSVC 2013, 64-bit (32-bit performance is identical between versions), without AVX to be fair, both with and without SSE. I have not yet tested on Ubuntu/gcc (it is on my to-do list).
MCVE:
#include <iostream>
#include <chrono>
// Can be on or off
//#define EIGEN_DONT_VECTORIZE
#ifdef _MSC_VER
#define OLDVER
#ifdef OLDVER
#include <Eigen3.2.9/Eigen/Core>
#else
#include <Eigen3.3rc1/Eigen/Core>
#endif
#else
#include <Eigen/Core>
#endif
int main()
{
    int len = 16 * 512;
    Eigen::Matrix3Xf l; l.setRandom(3, len);
    float res = 0;
    std::cout << "Hello Eigen\t";
    std::cout << EIGEN_WORLD_VERSION << "."
              << EIGEN_MAJOR_VERSION << "."
              << EIGEN_MINOR_VERSION << "\n";
    std::cout << Eigen::SimdInstructionSetsInUse() << "\n";
    auto t1 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < len; i++)
    {
        res += (l.leftCols(i).colwise() - l.col(i)).colwise().norm().eval().sum();
    }
    auto t2 = std::chrono::high_resolution_clock::now();
    auto t = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();
    std::cout << "timing: " << t << " ms\t" << res << std::endl;
    return 0;
}
Outputs:
Hello Eigen 3.2.9
SSE, SSE2
timing: 275 ms 4.44498e+007
Hello Eigen 3.2.94
SSE, SSE2
timing: 733 ms 4.44498e+007
Just a note: with 3.2.9, the expression with eval() is faster than the same expression without it. With 3.3rc1, the expression without eval() is faster than 3.2.9 with eval().
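To compare the variants with and without eval() under identical conditions, a small timing wrapper may help. This is only a sketch: `time_ms` is a hypothetical helper name, not part of Eigen or the MCVE, and it uses `std::chrono::steady_clock` rather than `high_resolution_clock` so that the measured interval cannot go backwards.

```cpp
#include <chrono>

// Hypothetical helper (not part of Eigen): runs f once and returns the
// elapsed wall-clock time in whole milliseconds.
template <typename F>
long long time_ms(F&& f)
{
    const auto t1 = std::chrono::steady_clock::now();
    f();
    const auto t2 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();
}

// Possible usage with the MCVE's loop body (assumes `l`, `len` and `res`
// from the MCVE are in scope):
//
//   auto with_eval = time_ms([&] {
//       for (int i = 0; i < len; i++)
//           res += (l.leftCols(i).colwise() - l.col(i)).colwise().norm().eval().sum();
//   });
//   auto without_eval = time_ms([&] {
//       for (int i = 0; i < len; i++)
//           res += (l.leftCols(i).colwise() - l.col(i)).colwise().norm().sum();
//   });
```

Printing `res` afterwards, as the MCVE does, keeps the compiler from discarding the loops.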
The main issue is the subexpression (l.leftCols(i).colwise() - l.col(i)).colwise(). Whatever appears to its right matters less, as long as eval is used.
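For reference, here is what the loop accumulates, written without Eigen: for each column i, the sum of Euclidean distances from column i to every earlier column. This is a sketch for sanity-checking `res` and as a rough baseline, not a proposed fix. The function name is hypothetical, and the flat column-major `std::vector<float>` layout is an assumption chosen to mirror Eigen's default storage order.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation. `pts` holds a 3 x n matrix in
// column-major order, so column j occupies pts[3*j], pts[3*j+1], pts[3*j+2].
float sum_of_distances_to_earlier_columns(const std::vector<float>& pts)
{
    assert(pts.size() % 3 == 0);
    const std::size_t n = pts.size() / 3;
    float res = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
    {
        // Corresponds to (l.leftCols(i).colwise() - l.col(i)).colwise().norm().sum()
        float partial = 0.0f;
        for (std::size_t j = 0; j < i; ++j)
        {
            const float dx = pts[3 * j]     - pts[3 * i];
            const float dy = pts[3 * j + 1] - pts[3 * i + 1];
            const float dz = pts[3 * j + 2] - pts[3 * i + 2];
            partial += std::sqrt(dx * dx + dy * dy + dz * dz);
        }
        res += partial;
    }
    return res;
}
```

Because floating-point summation order differs from Eigen's, the result may not match the Eigen value to the last digit.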