Massive speed degradation from 3.3.7 to 3.4.0
Summary
The execution speed of some basic Eigen functionality degraded significantly from 3.3.7 to 3.4.0.
Environment
- Operating System : Windows
- Architecture : x64
- Eigen Version : 3.3.7, 3.4.0, master
- Compiler Version :
  - Microsoft (R) C/C++ Optimizing Compiler Version 19.37.32825 for x64
  - g++ (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0 (WSL 2)
- Compile Flags : /O2 (MSVC), -O3 (G++)
- Vector Extension : SSE/AVX, but not explicitly specified.
Minimal Example
https://gitlab.com/lbenner/eigenspeedtest
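static void BM_EigenCheck(benchmark::State& state)
{
    double res = 0.0;
    Eigen::Matrix3d A;
    // Random start value so the matrix is not a compile-time constant.
    double var = (std::rand() % 1000) / 10.0;
    for (auto _ : state)
    {
        A << 1.0, 2.0, 3.0,
             4.0, var, 6.0,
             7.0, 8.0, 9.0;
        Eigen::Vector3d v(1.0, 2.0, 3.0);
        Eigen::Vector3d x = A * v;              // matrix-vector product
        Eigen::Vector3d y = A.transpose() * v;  // transposed product
        Eigen::Vector3d d = x - y;
        var = A(1,1) + 0.00001;                 // feeds the next iteration
        double r = d.norm();
        A.row(1) -= r * v;                      // in-place row update
        res = A(1, 1);
    }
    // `results` is declared outside this snippet; storing the value keeps it observable.
    results[1] = res;
}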
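To help separate library overhead from compiler effects, the arithmetic of one loop iteration can also be written without Eigen. The following is only a sketch for comparison and is not part of the linked repository; the names `Vec3`, `Mat3` and `eigenCheckStepReference` are made up for illustration, and rounding may differ slightly from Eigen's `norm()`.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper types, not taken from the repository.
using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major: m[row][col]

// Plain C++ equivalent of one BM_EigenCheck iteration (illustrative only).
// Returns A(1,1) after the row update, i.e. the value stored in `res`.
double eigenCheckStepReference(double var) {
    Mat3 A = {{{1.0, 2.0, 3.0}, {4.0, var, 6.0}, {7.0, 8.0, 9.0}}};
    const Vec3 v = {1.0, 2.0, 3.0};

    double squaredNorm = 0.0;
    for (std::size_t i = 0; i < 3; ++i) {
        double x = 0.0;  // (A * v)[i]
        double y = 0.0;  // (A^T * v)[i]
        for (std::size_t j = 0; j < 3; ++j) {
            x += A[i][j] * v[j];
            y += A[j][i] * v[j];
        }
        squaredNorm += (x - y) * (x - y);
    }
    const double r = std::sqrt(squaredNorm);  // d.norm()

    // A.row(1) -= r * v
    for (std::size_t j = 0; j < 3; ++j) {
        A[1][j] -= r * v[j];
    }
    return A[1][1];
}
```

Timing this function in the same harness would give a rough baseline for what each compiler can do with the bare arithmetic.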
Steps to reproduce
- git clone https://gitlab.com/lbenner/eigenspeedtest.git
- cd eigenspeedtest
- git submodule update --init --recursive
- mkdir build && cd build
- cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ..
- cmake --build .
- EigenSpeed
By running
git checkout <eigen branch>
inside the Eigen directory, one can switch between the different Eigen versions.
What is the current bug behavior?
With Visual Studio 2022, the execution time of the BM_EigenCheck benchmark increased by roughly an order of magnitude between Eigen 3.3.7 and 3.4.0, while the G++ results stayed essentially unchanged:
| Version | VS 2022 | G++ 12.3 |
|---------|----------|-----------|
| 3.3.7 | 2.94 ns | 20.0 ns |
| 3.4.0 | 24.9 ns | 20.3 ns |
| master | 26.1 ns | 20.4 ns |
The Eigen version was the only difference; all other settings remained unchanged.
What is the expected correct behavior?
For the provided benchmark, the performance of Eigen 3.4.0 and later should be as good as that of 3.3.7. (We also tested other 3.3.x versions, which all perform comparably.)
Of course, it would also be good if the G++ results were as fast as the Visual Studio results with Eigen 3.3.7.
Relevant logs
No logs attached.
Benchmark scripts and results
Code is provided in the linked GitLab repo. The focus is on the BM_EigenCheck benchmark. G++ might be too clever for the other benchmark (BM_FillMatrix3d) and appears to have optimized it away entirely, and its Visual Studio result did not change between the different versions.
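The 0.000 ns readings suggest dead-code elimination. One common way to guard against that, sketched here in plain C++, is to publish a result through a `volatile` variable so the write counts as an observable side effect. The function `fillMatrixObserved` and the global `g_sink` are hypothetical and not taken from the repository, and the fill pattern is an assumption since the body of BM_FillMatrix3d is not shown here. Google Benchmark also provides `benchmark::DoNotOptimize` for the same purpose.

```cpp
#include <cstddef>

// Hypothetical sink, not from the repository: writes to a volatile object
// are observable side effects, so the compiler has to perform each one.
volatile double g_sink = 0.0;

// Fills a 3x3 array `iterations` times and publishes one element per pass.
// The compiler may still simplify the work feeding the sink, but it cannot
// drop the loop entirely.
double fillMatrixObserved(std::size_t iterations) {
    double m[3][3] = {};
    for (std::size_t it = 0; it < iterations; ++it) {
        for (std::size_t r = 0; r < 3; ++r) {
            for (std::size_t c = 0; c < 3; ++c) {
                m[r][c] = static_cast<double>(it + 3 * r + c);
            }
        }
        g_sink = m[1][1];  // volatile write, performed on every iteration
    }
    return g_sink;
}
```

If a benchmark guarded this way still reports a time near zero, the measurement itself deserves a closer look before any conclusion is drawn from it.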
Visual Studio 2022 - Eigen 3.3.7
---------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------
BM_FillMatrix3d 0.203 ns 0.188 ns 1000000000
BM_EigenCheck 2.94 ns 2.34 ns 280000000
Visual Studio 2022 - Eigen 3.4.0
----------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------
BM_FillMatrix3d 0.201 ns 0.109 ns 1000000000
BM_EigenCheck 24.9 ns 11.5 ns 74666667
Visual Studio 2022 - Eigen master
----------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------
BM_FillMatrix3d 0.208 ns 0.172 ns 1000000000
BM_EigenCheck 26.1 ns 21.8 ns 34461538
Linux G++ - Eigen 3.3.7
----------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------
BM_FillMatrix3d 0.000 ns 0.000 ns 1000000000
BM_EigenCheck 20.0 ns 20.0 ns 35157117
Linux G++ - Eigen 3.4.0
----------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------
BM_FillMatrix3d 0.000 ns 0.000 ns 1000000000
BM_EigenCheck 20.3 ns 20.3 ns 34153155
Linux G++ - Eigen master
----------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------
BM_FillMatrix3d 0.000 ns 0.000 ns 1000000000
BM_EigenCheck 20.4 ns 20.4 ns 34680626