Performance degradation of matrix multiplication inside a for loop in 3.3 compared to 3.2
Submitted by Leonid Grechishnikov
Assigned to Nobody
Link to original bugzilla bug (#1670)
Version: 3.3 (current stable)
Operating system: Windows
Description
Created attachment 919
Benchmark for matrix multiplication
Attached is a benchmark that runs slower with Eigen 3.3 than with Eigen 3.2. The performance degradation is observed at least with MSVC (~2-3 times slower) and the Intel Compiler (~1.5-3 times slower).
It seems that matrix-vector or vector-vector multiplication inside a for loop has been optimized less effectively by the compiler since version 3.3 (fewer functions are inlined).
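For readers without access to the attachment, the following is only a rough sketch of the kind of loop being described, not the attached benchmark itself. It uses only the standard library so that it compiles anywhere; `Vec3`, `Mat3`, `multiply`, `time_ms` and `benchmark_matvec` are hypothetical names introduced for illustration. In an Eigen-based version, the hand-written product would presumably be replaced by an expression on small fixed-size types, as noted in the comments.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical stand-ins for small fixed-size Eigen types. With Eigen, these
// would presumably be Eigen::Vector3d / Eigen::Matrix3d and the product below
// would be written as `y.noalias() = a * x;`.
using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Plain 3x3 matrix times 3-vector product.
Vec3 multiply(const Mat3& a, const Vec3& x) {
    Vec3 y{};  // value-initialized to zeros
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            y[i] += a[i][j] * x[j];
    return y;
}

// Calls body(i) for i in [0, iterations) and returns the elapsed
// wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& body, std::size_t iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) body(i);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Times `iterations` matrix-vector products inside a for loop. The results
// are accumulated into `checksum` so the optimizer cannot discard the loop.
double benchmark_matvec(std::size_t iterations, double& checksum) {
    const Mat3 a = {{ {{1.0, 2.0, 3.0}}, {{4.0, 5.0, 6.0}}, {{7.0, 8.0, 9.0}} }};
    double sum = 0.0;
    const double ms = time_ms([&](std::size_t i) {
        const double s = static_cast<double>(i % 7);
        const Vec3 y = multiply(a, Vec3{{s, s + 1.0, s + 2.0}});
        sum += y[0] + y[1] + y[2];
    }, iterations);
    checksum = sum;
    return ms;
}
```

Calling `benchmark_matvec` with a large iteration count and printing the returned time would give a figure comparable in kind to the processing times in this report. Building the same loop against each Eigen version with identical compiler flags, and then inspecting the generated assembly for non-inlined calls, would be one way to check the inlining hypothesis.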
Processor: Intel(R) Core(TM) i7-7700 CPU @ 3.6GHz 3.60 GHz
Memory (RAM): 16.0 GB
Windows 10 x64
Visual Studio 2017 with MSVC compiler command line:
/GS /W1 /Zc:wchar_t /I"C:\Users\lgrechishnikov\source\repos\TestEigenPerformance\TestEigenPerformance" /Zi /Gm- /O2 /Fd"x64\Release\vc141.pdb" /Zc:inline /fp:precise /errorReport:prompt /WX- /Zc:forScope /Gd /MD /FC /Fa"x64\Release" /EHsc /nologo /Fo"x64\Release" /Fp"x64\Release\TestEigenPerformance.pch" /diagnostics:classic
Visual Studio 2017 with Intel Compiler 2018 command line:
/GS /W1 /Zc:wchar_t /I"C:\Users\lgrechishnikov\source\repos\TestEigenPerformance\TestEigenPerformance" /Zi /O2 /Fd"x64\Release\vc141.pdb" /fp:precise /Zc:forScope /MD /FC /Fa"x64\Release" /EHsc /nologo /Fo"x64\Release" /Qprof-dir "x64\Release" /Fp"x64\Release\TestEigenPerformance.pch"
Processing times:
MSVC, Eigen 3.2.4 - 5.0 ms
MSVC, Eigen 3.3.7 - 16.2 ms
Intel Compiler, Eigen 3.2.4 - 4.52 ms
Intel Compiler, Eigen 3.3.7 - 17.17 ms
Attachment 919, "Benchmark for matrix multiplication":
file_1670.txt