Eigen Slow in Docker Container
When I run the same test on my machine (Ubuntu 16.04) and in a Docker container (base image also Ubuntu 16.04), it runs much slower in the container: 241s, versus 7s on the host. After profiling with gprof, I found that almost all of the time inside the container is spent in the following function (Eigen 3.3.7):
void gebp_kernel<LhsScalar,RhsScalar,Index,DataMapper,mr,nr,ConjugateLhs,ConjugateRhs>
::operator()(const DataMapper& res, const LhsScalar* blockA, const RhsScalar* blockB,
Index rows, Index depth, Index cols, ResScalar alpha,
Index strideA, Index strideB, Index offsetA, Index offsetB)
which is called by:
Eigen::internal::triangular_solve_matrix<double, long, 1, 2, false, 0, 0>::run(long, long, double const*, long, double*, long, Eigen::internal::level3_blocking<double, double>&)
Any ideas why this slowdown might occur? The only thing I have tried so far is setting the default L1, L2, and L3 cache sizes in GeneralBlockPanelKernel.h after reading this: https://gitlab.com/arm-hpc/packages/-/wikis/packages/tensorflow#setting-cache-sizes-for-eigen-gebp-kernel. However, the test runs just as slowly with or without that change. The container has no CPU limits and is allowed to use all of the CPUs and GPUs.
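One thing that might help narrow this down is to check whether the binary is actually built the same way in both environments, since Eigen's kernels depend heavily on optimization level and the SIMD instruction set enabled at compile time. Below is a minimal sketch of such a check, using only the standard library. The helper name `build_report` is hypothetical, and `__OPTIMIZE__` is a GCC/Clang predefined macro, so the "optimized" line is only meaningful with those compilers. This does not confirm the cause; it only shows what the compiler and runtime report.

```cpp
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper (not part of Eigen): summarizes compile-time and
// runtime settings that commonly affect the speed of Eigen's kernels.
std::string build_report() {
    std::ostringstream out;

    // __OPTIMIZE__ is defined by GCC/Clang when building with -O1 or higher.
#if defined(__OPTIMIZE__)
    out << "optimized=yes\n";
#else
    out << "optimized=no\n";
#endif

    // NDEBUG disables assert(), including Eigen's internal assertions.
#if defined(NDEBUG)
    out << "ndebug=yes\n";
#else
    out << "ndebug=no\n";
#endif

    // Widest SIMD instruction set the compiler was allowed to target.
#if defined(__AVX2__)
    out << "simd=AVX2\n";
#elif defined(__AVX__)
    out << "simd=AVX\n";
#elif defined(__SSE2__)
    out << "simd=SSE2\n";
#else
    out << "simd=none-or-unknown\n";
#endif

    // Number of hardware threads visible to this process (0 if unknown).
    out << "hardware_threads=" << std::thread::hardware_concurrency() << "\n";
    return out.str();
}
```

Printing `build_report()` from the test's `main` on the host and inside the container, using the exact same build commands as the real test, would show whether the two builds differ in optimization or vectorization settings.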