Tensor Broadcast bug on GCC and Clang with -mfma
Summary
Broadcasting an Eigen::Tensor<std::complex<double>, 1> gives the wrong result, with no warnings or errors at compile time or at runtime. The problem is not caught by -fsanitize=address.
Update: The error is caught by -fsanitize=address if the broadcast dimensions are increased from 2 to 4. See on compiler explorer.
Environment
- Operating System : Linux
- Architecture : x64
- Eigen Version : 3.4.0
- Compiler Version : GCC 10.1, GCC 11.2, Clang 13.0
- Compile Flags : -std=c++17 -mfma
Minimal Example
#include <array>
#include <complex>
#include <cstdio>
#include <string_view>
#include <unsupported/Eigen/CXX11/Tensor>
// Print a label followed by every element of a rank-1 complex tensor.
void print_tensor(const Eigen::Tensor<std::complex<double>,1> & L, std::string_view msg){
    // string_view is not guaranteed to be null-terminated, so pass its length explicitly
    std::printf("%.*s\n", static_cast<int>(msg.size()), msg.data());
    for(long i = 0; i < L.size(); i++) std::printf("(%.16f, %.16f)\n",L[i].real(), L[i].imag());
}
int main() {
    Eigen::Tensor<std::complex<double>,1> L(1);
    L.setConstant(1.0);
    print_tensor(L, "L");
    std::array<long,1> bcast = {2};
    Eigen::Tensor<std::complex<double>,1> Lb = L.broadcast(bcast); // Error happens here
    print_tensor(Lb, "L.broadcast({2})");
}
See it fail live on compiler explorer
Note that it works fine if one replaces std::complex<double> with double, as seen here.
Relevant logs
The program above outputs:
L
(1.0000000000000000, 0.0000000000000000)
L.broadcast({2})
(1.0000000000000000, 0.0000000000000000)
(0.0000000000000000, 0.0000000000000000)
The error is in the last line, which should read (1.0000000000000000, 0.0000000000000000).
Steps to reproduce
- Compile the code above with the compiler flags -std=c++17 and -mfma (or -march=native, on my machine with a Skylake CPU)
What is the current bug behavior?
Broadcasting a rank-1 tensor of type std::complex<double> with contents
(1.0, 0.0)
by {2}, gives
(1.0, 0.0),
(0.0, 0.0)
In fact, the last entry seems to be uninitialized memory: under seemingly random circumstances that I can't replicate, it holds a huge, nondeterministic value instead of zero.
What is the expected correct behavior?
The operation above should replicate the initial value, giving
(1.0, 0.0),
(1.0, 0.0)
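For comparison, the expected semantics can be written without Eigen. This is only a minimal sketch of what a rank-1 broadcast should produce. The helper name broadcast_rank1_reference is hypothetical and not part of Eigen, and it uses std::vector in place of Eigen::Tensor.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not an Eigen API): broadcasting a rank-1
// tensor by a factor of `copies` should yield `copies` back-to-back
// repetitions of the input elements.
std::vector<std::complex<double>> broadcast_rank1_reference(
    const std::vector<std::complex<double>>& in, std::size_t copies) {
    std::vector<std::complex<double>> out;
    out.reserve(in.size() * copies);
    for (std::size_t c = 0; c < copies; ++c) {
        // Append one full copy of the input per broadcast repetition.
        out.insert(out.end(), in.begin(), in.end());
    }
    return out;
}
```

Called with the single element (1.0, 0.0) and copies = 2, this returns two elements that both equal (1.0, 0.0), which is what L.broadcast({2}) is expected to give.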