ploaddup using _mm_load_sd, which is generally miscompiled on gcc/i386
Submitted by Benoit Jacob
Assigned to Gael Guennebaud @ggael
Link to original bugzilla bug (#200)
Description
As we found out in bug #195 (closed), GCC (at least up to 4.4) on i386 (i.e. with -m32) miscompiles the _mm_load_sd intrinsic: it adds redundant x87 fldl/fstpl instructions, which should hurt performance. (In bug #195 (closed) it even produced wrong results, but that's a different story.)
Our ploaddup function still uses _mm_load_sd, so it would be nice to have a workaround for gcc/i386 that avoids it.
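For illustration, here is a minimal sketch of one possible workaround. It assumes the float variant of ploaddup, i.e. reading two consecutive floats and returning {a0, a0, a1, a1}. The name `loaddup4f_sketch` is hypothetical and this is not Eigen's actual code. The idea is to do the 64-bit load through the integer intrinsic _mm_loadl_epi64, so no double-typed value is involved. Whether this actually avoids the fldl/fstpl sequence on the affected GCC versions is an assumption that would need to be checked in the generated assembly.

```cpp
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper (not Eigen's ploaddup): reads from[0] and from[1]
// and writes {from[0], from[0], from[1], from[1]} to out.
inline void loaddup4f_sketch(const float* from, float* out)
{
#if defined(__SSE2__)
  // 64-bit integer load into the low half of the register; the upper half is zeroed.
  __m128i lo = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(from));
  // Reinterpret the bits as four floats: {a0, a1, 0, 0}.
  __m128 v = _mm_castsi128_ps(lo);
  // Interleave the low two lanes with themselves: {a0, a0, a1, a1}.
  v = _mm_unpacklo_ps(v, v);
  _mm_storeu_ps(out, v);
#else
  // Scalar fallback so the sketch also builds without SSE2.
  out[0] = out[1] = from[0];
  out[2] = out[3] = from[1];
#endif
}
```

In a real fix the intrinsic path would presumably be selected only for gcc/i386, keeping _mm_load_sd elsewhere.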
Edited by Rasmus Munk Larsen