FMA Pass
Add an fma pass to our optimization module. The pass introduces an fma
function, where fma(a, b, c) stands for a + b*c, and performs the following transformations:
a+b*c -> fma(a, b, c)
x*(1/pi) + 0.5_dp*sign(1._dp, x) -> fma(0.5_dp*sign(1._dp, x), x, 1/pi)
x - Nd*pi -> fma(x, -Nd, pi)
S1+z*S2 -> fma(S1, z, S2)
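
To make the rewrite rules above concrete, here is a minimal sketch of such a pass in Python. The node types (Var, BinOp, Neg, Fma) and the function fma_pass are hypothetical and exist only for illustration; they are not the project's actual ASR classes or pass API. The sketch assumes the convention implied by the rules, namely that fma(a, b, c) means a + b*c, and that when both operands of a sum are products, the left product is the one fused (as in the second rule).

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical mini-AST, for illustration only (not the real compiler nodes).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # "+", "-", "*" or "/"
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Neg:
    arg: "Expr"


@dataclass(frozen=True)
class Fma:
    # Convention taken from the rules above: Fma(a, b, c) means a + b*c.
    a: "Expr"
    b: "Expr"
    c: "Expr"


Expr = Union[Var, BinOp, Neg, Fma]


def is_mul(e):
    return isinstance(e, BinOp) and e.op == "*"


def fma_pass(e):
    """Rewrite bottom-up: b*c + a, a + b*c and a - b*c become Fma nodes."""
    if isinstance(e, BinOp):
        left, right = fma_pass(e.left), fma_pass(e.right)
        if e.op == "+" and is_mul(left):
            # b*c + a -> fma(a, b, c)
            return Fma(right, left.left, left.right)
        if e.op == "+" and is_mul(right):
            # a + b*c -> fma(a, b, c)
            return Fma(left, right.left, right.right)
        if e.op == "-" and is_mul(right):
            # a - b*c -> fma(a, -b, c)
            return Fma(left, Neg(right.left), right.right)
        return BinOp(e.op, left, right)
    if isinstance(e, Neg):
        return Neg(fma_pass(e.arg))
    if isinstance(e, Fma):
        return Fma(fma_pass(e.a), fma_pass(e.b), fma_pass(e.c))
    return e


# x - Nd*pi -> fma(x, -Nd, pi)
expr = BinOp("-", Var("x"), BinOp("*", Var("Nd"), Var("pi")))
rewritten = fma_pass(expr)
```

Note that a fused multiply-add rounds once instead of twice, so results after the pass can differ in the last bits from the unfused expression.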
Then, in the LLVM backend, we turn each fma call into an LLVM fma instruction.
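
One detail worth checking in the lowering is argument order. The rule a+b*c -> fma(a, b, c) puts the addend first, whereas LLVM's llvm.fma.f64(x, y, z) intrinsic computes x*y + z, with the addend last. The sketch below, in Python, only emits IR text to show the reordering; the helper name lower_fma_f64 is hypothetical, and it is an assumption that the backend targets llvm.fma.f64 for double precision rather than some other form.

```python
def lower_fma_f64(result, a, b, c):
    """Emit LLVM IR text for fma(a, b, c) == a + b*c on doubles.

    llvm.fma.f64(x, y, z) computes x*y + z, so the addend `a` goes last.
    Hypothetical helper, for illustration only.
    """
    return (
        f"{result} = call double @llvm.fma.f64("
        f"double {b}, double {c}, double {a})"
    )


print(lower_fma_f64("%r", "%a", "%b", "%c"))
# prints: %r = call double @llvm.fma.f64(double %b, double %c, double %a)
```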