Do not set EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC for CUDA compilation

The previous version assumed that ARM and CUDA never mix. For code that is shared between host and device, this leads to miscompilation: with the macro set, the shared code calls host-only NEON intrinsics, and the CUDA compiler rejects it (reference to host function '__builtin_neon_vabsh_f16' in host device function).
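
A minimal sketch of the kind of guard this describes, not the actual patch. All `SKETCH_*` names and `sketch_abs` are hypothetical, and detecting CUDA via `__CUDACC__` / `__CUDA_ARCH__` is an assumption about how the check could be written. The point is that the ARM FP16 macro is only enabled when the translation unit is not being compiled for CUDA, so host/device code falls back to a portable path.

```cpp
// Hypothetical macros that mirror the shape of the guard; not Eigen's real ones.
#if defined(__CUDACC__) || defined(__CUDA_ARCH__)
#define SKETCH_CUDA_COMPILE 1
#else
#define SKETCH_CUDA_COMPILE 0
#endif

// Enable the native ARM64 FP16 scalar path only when NOT compiling for CUDA.
#if defined(__aarch64__) && defined(__ARM_FEATURE_FP16_SCALAR_ARITHMETIC) && \
    !SKETCH_CUDA_COMPILE
#define SKETCH_HAS_ARM64_FP16_SCALAR_ARITHMETIC 1
#include <arm_fp16.h>  // provides float16_t and vabsh_f16 on ARM64 with FP16
#else
#define SKETCH_HAS_ARM64_FP16_SCALAR_ARITHMETIC 0
#endif

// Marks a function as callable from both host and device under CUDA.
#if defined(__CUDACC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Shared host/device function. The NEON intrinsic is host-only, so it must
// never be reachable in a CUDA compile; the guard above guarantees that.
SKETCH_DEVICE_FUNC inline float sketch_abs(float x) {
#if SKETCH_HAS_ARM64_FP16_SCALAR_ARITHMETIC
  // Host-only ARM path: round to half precision and use the scalar intrinsic.
  return static_cast<float>(vabsh_f16(static_cast<float16_t>(x)));
#else
  // Portable fallback, valid on host and device.
  return x < 0.0f ? -x : x;
#endif
}
```

Without the `!SKETCH_CUDA_COMPILE` term, an ARM64 host compiling this file with a CUDA compiler would take the intrinsic branch inside a `__host__ __device__` function, which is the situation the error message above reports.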