Disable CUDA Eigen::half vectorization on host.

All CUDA __half functions are device-only in CUDA 9, including conversions. Host-side conversions were only added in CUDA 10, so the existing code doesn't build with CUDA versions prior to 10.0.

The __half arithmetic functions are device-only in every CUDA version, so there is no reason to use vectorization on the host at all.

Modified the code to disable vectorization for __half on host. This also required updating the TensorReductionGpu implementation, which previously made assumptions about which packets are available.
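As a rough illustration of the idea (not the actual Eigen change), the sketch below gates a half-precision packet trait on whether the current compilation pass targets the device. It relies on `__CUDA_ARCH__`, which nvcc defines only during device compilation. The names `half_t`, `SKETCH_HALF_VECTORIZABLE`, `sketch_packet_traits`, and `half_packet_size` are hypothetical and do not come from Eigen.

```cpp
// Hypothetical stand-in for a 16-bit float type; this is not Eigen::half.
struct half_t {
  unsigned short bits;
};

// nvcc defines __CUDA_ARCH__ only while compiling device code, so this
// macro is 0 in any host pass (and in a plain host compiler).
#if defined(__CUDA_ARCH__)
#define SKETCH_HALF_VECTORIZABLE 1
#else
#define SKETCH_HALF_VECTORIZABLE 0
#endif

// Hypothetical trait, loosely modeled on a packet-traits pattern:
// by default a type is not vectorizable and has a packet size of 1.
template <typename T>
struct sketch_packet_traits {
  static constexpr bool Vectorizable = false;
  static constexpr int size = 1;
};

// For the half type, packets are advertised only on device.
template <>
struct sketch_packet_traits<half_t> {
  static constexpr bool Vectorizable = (SKETCH_HALF_VECTORIZABLE != 0);
  static constexpr int size = SKETCH_HALF_VECTORIZABLE ? 2 : 1;
};

// Generic code (for example a reduction) should query the trait rather
// than assume a half packet always exists.
constexpr int half_packet_size() {
  return sketch_packet_traits<half_t>::Vectorizable
             ? sketch_packet_traits<half_t>::size
             : 1;
}
```

With a gate like this, host code falls back to the scalar path for the half type, and code that previously assumed a half packet is always available needs to consult the trait instead.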
