Update to HEXL 1.2.0
Fixes #322 (closed)
Configuring using -DWITH_OPENMP=OFF -DCMAKE_CXX_COMPILER=clang++-13 -DCMAKE_C_COMPILER=clang-13 -DWITH_NATIVEOPT=ON
(including the fix to clang-13 compilation here)
and running on ICX yields the benchmark results below.
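For anyone reproducing the numbers, the configure step might look roughly like the sketch below. Only the four -D flags come from this description; the build directory name, the use of `cmake -S`/`-B` (which needs CMake 3.13 or newer), and running from the repository root are assumptions.

```shell
#!/bin/sh
set -eu

# Flags quoted from the description above.
CMAKE_FLAGS="-DWITH_OPENMP=OFF -DCMAKE_CXX_COMPILER=clang++-13 -DCMAKE_C_COMPILER=clang-13 -DWITH_NATIVEOPT=ON"

# Assumption: out-of-source build directory; the name "build" is illustrative.
BUILD_DIR="${BUILD_DIR:-build}"

# Print the exact configure command so it can be logged next to the results.
CONFIGURE_CMD="cmake -S . -B ${BUILD_DIR} ${CMAKE_FLAGS}"
echo "${CONFIGURE_CMD}"

# Configure only when invoked from a source tree with cmake available.
if [ -f CMakeLists.txt ] && command -v cmake >/dev/null 2>&1; then
  # Word splitting of CMAKE_FLAGS is intentional here.
  # shellcheck disable=SC2086
  cmake -S . -B "${BUILD_DIR}" ${CMAKE_FLAGS}
fi
```

Printing the command before running it keeps a record of the configuration with the benchmark output, which helps when comparing against a different compiler or flag set.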
One major feature of HEXL v1.2 is a speedup on larger NTTs (~16384 coefficients), which shows up in the
poly-hexl-benchmark-16k benchmark. HEXL isn't used to accelerate addition, so the changes in the addition benchmarks are noise.
numactl --physcpubind=0 ./bin/benchmark/lib-hexl-benchmark --benchmark_min_time=2
numactl --physcpubind=0 ./bin/benchmark/poly-hexl-benchmark-4k --benchmark_min_time=2
numactl --physcpubind=0 ./bin/benchmark/poly-hexl-benchmark-16k --benchmark_min_time=2
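The three commands above can also be run in one pass. This is a minimal sketch, assuming it is started from the directory that contains bin/benchmark; the per-benchmark output file names are hypothetical and not part of this change.

```shell
#!/bin/sh
set -eu

# Benchmark binaries, core, and minimum time taken from the commands above.
BENCHMARKS="lib-hexl-benchmark poly-hexl-benchmark-4k poly-hexl-benchmark-16k"
CPU=0
MIN_TIME=2

for name in ${BENCHMARKS}; do
  bin="./bin/benchmark/${name}"
  # Skip binaries that were not built instead of aborting the whole run.
  if [ ! -x "${bin}" ]; then
    echo "skipping ${name}: ${bin} not found" >&2
    continue
  fi
  # Pin to a single physical core, as above, and save the output per benchmark.
  # Assumption: "<name>.txt" is an illustrative output file name.
  numactl --physcpubind="${CPU}" "${bin}" --benchmark_min_time="${MIN_TIME}" > "${name}.txt"
done
```

Saving each run to its own file makes it easier to diff the results before and after the HEXL update.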
Edited by Fabian Boemer