SYCL: Add optimized atomicAdd flavor for AMD GPUs

Andrey Alekseenko requested to merge aa-amd-atomics into hwe-release-2022

By default, floating-point atomics on AMD MI100 (gfx908) and MI200 (gfx90a) are implemented with CAS loops. This change calls special flavors of atomicAdd that compile to native instructions, which improves performance on these devices. AMD MI50 (gfx906) always uses a CAS loop.
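For readers unfamiliar with the term, the following is a minimal host-side sketch of what a CAS-loop floating-point add looks like, written in standard C++ with `std::atomic<float>`. It is not the code from this merge request and does not use any AMD-specific intrinsic; the function name `atomicAddCasLoop` is hypothetical and chosen only for illustration.

```cpp
#include <atomic>

// Hypothetical helper (not part of this MR): emulate an atomic float add
// with a compare-and-swap loop. Returns the value stored before the add.
inline float atomicAddCasLoop(std::atomic<float>& target, float value)
{
    // Read the current value once; the loop refreshes it on every failed attempt.
    float expected = target.load(std::memory_order_relaxed);

    // compare_exchange_weak stores expected + value only if target still holds
    // `expected`. On failure it writes the current value back into `expected`,
    // so the next iteration recomputes the sum from fresh data.
    while (!target.compare_exchange_weak(expected, expected + value,
                                         std::memory_order_relaxed))
    {
        // Retry until no other thread modified target in between.
    }
    return expected;
}
```

Under contention each failed exchange costs another round trip to memory, which is the overhead a native atomic-add instruction avoids.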

Refs #4465 Refs #3935, #3965 (closed)

Edited by Andrey Alekseenko
