Fix use of arg function in CUDA.
CUDA does not officially provide a global ::arg function, and using it fails with MSVC+C++20.
Replacing it with std::arg seems to work on device.
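
A minimal host-side sketch of the kind of replacement described above. The helper name phase_of is hypothetical and does not come from the patched code; the sketch only shows the qualified std::arg call on a std::complex value and does not verify device behavior.

```cpp
#include <complex>

// Hypothetical helper (not from the patched code): returns the phase angle
// of z in radians.
template <typename T>
T phase_of(const std::complex<T>& z) {
  // Call the standard-library overload explicitly instead of relying on a
  // global ::arg, which is not guaranteed to be declared.
  return std::arg(z);
}
```

For example, phase_of(std::complex<double>(0.0, 1.0)) evaluates to approximately pi/2.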