Realm: Add CUDA memcpy to Realm DMA
Follow-up PR to the interface change:
This change refactors the GPU transfer descriptor to allow launching CUDA memcopies on a GPU channel. It also adds the actual implementation of the copy kernel, along with cuda_memcpy_test.
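For readers unfamiliar with kernel-based copies, here is a minimal sketch of the general idea: a grid-stride copy kernel launched on a stream and verified on the host. It is not the code in this PR. The names `copy_kernel_sketch` and `CHECK`, the buffer size, and the launch configuration are all hypothetical and chosen only for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
  do {                                                                \
    cudaError_t err_ = (call);                                        \
    if (err_ != cudaSuccess) {                                        \
      fprintf(stderr, "%s failed: %s\n", #call,                       \
              cudaGetErrorString(err_));                              \
      exit(1);                                                        \
    }                                                                 \
  } while (0)

// Hypothetical byte-wise grid-stride copy kernel (not Realm's kernel).
// Each thread copies every stride-th byte, so any grid size covers
// the whole buffer.
__global__ void copy_kernel_sketch(char *dst, const char *src, size_t bytes)
{
  size_t stride = (size_t)gridDim.x * blockDim.x;
  for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
       i < bytes; i += stride)
    dst[i] = src[i];
}

int main()
{
  const size_t bytes = 1 << 20;  // assumed test size: 1 MiB
  char *src = nullptr, *dst = nullptr;
  CHECK(cudaMalloc(&src, bytes));
  CHECK(cudaMalloc(&dst, bytes));
  CHECK(cudaMemset(src, 0x5a, bytes));  // known source pattern
  CHECK(cudaMemset(dst, 0x00, bytes));

  cudaStream_t stream;
  CHECK(cudaStreamCreate(&stream));

  // Launch the copy on the stream; 64 blocks of 256 threads is arbitrary.
  copy_kernel_sketch<<<64, 256, 0, stream>>>(dst, src, bytes);
  CHECK(cudaGetLastError());
  CHECK(cudaStreamSynchronize(stream));

  // Copy the result back and verify every byte on the host.
  std::vector<char> host(bytes);
  CHECK(cudaMemcpy(host.data(), dst, bytes, cudaMemcpyDeviceToHost));
  for (size_t i = 0; i < bytes; i++) {
    if (host[i] != 0x5a) {
      fprintf(stderr, "mismatch at byte %zu\n", i);
      return 1;
    }
  }

  CHECK(cudaStreamDestroy(stream));
  CHECK(cudaFree(src));
  CHECK(cudaFree(dst));
  printf("copy verified\n");
  return 0;
}
```

A byte-wise loop keeps the sketch simple; a production kernel would likely use wider loads and stores where alignment allows.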
Edited by apryakhin