Realm: Merge cuda-dma to master
This pull request enhances the DMA engine by enabling native execution of CUDA kernels inside Realm. The integration accelerates several kinds of memory copies: structured copies with multiple affine rectangles, transposes, scatters, gathers, and fills.
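For reviewers less familiar with the copy types named above, the following is a minimal host-side C++ sketch of what fill, gather, scatter, and transpose mean element by element. It is not Realm's code and not the kernels from this branch: the function names (`fill_copy`, `gather_copy`, `scatter_copy`, `transpose_copy`) and the use of flat `std::vector` buffers are assumptions made purely for illustration. A GPU kernel would typically assign each loop iteration to its own thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference semantics only; names and signatures are
// illustrative and do not correspond to Realm's API.

// Fill: write the same value to every element of dst.
template <typename T>
void fill_copy(std::vector<T>& dst, const T& value) {
  for (std::size_t i = 0; i < dst.size(); ++i) dst[i] = value;
}

// Gather: indirection on the source side, dst[i] = src[idx[i]].
template <typename T>
void gather_copy(std::vector<T>& dst, const std::vector<T>& src,
                 const std::vector<std::size_t>& idx) {
  assert(dst.size() == idx.size());
  for (std::size_t i = 0; i < idx.size(); ++i) {
    assert(idx[i] < src.size());
    dst[i] = src[idx[i]];
  }
}

// Scatter: indirection on the destination side, dst[idx[i]] = src[i].
template <typename T>
void scatter_copy(std::vector<T>& dst, const std::vector<T>& src,
                  const std::vector<std::size_t>& idx) {
  assert(src.size() == idx.size());
  for (std::size_t i = 0; i < idx.size(); ++i) {
    assert(idx[i] < dst.size());
    dst[idx[i]] = src[i];
  }
}

// Transpose: src is rows x cols in row-major order,
// dst becomes cols x rows in row-major order.
template <typename T>
void transpose_copy(std::vector<T>& dst, const std::vector<T>& src,
                    std::size_t rows, std::size_t cols) {
  assert(src.size() == rows * cols && dst.size() == rows * cols);
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      dst[c * rows + r] = src[r * cols + c];
}
```

Each loop body touches independent elements (for scatter, assuming the indices are distinct), which is why these copy patterns map naturally onto one-thread-per-element kernels.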
Edited by apryakhin