Add linspace() implementations for CUDA and OpenCL, plus tests
This is basically @zoq's implementation of linspace() from the ens branch, plus an OpenCL adaptation of the kernel. I simplified the CUDA kernel slightly.
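For context, the core of a linspace kernel is a per-element computation in which each output value depends only on its index, which is why it maps so directly onto both CUDA and OpenCL. Below is a minimal CPU-side sketch in C++ of that idea, where the loop index stands in for the GPU thread index. The names `linspace_element` and `linspace_ref` are hypothetical and are not the code in this PR; the `n <= 1` behaviour is an assumption and may differ from what the actual kernels do.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from this PR): value of element i of an
// n-point linspace over [start, end]. On a GPU, i would be the global
// thread index and each thread would write one element.
inline double linspace_element(double start, double end,
                               std::size_t n, std::size_t i)
{
  // Assumption: with a single point there is no step, so return start.
  if (n <= 1)
    return start;

  const double step = (end - start) / static_cast<double>(n - 1);
  return start + static_cast<double>(i) * step;
}

// Hypothetical CPU reference: fills all n elements sequentially.
// Useful as a baseline when checking device results in tests.
std::vector<double> linspace_ref(double start, double end, std::size_t n)
{
  std::vector<double> out(n);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = linspace_element(start, end, n, i);
  return out;
}
```

A reference like this can be compared element-wise (within a tolerance) against the output of a device kernel.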
I'll let this sit for a couple of days before merging. It's pretty simple overall.