Remove diagonal kernels
Related to the subview optimizations, all kernels I had written that operate on diagonals can simply be replaced with calls to subview kernels. This is because a diagonal can be seen as a subvector of a matrix `m` whose stride is `m.n_rows + 1`. So:
- `copy_diag()` -> `copy_mat()`
- `set_diag()` -> `copy_mat()`
- `extract_diag()` -> `copy_mat()`
- `inplace_op_diag()` -> `eop_scalar()`
- all diagonal-specific kernels are removed
- the OpenCL `copy_mat()` function no longer uses `clEnqueueCopyBufferRect` but instead the twoway copy kernel, as some OpenCL implementations view this "fake" stride as something that will cause an out-of-bounds access even when it won't
- some additional cleanups here and there of filenames or function names
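
For reviewers, here is a minimal CPU-side sketch of the stride idea, assuming column-major storage and only the main diagonal. `copy_strided()` and `extract_main_diag()` are hypothetical names used for illustration; they are not the actual `copy_mat()` signature or any kernel in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a generic strided copy (not the real copy_mat()):
// copies n_elem elements, stepping through src and dst with separate strides.
void copy_strided(float* dst, std::size_t dst_stride,
                  const float* src, std::size_t src_stride,
                  std::size_t n_elem)
{
  for (std::size_t i = 0; i < n_elem; ++i)
    dst[i * dst_stride] = src[i * src_stride];
}

// Extract the main diagonal of a column-major n_rows x n_cols matrix by
// treating it as a subvector with stride n_rows + 1.
std::vector<float> extract_main_diag(const std::vector<float>& m,
                                     std::size_t n_rows,
                                     std::size_t n_cols)
{
  assert(m.size() == n_rows * n_cols);

  const std::size_t len = std::min(n_rows, n_cols);
  std::vector<float> d(len);

  // Element i of the diagonal lives at offset i * (n_rows + 1).
  copy_strided(d.data(), 1, m.data(), n_rows + 1, len);
  return d;
}
```

Setting a diagonal is the same call with the strides swapped, e.g. `copy_strided(m.data(), n_rows + 1, d.data(), 1, len)`. Note that the last offset touched is `(len - 1) * (n_rows + 1)`, which is always inside the buffer, while a naive extent of `len * (n_rows + 1)` can exceed `n_rows * n_cols` (12 vs. 9 for a 3x3 matrix). That is presumably the kind of bound that makes some implementations reject the rect copy, though this is my reading rather than something verified against any particular driver.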