Add vector `norm()` for non-subviews
This one quickly got out of hand.
I thought I could just use cuBLAS and clBLAS to implement `norm()`, but there are a multitude of different norm types (1/2/k/min/max), they apply to vectors and matrices differently, and cuBLAS and clBLAS don't support everything. In fact, clBLAS's implementation needs so much auxiliary space (`2n` for a vector of length `n`!) that I chose to just implement my own kernel for it.
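For reviewers who want the norm types spelled out, here is a rough host-side reference in plain C++. It is only a sketch of the math, not the GPU kernels in this MR. The `ref_*` names are made up for illustration, and I am assuming the "min" norm means the smallest absolute value (returning 0 for an empty vector is also just a choice made here).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical host-side reference definitions; not the kernels from this MR.

// 1-norm: sum of absolute values.
double ref_norm_1(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += std::abs(x);
    return s;
}

// 2-norm: square root of the sum of squares (no overflow protection).
double ref_norm_2(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x * x;
    return std::sqrt(s);
}

// k-norm for k >= 1: (sum |x_i|^k)^(1/k).
double ref_norm_k(const std::vector<double>& v, double k) {
    assert(k >= 1.0);
    double s = 0.0;
    for (double x : v) s += std::pow(std::abs(x), k);
    return std::pow(s, 1.0 / k);
}

// Max norm: largest absolute value.
double ref_norm_max(const std::vector<double>& v) {
    double m = 0.0;
    for (double x : v) m = std::max(m, std::abs(x));
    return m;
}

// Min norm (assumed meaning): smallest absolute value; 0 for an empty vector.
double ref_norm_min(const std::vector<double>& v) {
    if (v.empty()) return 0.0;
    double m = std::abs(v[0]);
    for (double x : v) m = std::min(m, std::abs(x));
    return m;
}
```

Something like this is handy as a ground truth when comparing backend results in tests.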
A bunch of other things happened too:
- I refactored `accu()`, `min()`, `max()`, and `max_abs()` for each backend into a function called `generic_reduce()`, because they all use the same general strategy but different kernels.
- I found a bug in all the reduces implemented in OpenCL (I used `get_global_size(0)` instead of `get_local_size(0)`); this fixes some of the accu/min/max tests that were failing only on the OpenCL backend.
- I implemented kernels for `norm_1`, `norm_k`, `norm_2` (OpenCL only), `norm_2_robust` (OpenCL only), and `norm_min`.
- For `norm_max` we can just use `max_abs()`.
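To make the "same strategy, different kernels" point concrete, here is a rough host-side simulation in C++ of a two-pass work-group reduction. It is not the actual `generic_reduce()` code: the `*_sketch` names, the signature, and the power-of-two group size are all assumptions for illustration. I am also assuming that "robust" means scaling by the largest magnitude before squaring. The MR text does not say exactly where the `get_global_size(0)` call sat; the comment in the tree loop just marks one place where mixing up the two sizes would matter.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for generic_reduce(): the two-pass strategy is shared;
// only the per-element map, the combine step, and the identity value differ.
template <typename Map, typename Combine>
double generic_reduce_sketch(const std::vector<double>& v, std::size_t local_size,
                             double identity, Map map, Combine combine) {
    // The halving loop below assumes a power-of-two work-group size.
    assert(local_size > 0 && (local_size & (local_size - 1)) == 0);
    const std::size_t num_groups = (v.size() + local_size - 1) / local_size;
    std::vector<double> partial(num_groups, identity);

    for (std::size_t g = 0; g < num_groups; ++g) {
        // "Local memory" for one work-group; out-of-range slots keep the identity.
        std::vector<double> scratch(local_size, identity);
        for (std::size_t lid = 0; lid < local_size; ++lid) {
            const std::size_t gid = g * local_size + lid;
            if (gid < v.size()) scratch[lid] = map(v[gid]);
        }
        // Tree reduction: the stride starts at half the LOCAL size. Starting
        // from the global size would index past the scratch buffer.
        for (std::size_t s = local_size / 2; s > 0; s /= 2)
            for (std::size_t lid = 0; lid < s; ++lid)
                scratch[lid] = combine(scratch[lid], scratch[lid + s]);
        partial[g] = scratch[0];
    }

    // Second pass: combine the per-group partial results.
    double result = identity;
    for (double p : partial) result = combine(result, p);
    return result;
}

double accu_sketch(const std::vector<double>& v, std::size_t ls) {
    return generic_reduce_sketch(v, ls, 0.0,
        [](double x) { return x; }, [](double a, double b) { return a + b; });
}

double max_abs_sketch(const std::vector<double>& v, std::size_t ls) {
    return generic_reduce_sketch(v, ls, 0.0,
        [](double x) { return std::abs(x); },
        [](double a, double b) { return std::max(a, b); });
}

double norm_1_sketch(const std::vector<double>& v, std::size_t ls) {
    return generic_reduce_sketch(v, ls, 0.0,
        [](double x) { return std::abs(x); }, [](double a, double b) { return a + b; });
}

double norm_min_sketch(const std::vector<double>& v, std::size_t ls) {
    return generic_reduce_sketch(v, ls, std::numeric_limits<double>::infinity(),
        [](double x) { return std::abs(x); },
        [](double a, double b) { return std::min(a, b); });
}

// Assumed meaning of "robust": divide by max |x_i| first so squares cannot overflow.
double norm_2_robust_sketch(const std::vector<double>& v, std::size_t ls) {
    const double m = max_abs_sketch(v, ls);
    if (m == 0.0) return 0.0;
    const double sum = generic_reduce_sketch(v, ls, 0.0,
        [m](double x) { const double t = x / m; return t * t; },
        [](double a, double b) { return a + b; });
    return m * std::sqrt(sum);
}
```

In this sketch the max norm is literally `max_abs_sketch()`, which mirrors the last bullet above, and each new reduction only has to supply a map, a combine step, and an identity.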
This MR does not cover the following things, but I'll handle them in future MRs:
- Matrix norms.
- Norms on subviews. (Those will require separate kernels.)
I'll wait a couple of days before merging this in case there are any comments.