Add vector `norm()` for non-subviews

Ryan Curtin requested to merge rcurtin/bandicoot-code:norm into unstable

This one quickly got out of hand.

I thought I could just use cuBLAS and clBLAS to implement norm(), but there are many different norm types (1/2/k/min/max), they apply to vectors and matrices differently, and neither cuBLAS nor clBLAS supports all of them. In fact, clBLAS's implementation needs so much auxiliary space (2n for a vector of length n!) that I chose to write my own kernel instead.

A bunch of other things happened too:

  • I refactored accu(), min(), max(), and max_abs() for each backend into a function called generic_reduce(), because they all use the same general strategy but different kernels.
  • I found and fixed a bug in all of the reductions implemented in OpenCL (I had used get_global_size(0) where get_local_size(0) was needed); this fixes some of the accu/min/max tests that were failing only on the OpenCL backend.
  • I implemented kernels for norm_1, norm_k, norm_2 (OpenCL only), norm_2_robust (OpenCL only), and norm_min.
  • For norm_max we can just use max_abs().

This MR does not cover the following things, but I'll handle them in future MRs:

  • Matrix norms.
  • Norms on subviews. (Those will require separate kernels.)

I'll merge this in a couple of days, to leave time for any comments.