Add LARFG implementation and LAPACK bones (!19) · Merge requests · bandicoot-lib / bandicoot-code

Ryan Curtin requested to merge rcurtin/bandicoot-code:larfg into unstable Apr 24, 2021

As a start towards #6, I began to implement/adapt code from clMAGMA. Eigenvalue decompositions (symmetric ones at least) are done via SYEVD, which in turn depends on LATRD, which in turn depends on LARFG.

So, this is a quick implementation of LARFG, plus all the auxiliary code necessary to make implementing further LAPACK functionality easy.
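For readers unfamiliar with LARFG, here is a minimal CPU-side sketch of what the routine computes, assuming real double-precision input and leaving out the small-value rescaling branch. The name `larfg_reference` and its signature are hypothetical and chosen for illustration only; this is not the code in this merge request or bandicoot's API. Given a vector `[alpha; x]`, it produces `tau` and `v = [1; x_out]` such that `H = I - tau * v * v^T` maps the input vector to `[beta; 0]`.

```cpp
#include <cmath>
#include <vector>

// Hypothetical reference helper (illustration only, not bandicoot's API).
// On entry: alpha is the first element of the vector, x holds the rest.
// On exit:  alpha is overwritten with beta, x with the tail of v.
// Returns tau, so that H = I - tau * v * v^T with v = [1; x].
double larfg_reference(double& alpha, std::vector<double>& x)
{
  // Norm of the tail, computed naively (no protection against underflow).
  double sumsq = 0.0;
  for (double xi : x) { sumsq += xi * xi; }
  const double xnorm = std::sqrt(sumsq);

  // If the tail is already zero, H is the identity.
  if (xnorm == 0.0) { return 0.0; }

  // beta takes the opposite sign of alpha to avoid cancellation in alpha - beta.
  const double beta = -std::copysign(std::hypot(alpha, xnorm), alpha);
  const double tau = (beta - alpha) / beta;

  // Scale the tail so that the first element of v is implicitly 1.
  const double scale = 1.0 / (alpha - beta);
  for (double& xi : x) { xi *= scale; }

  alpha = beta;
  return tau;
}
```

As a quick check, calling this with `alpha = 3` and `x = {4}` gives `beta = -5`, `tau = 1.6`, and `x = {0.5}`, and applying `H` to `[3; 4]` then yields `[-5; 0]`.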

I'm not sure how this implementation will perform, but it does at least work. I think the right time for performance tuning is once the final SYEVD implementation is done: then we can compare the top-level eigenvalue decomposition against other toolkits, and if LARFG is slow, it will show up as a bottleneck.

This also fixes some minor bugs I discovered with the dot() implementation.

LARFG includes support for rescaling when the given vector has values that are too small, but I commented this out: dot() does not handle such values, so it isn't possible to actually trigger the scaling condition I implemented. :) That code could be uncommented once dot() is more robust, perhaps.
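To illustrate why a non-robust dot() gets in the way, here is a small sketch, assuming the vector norm is computed as the square root of a plain sum of squares in IEEE double precision (an assumption about the general pattern, not a description of bandicoot's kernels). Both function names are hypothetical. For very small entries the squares underflow to zero, so the naive norm comes back as exactly zero, while a version that divides by the largest magnitude first recovers the right value.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper: norm via a plain sum of squares.
// Squares of very small entries underflow to zero.
double naive_norm(const std::vector<double>& x)
{
  double sumsq = 0.0;
  for (double xi : x) { sumsq += xi * xi; }
  return std::sqrt(sumsq);
}

// Hypothetical helper: divide by the largest magnitude before squaring,
// so the intermediate values stay in a safe range.
double scaled_norm(const std::vector<double>& x)
{
  double maxabs = 0.0;
  for (double xi : x) { maxabs = std::max(maxabs, std::fabs(xi)); }
  if (maxabs == 0.0) { return 0.0; }

  double sumsq = 0.0;
  for (double xi : x)
  {
    const double r = xi / maxabs;
    sumsq += r * r;
  }
  return maxabs * std::sqrt(sumsq);
}
```

With `x = {3e-200, 4e-200}`, `naive_norm` returns zero because each square is far below the smallest representable double, whereas `scaled_norm` returns approximately `5e-200`. With a zero norm, a LARFG along the lines described above would treat the tail as empty before any rescaling check could run.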
