
WIP: Added sparse input functionality for HOSVD, and two stochastic optimisers for sparse GCP, issue #42

J D requested to merge GCHQResearch001/tensor_toolbox:master into master

Checklist

  • Issue: Before the merge request, submit an issue for the change, providing as much detailed information as possible. For bug reports, please provide enough information to reproduce the problem.

  • Fork: Create a branch or fork of the code and make your changes.

  • Help Comments: Create or update comments for the m-files, following the style of the existing files. Be sure to explain all code options.

  • HTML Documentation: For any major new functionality, please complete the following steps.

    • Add HTML documentation in the doc\html directory with the name XXX_doc.html
    • Use the MATLAB publish command to create a new file in doc\html
    • Add a pointer to this documentation file in doc\html\helptoc.xml
    • Add pointers in any related higher-level files, e.g., a new method for CP should be referenced in the cp.html file
    • Add a link to the HTML documentation from the help comments in the function
    • Update the search database by running: builddocsearchdb('[full path to tensor_toolbox/doc/html directory]')
  • Tests: Create or update tests in the tests directory. This is especially important for bug fixes and strongly encouraged for new code.

  • Contents: If new functions were added to a class, go to the maintenance directory and run update_classlist('Class',XXX) to add the new functions to the class XXX help information. If new functions were added at top level, go to maintenance and run update_topcontents to update the Contents.m file at the top level.

  • Release Notes: Update RELEASE_NOTES.txt with any significant bug fixes or additions.

  • Contributors List: Update CONTRIBUTORS.md with your name and a brief description of the contributions.

  • Pass All Tests: Confirm that all tests (including existing tests) pass in the tests directory.

  • Merge Request: At any point, create a work-in-progress merge request, referencing the issue number and with this checklist and WIP in the header.

Hi,

I've made some contributions to the toolbox. They fit within existing functions, so I haven't added any new functions or included HTML documentation for this merge. In this fork, the hosvd function now accepts sparse input, and gcp_opt now offers two more optimisers for sparse tensors: Nadam, via gcp_opt(tens,...,'opt','nadam'), and RMSProp, via gcp_opt(tens,...,'opt','rmsprop'). I've also added a seed parameter for reproducibility. On binary and count tensors from FROSTT, Nadam seems to perform as well as the current default (which I guess is to be expected). RMSProp is occasionally the best performer, but is usually outdone by both Nadam and the current default.
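
For reviewers less familiar with the two optimisers, here is a minimal C++ sketch of the textbook per-element update rules. It is not the code in this fork: the names (rmsprop_step, nadam_step, the state structs) and the default hyperparameters are assumptions chosen for illustration, the Nadam variant shown is the common simplified form without a momentum schedule, and GCP-specific details such as lower bounds and stochastic gradient sampling are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Running state for RMSProp: one second-moment estimate per parameter.
struct RmsPropState {
    std::vector<double> v;
    explicit RmsPropState(std::size_t n) : v(n, 0.0) {}
};

// One RMSProp step: divide each gradient entry by a running RMS of past gradients.
void rmsprop_step(std::vector<double>& theta, const std::vector<double>& grad,
                  RmsPropState& s, double lr = 1e-3, double rho = 0.9,
                  double eps = 1e-8) {
    assert(theta.size() == grad.size() && theta.size() == s.v.size());
    for (std::size_t i = 0; i < theta.size(); ++i) {
        s.v[i] = rho * s.v[i] + (1.0 - rho) * grad[i] * grad[i];
        theta[i] -= lr * grad[i] / (std::sqrt(s.v[i]) + eps);
    }
}

// Running state for Nadam: first and second moments plus a step counter.
struct NadamState {
    std::vector<double> m, v;
    int t = 0;
    explicit NadamState(std::size_t n) : m(n, 0.0), v(n, 0.0) {}
};

// One Nadam step: Adam with a Nesterov-style look-ahead on the first moment.
void nadam_step(std::vector<double>& theta, const std::vector<double>& grad,
                NadamState& s, double lr = 1e-3, double beta1 = 0.9,
                double beta2 = 0.999, double eps = 1e-8) {
    assert(theta.size() == grad.size() && theta.size() == s.m.size());
    ++s.t;
    const double c1 = 1.0 - std::pow(beta1, s.t);  // bias correction, first moment
    const double c2 = 1.0 - std::pow(beta2, s.t);  // bias correction, second moment
    for (std::size_t i = 0; i < theta.size(); ++i) {
        s.m[i] = beta1 * s.m[i] + (1.0 - beta1) * grad[i];
        s.v[i] = beta2 * s.v[i] + (1.0 - beta2) * grad[i] * grad[i];
        const double m_hat = s.m[i] / c1;
        const double v_hat = s.v[i] / c2;
        // Blend the corrected momentum with the current gradient (Nesterov term).
        const double nesterov = beta1 * m_hat + (1.0 - beta1) * grad[i] / c1;
        theta[i] -= lr * nesterov / (std::sqrt(v_hat) + eps);
    }
}
```

Both rules keep per-parameter running averages of squared gradients, so the effective step size adapts to each entry of the factor matrices. Nadam additionally tracks a momentum term, which is what makes it behave much like Adam in practice.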

Sparse HOSVD can use MET for memory-efficient core computation. This is enabled by selecting the number of modes to be treated elementwise, e.g. hosvd(tens,..., 'ttm_esz', 2). For sparse tensors with relatively large dimension sizes, we use a C++ MEX program to compute the Gram matrix of the unfolded tensor in a somewhat memory-efficient manner (one which avoids explicitly forming the memory-intensive sparse A^T matrix), and then compute its eigenvectors using either eigs or eig, depending on sparsity. I guess forming the Gram matrix before eigensolving isn't great for conditioning, since it squares the condition number of the unfolding, but in practice things seem to work fine in my testing. Rank estimation is included in the sparse case, as is model fit.
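
To illustrate the idea of building a Gram matrix without materialising A^T, here is a minimal C++ sketch. It is not the MEX code in this fork and rests on several assumptions: the matrix is held in compressed sparse column (CSC) form, which is how MATLAB stores sparse matrices; the Gram matrix wanted is A*A^T; and the result is small enough to hold densely. The names CscMatrix and gram_aat are hypothetical. The sketch uses the fact that A*A^T is the sum over columns of the outer product of each column with itself, so only one pass over the stored columns is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compressed sparse column storage (hypothetical struct for illustration).
struct CscMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> col_ptr;  // size cols + 1; column j is [col_ptr[j], col_ptr[j+1])
    std::vector<std::size_t> row_idx;  // row index of each stored entry
    std::vector<double> values;        // value of each stored entry
};

// Returns G = A * A^T as a dense rows-by-rows matrix in row-major order.
// Assumes each column has no duplicate row indices, as in valid CSC storage.
std::vector<double> gram_aat(const CscMatrix& a) {
    assert(a.col_ptr.size() == a.cols + 1);
    assert(a.row_idx.size() == a.values.size());
    std::vector<double> g(a.rows * a.rows, 0.0);
    for (std::size_t j = 0; j < a.cols; ++j) {
        const std::size_t begin = a.col_ptr[j];
        const std::size_t end = a.col_ptr[j + 1];
        // Accumulate the outer product of column j with itself.
        for (std::size_t p = begin; p < end; ++p) {
            for (std::size_t q = p; q < end; ++q) {
                const double prod = a.values[p] * a.values[q];
                g[a.row_idx[p] * a.rows + a.row_idx[q]] += prod;
                if (q != p) {
                    // Mirror the off-diagonal contribution to keep G symmetric.
                    g[a.row_idx[q] * a.rows + a.row_idx[p]] += prod;
                }
            }
        }
    }
    return g;
}
```

The work is proportional to the sum of squared per-column nonzero counts, so this approach pays off when the columns of the unfolding are very sparse. A production version would likely accumulate into a sparse result when the Gram matrix itself is sparse.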

In terms of OS compatibility, I've tested the C++ MEX code in MATLAB on both Linux and Windows and it works fine, but I haven't tested it on Mac. If need be, I can add documentation, for example plots and optimisation results for GCP, or timings and memory consumption of the C++ code with large sparse matrix input. Equally, I can take things out of this merge if they aren't in keeping with the project.

Edited by J D
