Number of minimal volume enclosing ellipsoids (MVEE)

Status

Currently MVEE are determined by k-means clustering. Number of clusters k is determined in nclus function, by running k from min(dim, floor(n / 2*(dim+1))) to 1 until each cluster contains at least 2*(dim+1) points, where n is the number of points (one cluster and warning on failure).

Issues

Minimization of the rejection volume

Consider, for all clusterizations satisfying above criteria, and an additional criteria of a minimal gap between ellipsoids boundaries and the actual points (or, possibly related, minimal overlap between ellipsoids). This would vastly improve the rejection sampling from the viable volume in high dimensions as done by intelintg2 (used by Volint); cf. a 2-D ball example, where with a gap criteria there could be a single, perfect ball cluster:

Black points are the initial sample. Here, they were produced uniformly form the enclosing square (TODO test when produced by: OEAMC+MEBS).

It is unclear whether this would improve MEBS sampling, based on idea of growing the ellipsoids.

Also note that principle axes of the resulting ellipsoids could be more meaningful, and thus much more useful in the parameter space analysis; cf. the toy example from TopoFilter:

Lower bound

For large enough samples, in practice the number of clusters will be always equal to dim. This conservative approach avoids the issue with a lot of small clusters, for which MVEE may cut out significant volumes of the true viable, causing errors; cf. the toy example from TopoFilter - note exemplary cutouts on the right side, where viable space is rectangular:

There seem to be no justification for dim upper bound. For instance, consider viable space with three distant, unconnected regions in dim=2.

Discussion

Connected regions could be detected by MVEE overlap and treated separately (TODO: separate issue).
Doesn't it make more sense to put multiple random seed repeats for k-means directly in the clusterize procedure, as opposed to in MEBS (cf. #11, in particular the Adaptive Sample Size comment)? In the first toy example above, if the initial point for lower cluster would be in the lower left corner, one would get the expected "optimal" clusterization wrt MVEE.
Double check that MEBS does not already merge ellipsoids (first check: no, it doesn't), and if this could actually improve anything in MEBS.
Merging and splitting clusters based on MVEE seems the way to go (both operations at once if at the upper bound of number of cluster).
Computing rejected volume is not trivial. It could be approximated, for instance, by finding a convex hull for points, or by calling homosamp and counting dropped samples (more extra computation time).