improve sparse permutations (!1164) · Merge requests · libeigen / eigen

Reference issue

What does this implement/fix?

Currently, sparse permutations are performed in the following manner:

the right hand side expression is evaluated into a sparse matrix temporary (allocation)
the result is created in a new sparse matrix temporary (allocation)
the result is assigned to the destination (allocation)

If the inverse permutation is requested, then the inverse permutation is computed (allocation)

If the inner indices are permuted, the transpose of the matrix is constructed so that the transpose assignment implicitly sorts the new inner indices, which requires random access and is considerably slower.

This MR seeks to reduce the number of allocations and improve performance. The new sequence of events is:

if the right hand side is a plain sparse object, just reference it. otherwise, evaluate it into a temporary (no allocation in many cases)
the result is created in a new sparse matrix temporary (allocation)
the result is moved to the destination (no allocation if the objects are compatible, i.e. storage orders match)

If the inner indices are permuted, and the inverse permutation is requested, then the inverse permutation is computed (allocation in one of the four use cases). If the inner indices are permuted, the elements with their updated indices are inserted into the result in an unsorted fashion. the indices are sorted in place after the matrix is finalized.

In summary, the previous strategy always had 3 copies of the matrix and 50% of the use cases (outer/inner inverse permutation) requires a copy of the permutation. Now, permuting a plain matrix involves 1 copy, and 25% of use cases (inner inverse) requires a copy of the permutation. In all cases, the permuted matrix is now constructed in a manner that allows contiguous chunks of data to be copied, instead of elements one-at-a-time in random order.

Operation	Before	After	% Change
Outer	7196	3847	-47%
Inner	16125	4069	-75%
Inverse Outer	6895	4433	-36%
Inverse Inner	15918	4257	-73%

Note that the largest improvement is due to avoiding the construction of the transposed matrix (inner permutations). All cases benefit from avoiding two fewer copies of the matrix (reference xpr, move result).

Additional information

Edited Jan 14, 2023 by Charles Schlosser

improve sparse permutations

Reference issue

What does this implement/fix?

Additional information

Merge request reports