[Packages] Improvement to TArrayHelper's sorting algorithm
Summary
This merge request improves the sorting algorithm (Quicksort with Heapsort if it becomes degenerate) by switching to Insertion Sort once the partition size falls below a certain threshold (set to 10 currently). This allows it to run slightly faster due to reduced overhead. This brings the algorithm more in line with Introsort.
What is the current bug behavior?
N/A
What is the behavior after applying this patch?
The TArrayHelper<>.Sort
routine gains a slight speed gain.
Relevant logs and/or screenshots
Using tqsort_killer
as a benchmark:
--- strings ---
WITHOUT INSERTION SORT WITH INSERTION SORT
---------------------- -------------------
Order: random.
n = 1500: 2.5 ms/sort, OK 2.0 ms/sort, OK
n = 3000: 5.2 ms/sort (2.1x from previous), OK 4.2 ms/sort (2.1x from previous), OK
n = 6000: 11.8 ms/sort (2.3x from previous), OK 10.2 ms/sort (2.5x from previous), OK
Order: QSort killer.
n = 1500: 5.8 ms/sort, OK 5.5 ms/sort, OK
n = 3000: 12.4 ms/sort (2.2x from previous), OK 12.4 ms/sort (2.2x from previous), OK
n = 6000: 27.6 ms/sort (2.2x from previous), OK 26.6 ms/sort (2.2x from previous), OK
--- float32's ---
WITHOUT INSERTION SORT WITH INSERTION SORT
---------------------- -------------------
Order: random.
n = 10000: 0.9 ms/sort, OK 0.8 ms/sort, OK
n = 20000: 2.0 ms/sort (2.1x from previous), OK 1.7 ms/sort (2.1x from previous), OK
n = 40000: 4.0 ms/sort (2.0x from previous), OK 3.7 ms/sort (2.2x from previous), OK
Order: QSort killer.
n = 10000: 1.4 ms/sort, OK 1.4 ms/sort, OK
n = 20000: 3.1 ms/sort (2.2x from previous), OK 3.0 ms/sort (2.1x from previous), OK
n = 40000: 6.6 ms/sort (2.1x from previous), OK 6.5 ms/sort (2.1x from previous), OK
Times are about the same for float32, but an improvement is seen with strings. Under heavy CPU loads, the improvement is more pronounced:
--- strings ---
WITHOUT INSERTION SORT WITH INSERTION SORT
---------------------- -------------------
Order: random.
n = 1500: 5.7 ms/sort, OK 3.4 ms/sort, OK
n = 3000: 12.2 ms/sort (2.1x from previous), OK 9.3 ms/sort (2.7x from previous), OK
n = 6000: 29.4 ms/sort (2.4x from previous), OK 20.4 ms/sort (2.2x from previous), OK
Order: QSort killer.
n = 1500: 14.5 ms/sort, OK 9.9 ms/sort, OK
n = 3000: 32.6 ms/sort (2.2x from previous), OK 23.2 ms/sort (2.4x from previous), OK
n = 6000: 83.3 ms/sort (2.6x from previous), OK 57.7 ms/sort (2.5x from previous), OK
--- float32's ---
WITHOUT INSERTION SORT WITH INSERTION SORT
---------------------- -------------------
Order: random.
n = 10000: 1.5 ms/sort, OK 1.2 ms/sort, OK
n = 20000: 3.2 ms/sort (2.2x from previous), OK 2.6 ms/sort (2.1x from previous), OK
n = 40000: 6.7 ms/sort (2.1x from previous), OK 6.0 ms/sort (2.3x from previous), OK
Order: QSort killer.
n = 10000: 2.7 ms/sort, OK 2.4 ms/sort, OK
n = 20000: 5.6 ms/sort (2.1x from previous), OK 5.1 ms/sort (2.1x from previous), OK
n = 40000: 11.9 ms/sort (2.1x from previous), OK 11.5 ms/sort (2.3x from previous), OK
Edited by J. Gareth "Kit" Moreton