More numerical stability for Math.
Math sums N floats from 0 to N−1, accumulating up to O(N) round-off error (O(√N) on average, if Wikipedia is right). The simple pairwise summation method reduces this to O(log N) and, unlike its big brothers, has no speed penalty (at least not on my machine, given the chosen cutoff). For reference, Kahan summation is marginally better and 4× slower for obvious reasons, and Python uses an exact algorithm for math.fsum(), which is even more complex and even slower.
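The pairwise scheme itself is tiny: below some cutoff fall back to a plain loop, otherwise split the array in half and sum the halves recursively. A minimal C sketch (the cutoff value and the function names are illustrative assumptions, not the actual patch):

```c
#include <stddef.h>

/* Illustrative cutoff; the real value would be tuned for speed. */
#define PAIRWISE_CUTOFF 8

/* Plain left-to-right summation: error grows roughly with N. */
static double sum_linear(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Pairwise summation: recursion depth is log2(N), so the
   accumulated error grows roughly with log N instead of N. */
static double sum_pairwise(const double *a, size_t n)
{
    if (n <= PAIRWISE_CUTOFF)
        return sum_linear(a, n);
    size_t half = n / 2;
    return sum_pairwise(a, half) + sum_pairwise(a + half, n - half);
}
```

The cutoff is what keeps it as fast as the plain loop: small subarrays are summed linearly, so the recursion overhead is amortized over blocks.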
A demonstration that sums 1 plus N repeats of 0.05 / N (so that the exact result, ignoring the inexactness of the computed 0.05 / N itself, must be 1.05) using both methods:
My output on x86-64/win64:
sizeof(float) = 8
--- 1 + 0.05/100000 + ... x100000... + 0.05/100000 ---
sum_linear = 1.0500000000069889E+000
31475 ULP away from 1.05
sum_pairwise = 1.0500000000000003E+000
1 ULP away from 1.05
--- 1 + 0.05/1000000 + ... x1000000... + 0.05/1000000 ---
sum_linear = 1.0499999999181711E+000
368525 ULP away from 1.05
sum_pairwise = 1.0499999999999994E+000
3 ULP away from 1.05
--- 1 + 0.05/10000000 + ... x10000000... + 0.05/10000000 ---
sum_linear = 1.0499999996961265E+000
1368525 ULP away from 1.05
sum_pairwise = 1.0499999999999998E+000
1 ULP away from 1.05
--- 1 + 0.05/100000000 + ... x100000000... + 0.05/100000000 ---
sum_linear = 1.0500000041370186E+000
18631475 ULP away from 1.05
sum_pairwise = 1.0500000000000005E+000
2 ULP away from 1.05
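The "N ULP away" figures can be obtained by comparing the ordered bit patterns of the two doubles; for finite values of the same sign, the difference of the raw bit patterns is exactly the number of representable doubles between them. A sketch of that measurement (my assumption about how the demonstration computes it, not code from the attachment):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a double's bits as an unsigned 64-bit integer. */
static uint64_t double_bits(double x)
{
    uint64_t b;
    memcpy(&b, &x, sizeof b);
    return b;
}

/* ULP distance between two finite doubles of the same sign:
   the number of representable values separating them. */
static uint64_t ulp_distance(double a, double b)
{
    uint64_t ba = double_bits(a), bb = double_bits(b);
    return ba > bb ? ba - bb : bb - ba;
}
```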
A test that covers all changed functions:
BTW: can generics be used to reduce code repetition, or does that conflict with supporting older compiler versions?