ERMS branch for backward moves (i386).
There is an idea in the Intel optimization manual regarding overlapping move-to-the-right cases, which require performing the move backwards (right to left).
Everyone ignores it, probably because inserting more than `ErmsThreshold` bytes at the beginning of an array is quite rare. Still, the benchmark BackwardMoveBenchmark.pas (which tests `System.Move`) speeds up as follows:
| | Before | After |
|---|---|---|
| `Move(src=0, dst=2000, count=98000)` | 2.0 µs/call | 1.7 µs/call |
| 100 prependings of 2000-character string | 208 µs/call | 184 µs/call |
| `Move(src=0, dst=5000, count=245000)` | 5.2 µs/call | 4.1 µs/call |
| 100 prependings of 5000-character string | 787 µs/call | 716 µs/call |
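
For context, a minimal sketch of how prepending to a string turns into an overlapping move to the right. `Prepend` is a hypothetical helper written for illustration; it is an assumption about the kind of operation the "prependings" rows measure, not code taken from BackwardMoveBenchmark.pas or the RTL.

```pascal
program PrependDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: insert `prefix` at the start of `data` in place.
procedure Prepend(var data: AnsiString; const prefix: AnsiString);
var
  oldLen, n: SizeInt;
begin
  n := Length(prefix);
  if n = 0 then
    Exit;
  oldLen := Length(data);
  SetLength(data, oldLen + n);
  // Shift the old contents right by n characters. Source and destination
  // overlap whenever oldLen > n, and dst > src, so Move has to copy
  // right to left here.
  if oldLen > 0 then
    Move(data[1], data[1 + n], oldLen);
  // Fill the gap that opened at the start.
  Move(prefix[1], data[1], n);
end;

var
  s: AnsiString;
begin
  s := 'abcdefgh';
  Prepend(s, 'XY');
  WriteLn(s); // XYabcdefgh
end.
```

With a 2000-character prefix and a string that has already grown to tens of thousands of characters, the first `Move` is the large backward move that the new branch targets.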
Speaking of best cases: `Move(src=0, dst=20000, count=30000)` speeds up by 60% (870→540 ns/call).
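
As a reminder of why calls like this one (destination to the right of the source, ranges overlapping) must be copied right to left, here is a small sketch. `CopyForward` and `CopyBackward` are hypothetical byte-by-byte helpers for illustration only; they do not reflect how `System.Move` is implemented.

```pascal
program OverlapDemo;
{$mode objfpc}

type
  TBuf = array[0..7] of Byte;

// Hypothetical helper: naive left-to-right copy inside one buffer.
procedure CopyForward(var buf: TBuf; src, dst, count: Integer);
var
  i: Integer;
begin
  for i := 0 to count - 1 do
    buf[dst + i] := buf[src + i];
end;

// Hypothetical helper: right-to-left copy inside one buffer.
procedure CopyBackward(var buf: TBuf; src, dst, count: Integer);
var
  i: Integer;
begin
  for i := count - 1 downto 0 do
    buf[dst + i] := buf[src + i];
end;

// Fill the buffer with 0, 1, 2, ..., 7.
procedure Fill(out buf: TBuf);
var
  i: Integer;
begin
  for i := Low(buf) to High(buf) do
    buf[i] := i;
end;

// Render the buffer as a string of digits.
function AsText(const buf: TBuf): string;
var
  i: Integer;
begin
  Result := '';
  for i := Low(buf) to High(buf) do
    Result := Result + Chr(Ord('0') + buf[i]);
end;

var
  a, b: TBuf;
begin
  Fill(a);
  CopyForward(a, 0, 2, 6);  // overwrites source bytes before reading them
  WriteLn(AsText(a));       // 01010101
  Fill(b);
  CopyBackward(b, 0, 2, 6); // reads every source byte before it is overwritten
  WriteLn(AsText(b));       // 01012345
end.
```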
If you want to “thoroughly” test for correctness with something like `for src := 0 to 300 do for dst := 0 to 300 do for count := 0 to 300 do`, decrease `ErmsThreshold` to 32 (this is the minimum, due to the way checks are performed).
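
One possible shape for such a test, as a rough sketch: compare `System.Move` against a byte-by-byte reference for every `(src, dst, count)` triple. The names `RefMove`, `CountMismatches`, and `Limit` are hypothetical and not part of the RTL or of this merge request. The ERMS branch is only exercised when `count` exceeds `ErmsThreshold`, which is why the threshold has to be lowered in the RTL source first.

```pascal
program MoveCheck;
{$mode objfpc}

const
  Limit = 300; // upper bound from the loop above; lower it for a quick run

type
  // Large enough for src + count - 1 and dst + count - 1 at their maxima.
  TBuf = array[0 .. 2 * Limit] of Byte;

// Hypothetical reference: byte-by-byte move that is correct for overlaps.
procedure RefMove(var buf: TBuf; src, dst, count: SizeInt);
var
  i: SizeInt;
begin
  if dst > src then
    for i := count - 1 downto 0 do
      buf[dst + i] := buf[src + i]
  else
    for i := 0 to count - 1 do
      buf[dst + i] := buf[src + i];
end;

// Returns how many (src, dst, count) triples in 0..maxVal give a result
// that differs from the reference. maxVal must not exceed Limit.
function CountMismatches(maxVal: SizeInt): SizeInt;
var
  pattern, expected, actual: TBuf;
  src, dst, count, i: SizeInt;
begin
  Result := 0;
  for i := Low(pattern) to High(pattern) do
    pattern[i] := Byte((i * 7) + (i shr 8));
  for src := 0 to maxVal do
    for dst := 0 to maxVal do
      for count := 0 to maxVal do
      begin
        expected := pattern;
        RefMove(expected, src, dst, count);
        actual := pattern;
        Move(actual[src], actual[dst], count);
        // Compare whole buffers so stray writes outside the range show up too.
        if CompareByte(expected, actual, SizeOf(TBuf)) <> 0 then
          Inc(Result);
      end;
end;

begin
  WriteLn('mismatches: ', CountMismatches(Limit));
end.
```

The full 0..300 range is roughly 27 million `Move` calls, so expect a noticeable run time; calling `CountMismatches` with a smaller bound gives a quicker smoke test.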
Includes !560 (closed). :>