IndexWord and IndexDWord for i386 without REP SCAS.
Simplistic reimplementations of IndexWord and IndexDWord for i386 without the dreaded REP SCAS.
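For readers unfamiliar with these routines, here is a minimal sketch of what a plain-loop search looks like in Free Pascal. It is not the code from this change: the name IndexWordLoop is hypothetical, and the actual IndexWordGeneric and IndexWordAsm implementations may be structured quite differently. It only mirrors the System.IndexWord contract (index of the first match, or -1 if there is none).

```pascal
program IndexWordSketch;
{$mode objfpc}

{ Hypothetical illustration, not the routine from this merge request.
  Scans len 16-bit elements starting at buf and returns the index of the
  first element equal to w, or -1 if no element matches. }
function IndexWordLoop(const buf; len: SizeInt; w: Word): SizeInt;
var
  p: PWord;
  i: SizeInt;
begin
  p := PWord(@buf);
  for i := 0 to len - 1 do
    if p[i] = w then
      exit(i);
  Result := -1;
end;

var
  data: array[0..3] of Word = (10, 20, 30, 40);
begin
  WriteLn(IndexWordLoop(data, Length(data), 30)); { prints 2 }
  WriteLn(IndexWordLoop(data, Length(data), 99)); { prints -1 }
end.
```

A loop of this shape has essentially no setup cost, which is the property that matters for the short-array cases in the benchmark.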
Benchmark: IndexWordDWordBenchmark.pas.
My results:
                System.IndexWord    System.IndexDWord
#0 / 1          15 ns/call          15 ns/call
#9 / 10         19 ns/call          19 ns/call
#99 / 100       61 ns/call          61 ns/call
#999 / 1000     466 ns/call         485 ns/call

                IndexWordGeneric    IndexDWordGeneric
#0 / 1          2.3 ns/call         2.4 ns/call
#9 / 10         6.9 ns/call         5.1 ns/call
#99 / 100       56 ns/call          39 ns/call
#999 / 1000     475 ns/call         255 ns/call

                IndexWordAsm        IndexDWordAsm
#0 / 1          1.9 ns/call         1.9 ns/call
#9 / 10         4.4 ns/call         4.6 ns/call
#99 / 100       38 ns/call          37 ns/call
#999 / 1000     250 ns/call         252 ns/call
I wouldn't put too much faith in the 2× speedup on large arrays, but the large startup cost of REP SCAS,
which can outweigh all the remaining work on small arrays, is a known issue.
Edited by Rika