Post-modern CompareByte for x86-64/SSE2.
Copy of !397 (merged) with renamed registers etc.
Benchmark: CompareBytePostmodernBenchmarkX64.pas.
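For readers who have not looked at the routine before, here is a rough C sketch of the general SSE2 approach to a byte comparison: compare 16 bytes at a time, take a bitmask of equal lanes, and locate the first mismatch. This is not the assembly from this merge request. `compare_byte_sketch` is a hypothetical name, and the memcmp-like sign convention for the result is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical illustration, not the RTL routine.
   Returns 0 if the first len bytes are equal, otherwise the difference
   of the first mismatching pair (negative or positive). */
static int compare_byte_sketch(const void *a, const void *b, size_t len)
{
    const uint8_t *p = (const uint8_t *)a;
    const uint8_t *q = (const uint8_t *)b;
    size_t i = 0;
#if defined(__SSE2__)
    /* Process full 16-byte blocks with unaligned loads. */
    for (; i + 16 <= len; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(p + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(q + i));
        /* One bit per byte lane: 1 where the bytes are equal. */
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));
        if (mask != 0xFFFFu) {
            /* Invert so set bits mark mismatches, then find the lowest one. */
            unsigned diff = ~mask & 0xFFFFu;
            size_t k = i;
            while ((diff & 1u) == 0) { diff >>= 1; k++; }
            return (int)p[k] - (int)q[k];
        }
    }
#endif
    /* Scalar tail (and the whole buffer when SSE2 is unavailable). */
    for (; i < len; i++)
        if (p[i] != q[i])
            return (int)p[i] - (int)q[i];
    return 0;
}
```

The sketch leaves the tail to a scalar loop for simplicity; a tuned implementation would typically handle short and trailing lengths with vector loads as well.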
My results 🌼
                               SSE2 (modern)   SSE2 (postmodern)
CompareByte(#0 / 1):           2.1 ns/call     1.7 ns/call
CompareByte(#6 / 7):           2.5 ns/call     2.1 ns/call
CompareByte(#14 / 15):         2.4 ns/call     2.1 ns/call
CompareByte(#30 / 31):         2.9 ns/call     2.4 ns/call
CompareByte(#1 / 100):         2.0 ns/call     1.8 ns/call
CompareByte(#50 / 100):        4.7 ns/call     3.5 ns/call
CompareByte(#99 / 100):        6.8 ns/call     5.1 ns/call
CompareByte(#100 / 200):       6.5 ns/call     5.1 ns/call
CompareByte(#199 / 200):       12 ns/call      6.5 ns/call
CompareByte(#500 / 1000):      32 ns/call      12 ns/call
CompareByte(#999 / 1000):      47 ns/call      20 ns/call
CompareByte(#5000 / 10000):    232 ns/call     124 ns/call
CompareByte(#9999 / 10000):    444 ns/call     251 ns/call
Edited by Rika