Faster val(str, enum).

Rika requested to merge runewalsh/source:valenum into main

Iterate over the input characters one by one, narrowing the range of candidate names in the sorted name table at each step. This makes tracking “code” more natural and is often faster.

Let input be “abd” and sorted names be  _hm a aa ab aba abb abc abd ac ad b c.
Start:                                 └_hm a aa ab aba abb abc abd ac ad b c┘
After iteration 0 (“a” analyzed):          └a aa ab aba abb abc abd ac ad┘
After iteration 1 (“ab” analyzed):              └ab aba abb abc abd┘
After iteration 2 (“abd” analyzed):                            └abd┘
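
The narrowing shown above can be sketched roughly as follows. This is an illustrative Python model, not the code from this merge request: the names `narrow` and `lookup` are hypothetical, the names are assumed to be already sorted and case-normalized, and the second return value only imitates a 1-based error position (0 on success), which is an assumption about how “code” is reported.

```python
def narrow(names, lo, hi, pos, ch):
    """Shrink the half-open range [lo, hi) to names whose character at `pos` is `ch`.

    Assumes names[lo:hi] is sorted and every name in it shares the same
    first `pos` characters, so the character at `pos` is non-decreasing.
    """
    def key(i):
        # A name that ends before `pos` sorts ahead of all longer names.
        return names[i][pos] if pos < len(names[i]) else ""

    # Lower bound: first index whose character at `pos` is >= ch.
    a, b = lo, hi
    while a < b:
        m = (a + b) // 2
        if key(m) < ch:
            a = m + 1
        else:
            b = m
    new_lo = a

    # Upper bound: first index whose character at `pos` is > ch.
    b = hi
    while a < b:
        m = (a + b) // 2
        if key(m) <= ch:
            a = m + 1
        else:
            b = m
    return new_lo, a


def lookup(sorted_names, text):
    """Return (index, 0) on an exact match, else (None, error_position)."""
    lo, hi = 0, len(sorted_names)
    for pos, ch in enumerate(text):
        lo, hi = narrow(sorted_names, lo, hi, pos, ch)
        if lo == hi:
            # This character ruled out every remaining candidate.
            return None, pos + 1
    # All characters consumed: an exact match, if any, is the first
    # (shortest) name left in the range.
    if lo < hi and len(sorted_names[lo]) == len(text):
        return lo, 0
    return None, len(text) + 1


names = "_hm a aa ab aba abb abc abd ac ad b c".split()
print(lookup(names, "abd"))
```

With the example names, the ranges after each step are [1, 10), [3, 8) and [7, 8), matching the three iterations in the diagram. The error position falls out of the loop index directly, which is the sense in which tracking it becomes more natural.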

Benchmark: ValEnumBenchmark.pas.

My results:

                        i386 before     after         x86-64 before     after
Val(ImageFormat):       104 ns/call   93 ns/call        92 ns/call    87 ns/call  :(
Val(TPlatform):         151 ns/call   70 ns/call       129 ns/call    66 ns/call
Val(TChunkTypes):       104 ns/call   54 ns/call        88 ns/call    51 ns/call
Val(RandomEnum5_0):      44 ns/call   18 ns/call        37 ns/call    18 ns/call
Val(RandomEnum5_1):      37 ns/call   18 ns/call        32 ns/call    18 ns/call
Val(RandomEnum5_2):      38 ns/call   20 ns/call        33 ns/call    19 ns/call
Val(RandomEnum10_0):     53 ns/call   24 ns/call        44 ns/call    23 ns/call
Val(RandomEnum10_1):     51 ns/call   27 ns/call        42 ns/call    26 ns/call
Val(RandomEnum10_2):     48 ns/call   25 ns/call        39 ns/call    24 ns/call
Val(RandomEnum15_0):     58 ns/call   30 ns/call        48 ns/call    29 ns/call
Val(RandomEnum15_1):     55 ns/call   30 ns/call        45 ns/call    30 ns/call
Val(RandomEnum15_2):     53 ns/call   30 ns/call        46 ns/call    29 ns/call