Improve i386 SHA1Transform further.
In `sha1i386.inc`,

- (Joke part.) Use `MOVBE`s under `{$ifdef}`s, so if the user suddenly wants to recompile packages with `-CpCOREAVX2`, he might get better performance on CPUs where `MOVBE` is one µop (other CPUs supposedly translate it to the same µops as `MOV + BSWAP`).
- (Non-joke part; though still gives nothing on my side.) Save 15 instructions (14 reads + 1 write). Rounds 0–15 and 40–59 use registers to the full extent, but rounds 17–39 and 60–79 have a spare `edx` that can pass a `Data` cell calculated on round X to its next use on round X + 3.

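
For readers who do not want to trace the assembly, here is a rough Python model of both points. It is only a sketch and not the code in `sha1i386.inc`: the function names are made up for illustration, the `carry` variable stands in for the spare `edx`, and the assumption that the carry is refilled every third round inside the two ranges is mine. The round ranges are the ones quoted above. It shows that a one-step big-endian load matches a little-endian load plus byte swap, and that taking `W[t-3]` from a carried copy instead of re-reading the 16-cell buffer yields the same message schedule.

```python
import struct

MASK32 = 0xFFFFFFFF


def rol32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def bswap32(x):
    """Reverse the byte order of a 32-bit value (the effect of BSWAP)."""
    return int.from_bytes(x.to_bytes(4, "little"), "big")


def load_be_two_step(block, i):
    """Little-endian load followed by a byte swap (models MOV + BSWAP)."""
    (le,) = struct.unpack_from("<I", block, 4 * i)
    return bswap32(le)


def load_be_one_step(block, i):
    """Big-endian load in one operation (models MOVBE)."""
    (be,) = struct.unpack_from(">I", block, 4 * i)
    return be


def schedule_reference(block):
    """Textbook schedule: every input of W[t] is read back from the buffer."""
    data = [load_be_two_step(block, i) for i in range(16)]
    out = list(data)
    for t in range(16, 80):
        w = rol32(data[(t - 3) & 15] ^ data[(t - 8) & 15]
                  ^ data[(t - 14) & 15] ^ data[t & 15], 1)
        data[t & 15] = w  # slot t & 15 held W[t-16], which is now dead
        out.append(w)
    return out


def schedule_with_carry(block, ranges=((17, 39), (60, 79))):
    """Same schedule, but W[t-3] comes from a carried copy when available.

    `carry` is a hypothetical stand-in for the spare register; it is only
    used inside `ranges` (assumed inclusive), where that register is free.
    """
    data = [load_be_one_step(block, i) for i in range(16)]
    out = list(data)
    carry = None        # value kept in the "register"
    carry_round = None  # round that produced `carry`
    for t in range(16, 80):
        spare = any(lo <= t <= hi for lo, hi in ranges)
        if spare and carry_round == t - 3:
            prev3 = carry                # no buffer read needed
        else:
            prev3 = data[(t - 3) & 15]   # ordinary read from the buffer
        w = rol32(prev3 ^ data[(t - 8) & 15]
                  ^ data[(t - 14) & 15] ^ data[t & 15], 1)
        data[t & 15] = w
        out.append(w)
        # Refill the register once its previous value has been consumed
        # (or has gone stale after a stretch where the register was busy).
        if spare and (carry_round is None or carry_round <= t - 3):
            carry, carry_round = w, t
    return out
```

Running both schedule functions on the same 64-byte block should give identical 80-word lists. With a single carried value only every third round in a range can skip its read, so the savings stay small; the exact count in the real code depends on details this model does not capture.
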
Edited by Rika