Apply recent x86_64.inc ideas to i386.inc, and a bit on top.

Rika requested to merge runewalsh/source:i386-nsf into main

While working in i386.inc, I noticed the following possible improvements:

  • As in !509 (merged), certain Interlocked* functions contain odd, easily eliminated xchg instructions. This MR removes them and also makes these functions nostackframe.

  • As in !426 (merged), cpu*locked operations that check IsMultiThread can jump over a LOCK prefix, taking IsMultiThread as a second parameter passed in a high-level manner. This should be easier to inline.

  • BsfQWord has an unused .L1 label, and it could completely mirror BsrDWord but does not. I think the BsrQWord approach, with jmp <end> additionally replaced by a direct ret $8, is better: it takes zero jumps in one of the two common cases (the “primary” half is nonzero), one jump in the other common case (the “primary” half is zero, the “secondary” half is nonzero), and two jumps in the rare-to-impossible case (input = 0). The current BsfQWord, by contrast, is geared toward input = 0.

  • SarInt64 can ignore the possibility of Shift > 63 (such shifts are undefined, I hope?) and instead rely on the fact that x86 sar cl, r32 uses only the 5 lower bits of cl (so “and cl, 31” can be omitted). It can also (arguably...) skip reading the lower half unless the shift is indeed less than 32.

  • InterlockedCompareExchange64 is not published and seemingly not used by anyone. I made a separate MR.
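
To make the BsfQWord and SarInt64 points above concrete, here is a rough C model of the intended control flow. It is only a sketch, not the assembly in this MR. The helper names (bsf32, sar32, shrd32, bsf_qword_sketch, sar_int64_sketch) are made up for illustration. The value 255 returned for a zero input and the `shift >= 32` test are assumptions. Shift is assumed to be in 0..63.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit of a nonzero 32-bit value (models bsf). */
static unsigned bsf32(uint32_t v) {
    unsigned i = 0;
    while ((v & 1u) == 0) { v >>= 1; i++; }
    return i;
}

/* Order of checks discussed for BsfQWord: primary (low) half first,
   then secondary (high) half, and the zero input last.
   Returning 255 for zero is an assumption of this sketch. */
static unsigned bsf_qword_sketch(uint64_t v) {
    uint32_t lo = (uint32_t)v;
    uint32_t hi = (uint32_t)(v >> 32);
    if (lo != 0) return bsf32(lo);       /* common case 1 */
    if (hi != 0) return bsf32(hi) + 32;  /* common case 2 */
    return 255;                          /* rare case: input = 0 */
}

/* Models 32-bit sar on a bit pattern; like x86, only the 5 lower
   bits of the count are used, so no explicit "and 31" by the caller. */
static uint32_t sar32(uint32_t v, unsigned count) {
    count &= 31;
    uint32_t r = v >> count;
    if ((v & 0x80000000u) != 0 && count != 0)
        r |= ~(0xFFFFFFFFu >> count);    /* replicate the sign bit */
    return r;
}

/* Models shrd: shift lo right, filling from the low bits of hi. */
static uint32_t shrd32(uint32_t lo, uint32_t hi, unsigned count) {
    count &= 31;
    if (count == 0) return lo;
    return (lo >> count) | (hi << (32 - count));
}

/* 64-bit arithmetic shift right built from 32-bit halves.
   Assumes shift is in 0..63; larger shifts are not handled. */
static uint64_t sar_int64_sketch(uint64_t value, unsigned shift) {
    uint32_t hi = (uint32_t)(value >> 32);
    uint32_t lo;
    if (shift >= 32) {
        /* The low half of the input is never read here.
           sar32 masks the count, so shift behaves as shift - 32. */
        lo = sar32(hi, shift);
        hi = sar32(hi, 31);              /* all sign bits */
    } else {
        lo = shrd32((uint32_t)value, hi, shift);
        hi = sar32(hi, shift);
    }
    return ((uint64_t)hi << 32) | lo;
}
```

For example, sar_int64_sketch(0x0000000100000000, 32) takes the first branch and yields 1 without touching the low half of the input.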

Edited by Rika