[x86] Mov0LblCmp0Jne optimisation improvement
Summary
This merge request improves the "Mov0LblCmp0Je -> Mov0JmpLblCmp0Je" optimisation to work when an alignment hint appears before the label.
System
- Processor architecture: i386, x86_64
What is the current bug behavior?
If an alignment hint appears before the label at the moment the peephole optimizer scans the code (the align may get removed later), the "Mov0LblCmp0Je -> Mov0JmpLblCmp0Je" optimisation is not performed, even though it would otherwise be possible.
What is the behavior after applying this patch?
The optimisation is now also applied when an alignment hint precedes the label, so more occurrences of "Mov0LblCmp0Je -> Mov0JmpLblCmp0Je" should happen.
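The difference between the old and new behaviour can be illustrated with a toy model. This is only a sketch in Python, not the compiler's actual code: the tuple-based instruction format, the names `zeroed_reg` and `mov0_lbl_cmp0_je`, and the `skip_align` switch are all hypothetical, and placing the new jmp directly after the zeroing instruction (before the align) is an assumption. The real peephole optimizer performs further checks that are omitted here.

```python
from typing import List, Optional, Tuple

# Hypothetical toy encoding: ("xor", "%dil", "%dil"), ("align",), ("label", ".Lj124"), ...
Instr = Tuple[str, ...]


def zeroed_reg(ins: Instr) -> Optional[str]:
    """Return the register that `ins` sets to zero, or None if it does not."""
    if len(ins) == 3 and ins[0] == "xor" and ins[1] == ins[2]:
        return ins[1]
    if len(ins) == 3 and ins[0] == "mov" and ins[1] == "$0":
        return ins[2]
    return None


def mov0_lbl_cmp0_je(code: List[Instr], skip_align: bool = True) -> List[Instr]:
    """Insert 'jmp target' after a zeroing instruction that is followed by
    label / test reg,reg / je target. With skip_align=False, an align hint
    before the label blocks the match (modelling the old behaviour)."""
    out: List[Instr] = []
    for i, ins in enumerate(code):
        out.append(ins)
        reg = zeroed_reg(ins)
        if reg is None:
            continue
        j = i + 1
        if skip_align:
            # Step over any alignment hints sitting in front of the label.
            while j < len(code) and code[j][0] == "align":
                j += 1
        if (j + 2 < len(code)
                and code[j][0] == "label"
                and code[j + 1] == ("test", reg, reg)
                and code[j + 2][0] == "je"):
            # reg is known to be zero here, so the je would always be taken.
            out.append(("jmp", code[j + 2][1]))
    return out


if __name__ == "__main__":
    sample = [
        ("xor", "%dil", "%dil"),
        ("align",),
        ("label", ".Lj124"),
        ("test", "%dil", "%dil"),
        ("je", ".Lj128"),
    ]
    for line in mov0_lbl_cmp0_je(sample):
        print(line)
```

Calling the function with `skip_align=False` leaves the sample unchanged, which corresponds to the missed optimisation described above.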
Relevant logs and/or screenshots
For a simple case, in the agx86nsm unit of the compiler (x86_64-win64, -O4), before:
...
call ASSEMBLE$_$TEXTERNALASSEMBLEROUTPUTFILE_$__$$_ASMWRITE$SHORTSTRING
xorb %dil,%dil
.Lj124:
testb %dil,%dil
je .Lj128
...
After:
...
call ASSEMBLE$_$TEXTERNALASSEMBLEROUTPUTFILE_$__$$_ASMWRITE$SHORTSTRING
xorb %dil,%dil
jmp .Lj128
.Lj124:
testb %dil,%dil
je .Lj128
...
A cascade optimisation in the db unit of the RTL, which is made possible by this optimisation using GetNextInstructionUsingReg, before:
...
.Lj6170:
movb $1,%r13b
jmp .Lj6174
...
xorb %r13b,%r13b
addq $1,(%rbx) ; <-- GetNextInstructionUsingReg can skip over this instruction and look at the label.
.Lj6182:
testb %r13b,%r13b
je .Lj6170
...
After:
...
.Lj6170:
movb $1,%r13b
jmp .Lj6174
...
xorb %r13b,%r13b
addq $1,(%rbx)
movb $1,%r13b ; <-- OptPass2JMP replaces "jmp .Lj6170" with copies of the mov and the jmp that appear after that label.
jmp .Lj6174
.Lj6182:
testb %r13b,%r13b
je .Lj6170
...
Additional notes
With some additional work, the db optimisation can be improved further, especially as xorb %r13b,%r13b now writes a value that is overwritten by movb $1,%r13b before it is ever read. This is not yet caught because removing such a redundant write is a Pass 1 optimisation, while "Mov0LblCmp0Je -> Mov0JmpLblCmp0Je" is Pass 2.