[x86] Fixed bad register tracking in OptPass2JMP (fixes #40129)
Summary
This merge request fixes a bug that first manifested after !77 (merged). A nested optimisation attempted to simplify a future instruction in order to improve Pass 2 performance, but it did so with incorrect register tracking: it used the tracking state of the current instruction instead of that of the future instruction.
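The failure mode can be illustrated with a toy model. This is only a sketch, not the compiler's actual implementation: the function `fold_sub_mov` and the two tracking-state sets are hypothetical, and the assumption that %eax appears free in the current state but is still in use in the future state is inferred from the logs in this merge request. It shows how the same folding decision yields SubMov2Lea or SubMov2LeaSub depending on which tracking state is consulted.

```python
def fold_sub_mov(reg, imm, dst, regs_in_use):
    """Fold `sub $imm,reg; mov reg,dst` into a lea (hypothetical helper).

    If reg is still in use afterwards, the sub must be kept as well.
    """
    lea = f"leal -{imm}({reg}),{dst}"
    if reg in regs_in_use:
        return [lea, f"subl ${imm},{reg}"]  # corresponds to SubMov2LeaSub
    return [lea]                            # corresponds to SubMov2Lea


# Assumed tracking states (illustrative only).
state_at_current = frozenset()           # %eax looks released here
state_at_future = frozenset({"%eax"})    # %eax is still needed at the future instruction

# Bug: the future instruction is simplified using the current state.
buggy = fold_sub_mov("%eax", 87, "%edx", state_at_current)

# Fix: the future instruction is simplified using its own state.
fixed = fold_sub_mov("%eax", 87, "%edx", state_at_future)

print(buggy)
print(fixed)
```

In this model the buggy call drops the `subl`, leaving %eax with a stale value for any later reader, while the fixed call keeps it.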
System
- Processor architecture: i386, x86_64
What is the current bug behavior?
The example in #40129 (closed) and a routine in uriparser are incorrectly optimised.
What is the behavior after applying this patch?
Both examples should now be correct.
Relevant logs and/or screenshots
To showcase the erroneous code in uriparser (i386-win32, -O4), before:
...
.Lj72:
movzbl %al,%eax
# Peephole Optimization: SubMov2Lea
leal -87(%eax),%edx
# Peephole Optimization: Duplicated 1 assignment(s) and redirected jump
# Peephole Optimization: %edx = %eax; changed to minimise pipeline stall (MovXXX2MovXXX)
# Peephole Optimization: Mov2Nop 4 done
# Register eax released <-- %eax released too soon, triggering SubMov2Lea instead of SubMov2LeaSub
ret
.balign 16,0x90
.Lj69:
...
After:
...
.Lj72:
movzbl %al,%eax
leal -87(%eax),%edx
# Peephole Optimization: SubMov2LeaSub
subl $87,%eax
# Peephole Optimization: Duplicated 1 assignment(s) and redirected jump
# Peephole Optimization: %edx = %eax; changed to minimise pipeline stall (MovXXX2MovXXX)
# Peephole Optimization: Mov2Nop 4 done
ret
.balign 16,0x90
# Register eax released <-- %eax released in the proper place
.Lj69:
...