[x86] Refactoring and bug fix in OptPass2Movx
Summary
This merge request refactors OptPass2Movx slightly but also fixes a bug that can cause incorrect values to appear during overflow conditions and potentially cause buffer overruns.
System
- Processor architecture: i386, x86_64
What is the current bug behavior?
With constructs such as the following:
movzbl (ref),%eax
addl $1,%eax
movzbl %al,%eax
... the second movzbl instruction would be incorrectly removed, causing %eax to take on the value of 256 instead of 0 if the initial input was 255, for example.
What is the behavior after applying this patch?
A more correct optimisation is now performed in this case - for example:
movb (ref),%al
addb $1,%al
movzbl %al,%eax
Relevant logs and/or screenshots
In the IDE unit fpviews under x86_64-win64 -O4 - before (if %al was 255, then %rax will be set to 256 instead of 0, loading the wrong value into %ecx (if not causing an access violation) that's then sent to the Upcase routine):
movzbl %al,%eax
addl $1,%eax
movzbl 32(%rsp,%rax,1),%ecx
call SYSTEM_$$_UPCASE$CHAR$$CHAR
movb %al,32(%r12)
jmp .Lj146
.p2align 4,,10
.p2align 3
.Lj144:
...
After (if %al was 255, then the addition of 1 will wrap it around to 0, and then it's zero extended so %eax/%rax contains 0):
addb $1,%al
movzbl %al,%eax
movzbl 32(%rsp,%rax,1),%ecx
call SYSTEM_$$_UPCASE$CHAR$$CHAR
movb %al,32(%r12)
jmp .Lj146
.p2align 4,,10
.p2align 3
.Lj144:
...
For reference - this is the same block of code under -OoNOPEEPHOLE:
movzbl %al,%eax
leal 1(%eax),%eax
movzbl %al,%eax
movzbl 32(%rsp,%rax,1),%ecx
call SYSTEM_$$_UPCASE$CHAR$$CHAR
movb %al,32(%r12)
jmp .Lj146
.p2align 4,,10
.p2align 3
.Lj144:
...