Skip to content

[x86] Refactoring and bug fix in OptPass2Movx

Summary

This merge request refactors OptPass2Movx slightly but also fixes a bug that can cause incorrect values to appear during overflow conditions and potentially cause buffer overruns.

System

  • Processor architecture: i386, x86_64

What is the current bug behavior?

With constructs such as the following:

movzbl (ref),%eax
addl   $1,%eax
movzbl %al,%eax

... the second movzbl instruction would be incorrectly removed, causing %eax to take on the value of 256 instead of 0 if the initial input was 255, for example.

What is the behavior after applying this patch?

A more correct optimisation is now performed in this case - for example:

movb   (ref),%al
addb   $1,%al
movzbl %al,%eax

Relevant logs and/or screenshots

In the IDE unit fpviews under x86_64-win64 -O4 - before (if %al was 255, then %rax will be set to 256 instead of 0, loading the wrong value into %ecx (if not causing an access violation) that's then sent to the Upcase routine):

	movzbl	%al,%eax
	addl	$1,%eax
	movzbl	32(%rsp,%rax,1),%ecx
	call	SYSTEM_$$_UPCASE$CHAR$$CHAR
	movb	%al,32(%r12)
	jmp	.Lj146
	.p2align 4,,10
	.p2align 3
.Lj144:
        ...

After (if %al was 255, then the addition of 1 will wrap it around to 0, and then it's zero extended so %eax/%rax contains 0):

	addb	$1,%al
	movzbl	%al,%eax
	movzbl	32(%rsp,%rax,1),%ecx
	call	SYSTEM_$$_UPCASE$CHAR$$CHAR
	movb	%al,32(%r12)
	jmp	.Lj146
	.p2align 4,,10
	.p2align 3
.Lj144:
        ...

For reference - this is the same block of code under -OoNOPEEPHOLE:

	movzbl	%al,%eax
	leal	1(%eax),%eax
	movzbl	%al,%eax
	movzbl	32(%rsp,%rax,1),%ecx
	call	SYSTEM_$$_UPCASE$CHAR$$CHAR
	movb	%al,32(%r12)
	jmp	.Lj146
	.p2align 4,,10
	.p2align 3
.Lj144:
        ...
Edited by J. Gareth "Kit" Moreton

Merge request reports