[x86] Fixed bad register tracking in OptPass2JMP (fixes #40129) (!370)

J. Gareth "Kit" Moreton requested to merge CuriousKit/optimisations:i40129 into main Feb 02, 2023

Summary

This merge request fixes a bug that first manifested after !77 (merged). A nested optimisation attempted to simplify a future instruction in order to improve Pass 2 performance, but did so using incorrect register tracking: it used the tracking state from the current instruction instead of the state at the future instruction.
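
To illustrate the kind of mismatch described above, here is a minimal toy model in Python. It is not FPC source: the marker stream, the `used_regs_at` helper and the instruction placeholders are all hypothetical, and the real optimiser tracks register state differently in detail. It only shows that asking "is this register in use?" with the state of the current instruction can give a different answer from asking with the state of the future instruction.

```python
# Toy model only: not FPC source. Each entry is a register-allocation
# marker or an instruction placeholder; the stream is invented for illustration.
STREAM = [
    ("alloc", "edx"),
    ("instr", "current instruction"),   # index 1: the instruction being optimised
    ("alloc", "eax"),
    ("instr", "future instruction"),    # index 3: the instruction being simplified early
]

def used_regs_at(stream, index):
    """Replay alloc/release markers to find the registers in use just before stream[index]."""
    used = set()
    for kind, arg in stream[:index]:
        if kind == "alloc":
            used.add(arg)
        elif kind == "release":
            used.discard(arg)
    return used

CURRENT, FUTURE = 1, 3

# Stale query: judge the future instruction with the current instruction's state.
stale_answer = "eax" in used_regs_at(STREAM, CURRENT)
# Correct query: advance the tracking state to the future instruction first.
correct_answer = "eax" in used_regs_at(STREAM, FUTURE)

print(stale_answer, correct_answer)  # prints: False True
```

In this sketch the stale query reports `eax` as unused, which is the sort of wrong answer that would let an optimisation discard a value that is still needed.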

System

Processor architecture: i386, x86_64

What is the current bug behavior?

The example in #40129 (closed) and a routine in uriparser are incorrectly optimised.

What is the behavior after applying this patch?

Both examples should now be correct.

Relevant logs and/or screenshots

Erroneous code generated for a routine in uriparser (i386-win32, -O4). Before:

	...
.Lj72:
	movzbl	%al,%eax
# Peephole Optimization: SubMov2Lea
	leal	-87(%eax),%edx
# Peephole Optimization: Duplicated 1 assignment(s) and redirected jump
# Peephole Optimization: %edx = %eax; changed to minimise pipeline stall (MovXXX2MovXXX)
# Peephole Optimization: Mov2Nop 4 done
	# Register eax released <-- %eax released too soon, triggering SubMov2Lea instead of SubMov2LeaSub
	ret
	.balign 16,0x90
.Lj69:
	...

After:

	...
.Lj72:
	movzbl	%al,%eax
	leal	-87(%eax),%edx
# Peephole Optimization: SubMov2LeaSub
	subl	$87,%eax
# Peephole Optimization: Duplicated 1 assignment(s) and redirected jump
# Peephole Optimization: %edx = %eax; changed to minimise pipeline stall (MovXXX2MovXXX)
# Peephole Optimization: Mov2Nop 4 done
	ret
	.balign 16,0x90
	# Register eax released <-- %eax released in the proper place
.Lj69:
	...
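
The difference between the two listings can be checked with a small simulation. This is a sketch under stated assumptions: it models only the three instructions shown (`movzbl`, `leal`, `subl`) on 32-bit values, ignores flags, and assumes the code after `ret` reads the result from `%eax`, which is what the "released too soon" annotation implies. The function names and the sample input are made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as 32-bit unsigned values

def run_before(eax):
    """Instruction sequence from the 'before' listing."""
    eax = eax & 0xFF             # movzbl %al,%eax
    edx = (eax - 87) & MASK32    # leal -87(%eax),%edx
    return eax, edx              # ret (%eax was never updated)

def run_after(eax):
    """Instruction sequence from the 'after' listing."""
    eax = eax & 0xFF             # movzbl %al,%eax
    edx = (eax - 87) & MASK32    # leal -87(%eax),%edx
    eax = (eax - 87) & MASK32    # subl $87,%eax
    return eax, edx              # ret

sample = ord("a")                # 97, an arbitrary sample value for %al
print(run_before(sample))        # prints: (97, 10)
print(run_after(sample))         # prints: (10, 10)
```

In the "before" sequence only `%edx` holds the subtracted value, so a caller reading `%eax` sees the unsubtracted byte. With `SubMov2LeaSub` both registers hold the same value, which is what the later `%edx = %eax` optimisation relies on.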
