J. Gareth "Kit" Moreton requested to merge CuriousKit/optimisations:i39918 into main Oct 13, 2022

Summary

This merge request fixes a problem first identified in #39918 (closed) where an incorrect peephole optimization was performed.

System

Processor architecture: i386, x86_64

What is the current bug behavior?

In some situations, the "AddMov2Mov" optimisation is performed when one of the registers gets modified between the two original instructions, causing an incorrect address to result in the final instruction, either triggering SIGSEGV or causing unpredictable results.

What is the behavior after applying this patch?

The optimisation is now performed in such a way that the modified register does not affect the outcome.

Relevant logs and/or screenshots

Buggy assembly dump in the new tests/webtbs/tw39918.pp test (trunk);

	...
# [41] with ObjectBArray[BIndex], ObjectCArray[ObjectAArray[0].Index] do
	movq	U_$P$TW39918_$$_OBJECTBARRAY(%rip),%rax
	# Register rdx allocated
	movswq	TC_$P$TW39918_$$_BINDEX(%rip),%rdx
	imulq	$264,%rdx,%rdx
# Peephole Optimization: Lea2AddBase done
# Peephole Optimization: AddMov2Mov done
	movq	U_$P$TW39918_$$_OBJECTCARRAY(%rip),%rdx
	movq	(%rax,%rdx),%rax ; <-- This instruction is incorrect
	movl	(%rax),%eax
	...

With the initial bug fix (AddMov2Mov is not performed):

	...
# [41] with ObjectBArray[BIndex], ObjectCArray[ObjectAArray[0].Index] do
	movq	U_$P$TW39918_$$_OBJECTBARRAY(%rip),%rax
	# Register rdx allocated
	movswq	TC_$P$TW39918_$$_BINDEX(%rip),%rdx
	imulq	$264,%rdx,%rdx
# Peephole Optimization: Lea2AddBase done
	addq	%rdx,%rax
	# Register rdx released
	# Register rdx allocated
	movq	U_$P$TW39918_$$_OBJECTCARRAY(%rip),%rdx ; <-- Second register gets modified here, so addq %rdx,%rax; movq (%rax),%rax cannot be merged.
	movq	(%rax),%rax
	movl	(%rax),%eax
	...

With the adjusted optimisation:

	...
# [41] with ObjectBArray[BIndex], ObjectCArray[ObjectAArray[0].Index] do
	movq	U_$P$TW39918_$$_OBJECTBARRAY(%rip),%rax
	# Register rdx allocated
	movswq	TC_$P$TW39918_$$_BINDEX(%rip),%rdx
	imulq	$264,%rdx,%rdx
# Peephole Optimization: Lea2AddBase done
# Peephole Optimization: AddMov2Mov done (instruction moved)
	movq	(%rax,%rdx),%rax ; <-- MOV instruction is translocated.
	# Register rdx released
	# Register rdx allocated
	movq	U_$P$TW39918_$$_OBJECTCARRAY(%rip),%rdx
	movl	(%rax),%eax
	...

Additional Notes

There are three commits with this merge request:

The first adds the new test.
The second fixes the bug so the optimisation is not performed if the second register is modified.
The third modifies the fix so the optimisation is performed when the second register is modified, but does so by translocating the MOV instruction to appear right after the ADD instruction, so a saving can still be made (CPU port allocation might not be as efficient though).

[x86] Bug Fix: Incorrect optimisation in i39918 fixed