[x86] MOVXX-Op-MOVXX extension (and new utility function)

Summary

This merge request modifies the OptPass1_V_MOVAP optimisation so that it uses GetNextInstructionUsingReg in more places where it previously used GetNextInstruction. This is primarily in response to another optimisation I was working on, but it proves to be quite good by itself.

To help accommodate the improved optimisations, a new UpdateUsedRegsBetween utility routine has been introduced to ensure that registers are properly tracked, so the peephole optimiser knows when it is safe to remove instructions.

System

  • Processor architecture: i386, x86_64

What is the current bug behavior?

N/A

What is the behavior after applying this patch?

More MM optimisations are performed across a fair number of units.

Relevant logs and/or screenshots

In the classes unit compiled with -CpCOREAVX, the two vmovapd instructions marked below are removed because the registers are not used or passed as parameters into other routines:

.section .text.n_classes$_$objecttexttobinary$tstream$tstream_$$_writeextended$double,"ax"
	.balign 16,0x90
.globl	CLASSES$_$OBJECTTEXTTOBINARY$TSTREAM$TSTREAM_$$_WRITEEXTENDED$DOUBLE
CLASSES$_$OBJECTTEXTTOBINARY$TSTREAM$TSTREAM_$$_WRITEEXTENDED$DOUBLE:
.seh_proc CLASSES$_$OBJECTTEXTTOBINARY$TSTREAM$TSTREAM_$$_WRITEEXTENDED$DOUBLE
	pushq	%rbx
.seh_pushreg %rbx
	leaq	-48(%rsp),%rsp
.seh_stackalloc 48
.seh_endprologue
	movq	%rcx,%rbx
	vmovapd	%xmm1,%xmm0 ; <-- Removed
	leaq	32(%rsp),%r8
	vmovapd	%xmm0,%xmm1 ; <-- Removed
	call	CLASSES$_$OBJECTTEXTTOBINARY$TSTREAM$TSTREAM_$$_DOUBLETOEXTENDED$DOUBLE$POINTER
	leaq	32(%rsp),%rdx
	movq	-16(%rbx),%rcx
	movl	$10,%r8d
	call	CLASSES$_$TSTREAM_$__$$_WRITEBUFFER$formal$LONGINT
	nop
	leaq	48(%rsp),%rsp
	popq	%rbx
	ret
.seh_endproc

This is the main advantage that the optimisation gives - it tends to remove unnecessary cross-writing of registers. Quite a few examples appear in the RTL.
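
To illustrate the idea outside the compiler, here is a toy Python model of the pattern in the listing above. It is only a sketch under stated assumptions: the names Ins, mov, next_using_reg, live_after and fold_mov_pairs are hypothetical and are not the compiler's routines, each instruction carries explicit read/write register sets, control flow is ignored, and nothing is assumed live at the end of the sequence. The real OptPass1_V_MOVAP code relies on the compiler's own register tracking instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ins:
    """Toy instruction: an opcode plus the registers it reads and writes."""
    op: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def mov(src, dst):
    """Register-to-register move (AT&T operand order: source, destination)."""
    return Ins("mov", frozenset({src}), frozenset({dst}))


def next_using_reg(code, start, reg):
    """Index of the first instruction after `start` that touches `reg`, else None."""
    for i in range(start + 1, len(code)):
        if reg in code[i].reads or reg in code[i].writes:
            return i
    return None


def live_after(code, idx, reg):
    """True if `reg` is read after `idx` before it is overwritten."""
    for ins in code[idx + 1:]:
        if reg in ins.reads:
            return True
        if reg in ins.writes:
            return False
    return False  # assumption: nothing is live past the end of the sequence


def fold_mov_pairs(code):
    """Remove 'mov a,b ... mov b,a' cross-writes where that is safe."""
    code = list(code)
    i = 0
    while i < len(code):
        first = code[i]
        if first.op == "mov":
            (src,) = first.reads
            (dst,) = first.writes
            j = next_using_reg(code, i, dst)
            if (j is not None and code[j].op == "mov"
                    and code[j].reads == {dst} and code[j].writes == {src}
                    and not any(src in c.writes for c in code[i + 1:j])):
                del code[j]  # src still holds the value, so the copy back is redundant
                if not live_after(code, i, dst):
                    del code[i]  # dst is never read again, so the first move goes too
                    continue
        i += 1
    return code


# The relevant part of the listing, with the call modelled as reading its
# parameter registers and clobbering %xmm0 (an assumption of this sketch).
example = [
    mov("rcx", "rbx"),
    mov("xmm1", "xmm0"),
    Ins("lea", frozenset({"rsp"}), frozenset({"r8"})),
    mov("xmm0", "xmm1"),
    Ins("call", frozenset({"rcx", "xmm1", "r8"}), frozenset({"rax", "xmm0"})),
]
print([ins.op for ins in fold_mov_pairs(example)])
```

The forward scan for the next instruction that touches the destination register is what lets the pair be matched even though an unrelated leaq sits between the two moves; looking only at the immediately following instruction would miss it.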
