[x86] POPCNT and extraneous MOV optimisations
Summary
This merge request adds optimisations for the POPCNT and LZCNT instructions and closely related sequences:
- for "popcnt %reg1,%reg2; test %reg2,%reg2", the test instruction gets removed (it also works for "test %reg1,%reg1"). Similarly for "lzcnt %reg1,%reg2; test %reg2,%reg2" (although not "test %reg1,%reg1" this time). This is a simple extension of PostPeepholeOptTestOr.
- At the end of OptPass1MOV there is now a 'backward optimisation' that looks for "func (oper),%reg1; mov %reg1,%reg2; (dealloc %reg1)" and changes it to "func (oper),%reg2". Here, 'func' is any instruction carrying the Rop1 and Wop2 flags other than CMOV (which might not write to its destination). It was originally designed to optimise POPCNT assignments, but it also optimises sequences such as "cvtsd2si %xmm0,%rax; movq %rax,%rcx".
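To illustrate the first optimisation, a hypothetical sequence (the registers and label are made up for the example) - before:

```
popcntq	%rax,%rdx
testq	%rdx,%rdx
je	.Lzero
```

After, the conditional branch consumes the zero flag set by POPCNT itself:

```
popcntq	%rax,%rdx
je	.Lzero
```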
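The backward MOV optimisation can be sketched on a made-up POPCNT assignment (registers are hypothetical; the comment marks where the allocator releases %rax) - before:

```
popcntq	(%rbx),%rax
movq	%rax,%rcx
# dealloc %rax
```

After:

```
popcntq	(%rbx),%rcx
```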
System
- Processor architecture: i386, x86_64
What is the current bug behavior?
N/A
What is the behavior after applying this patch?
- Some code sequences that use POPCNT and LZCNT are now more efficient.
- Some more extraneous MOV instructions are removed.
Additional notes
- Because the POPCNT/TEST optimisation is not triggered anywhere in the compiler, RTL or packages, two new tests have been introduced to evaluate the optimisations, namely "tests/test/opt/tpopcnt1.pp" and "tests/test/opt/tpopcnt2.pp".
Relevant logs and/or screenshots
The Variants unit, compiled with "-CpCOREAVX -OpCOREAVX -CfAVX", receives the most improvement, with many optimisations like the following. Before:
vdivsd %xmm0,%xmm6,%xmm0
vmulsd %xmm0,%xmm8,%xmm0
vcvtsd2si %xmm0,%rax
movq %rax,%rbx
jmp .Lj689
.p2align 4,,10
.p2align 3
.Lj708:
After:
vdivsd %xmm0,%xmm6,%xmm0
vmulsd %xmm0,%xmm8,%xmm0
vcvtsd2si %xmm0,%rbx
jmp .Lj689
.p2align 4,,10
.p2align 3
.Lj708: