
x86_64: Fix to volatile register list when not under AVX512

This patch fixes a minor oversight in the list of volatile registers, ensuring that the AVX512-only registers are not listed as available when that feature set is either not enabled or not available. This is especially important because there is now a command to get a free MM register in the peephole optimizer, which could theoretically pick one of these registers if the earlier ones are in use (see !48 (closed)).
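
To illustrate the failure mode, here is a minimal sketch in Python rather than the compiler's own Pascal. All names are hypothetical and are not FPC identifiers. The register sets are an assumption based on the Win64 convention (xmm0 to xmm5 volatile, xmm16 to xmm31 only present under AVX512). It shows how gating the volatile list on the feature set keeps a "get free MM register" search from returning a register that does not exist on the target:

```python
# Hypothetical model of the idea; none of these names come from the FPC sources.
# Assumed register sets (Win64-style): xmm0..xmm5 are always volatile,
# xmm16..xmm31 are volatile but only exist when AVX512 is enabled.
BASE_VOLATILE_MM = [f"xmm{i}" for i in range(0, 6)]
AVX512_ONLY_MM = [f"xmm{i}" for i in range(16, 32)]


def volatile_mm_registers(avx512_enabled):
    """Return the volatile MM registers valid for the current feature set."""
    regs = list(BASE_VOLATILE_MM)
    if avx512_enabled:
        # Only expose the extended registers when the feature set allows them.
        regs += AVX512_ONLY_MM
    return regs


def get_free_mm_register(in_use, avx512_enabled):
    """Pick the first volatile MM register not in use, or None if all are taken."""
    for reg in volatile_mm_registers(avx512_enabled):
        if reg not in in_use:
            return reg
    return None
```

With all of xmm0 to xmm5 in use, the search returns `None` when AVX512 is off and only falls through to xmm16 when it is on. An ungated list would hand out xmm16 in both cases.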

Criteria

Confirm correct compilation and no test regressions under any CPU feature mode for x86_64.

Notes

For reasons I cannot explain, the RTL changes slightly and seems to be more efficient as a result. This might be due to how 'colours' are assigned in the register allocator. For example, in the Math unit under -O4 on the trunk:

.globl	MATH_$$_DEGNORMALIZE$SINGLE$$SINGLE
MATH_$$_DEGNORMALIZE$SINGLE$$SINGLE:
.seh_proc MATH_$$_DEGNORMALIZE$SINGLE$$SINGLE
	leaq	-56(%rsp),%rsp
.seh_stackalloc 56
	movdqa	%xmm6,32(%rsp)
.seh_savexmm %xmm6, 32
.seh_endprologue
	cvtss2sd	%xmm0,%xmm6
	mulss	_$MATH$_Ld9(%rip),%xmm0
	cvtss2sd	%xmm0,%xmm0
	call	fpc_int_real
	mulsd	_$MATH$_Ld10(%rip),%xmm0
	movapd	%xmm6,%xmm1
	subsd	%xmm0,%xmm1
	cvtsd2ss	%xmm1,%xmm0
	xorps	%xmm1,%xmm1
	comiss	%xmm1,%xmm0
	jp	.Lj65
	jnb	.Lj65
        ...

On this merge request:

.globl	MATH_$$_DEGNORMALIZE$SINGLE$$SINGLE
MATH_$$_DEGNORMALIZE$SINGLE$$SINGLE:
.seh_proc MATH_$$_DEGNORMALIZE$SINGLE$$SINGLE
	leaq	-56(%rsp),%rsp
.seh_stackalloc 56
	movdqa	%xmm6,32(%rsp)
.seh_savexmm %xmm6, 32
.seh_endprologue
	cvtss2sd	%xmm0,%xmm6
	mulss	_$MATH$_Ld9(%rip),%xmm0
	cvtss2sd	%xmm0,%xmm0
	call	fpc_int_real
	mulsd	_$MATH$_Ld10(%rip),%xmm0
	subsd	%xmm0,%xmm6
	cvtsd2ss	%xmm6,%xmm0
	xorps	%xmm1,%xmm1
	comiss	%xmm1,%xmm0
	jp	.Lj65
	jnb	.Lj65
        ...

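In this example, the movapd copy into %xmm1 is no longer needed because subsd now operates on %xmm6 directly, which saves one instruction. Everything I've seen so far seems to be more efficient as a result.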
