Skip to content

[x86 / Bug Fix] CMOV reference bug fix

J. Gareth "Kit" Moreton requested to merge CuriousKit/optimisations:i40165 into main

Summary

This merge request fixes a bug subtly introduced in !198 (merged) (specifically 4d57dee8) where the peephole optimizer allowed a potentially unsafe reference to be used in CMOV instructions, but failed to detect if one or more of its registers got modified in between the condition and the move, thereby potentially causing null pointer deallocations or unpredictable behaviour.

System

  • Processor architecture: i386, x86_64

What is the current bug behavior?

Under some compiler configurations, TObject.GetInterfaceByStr , and a few other random functions, were malformed and would crash if called.

What is the behavior after applying this patch?

TObject.GetInterfaceByStr now works as expected.

Relevant logs and/or screenshots

For TObject.GetInterfaceByStr under x86_64-win64, -O1 -OoNOREGVAR (this didn't trigger at higher optimisation settings because the first reference tended to use a different register) - before:

	...
	movq	64(%rsp),%rax
	xorl	%ecx,%ecx
	cmpq	$0,(%rax)
	movq	64(%rsp),%rax
	movq	(%rax),%rax // <-- %rax may now contain zero
	movq	(%rax),%rax // <-- If %rax = 0, access violation
	cmoveq	%rcx,%rax
	...

(%rax) is considered permissible because it appears in the comparison instruction, but now the peephole optimizer will detect when one of its registers (%rax in this case) gets modified. When it does, it will no longer allow the reference, movq 64(%rsp),%rax is permitted, but now movq (%rax),%rax is denied because %rax has changed value, and as such the optimisation is no longer performed - after:

	...
	movq	64(%rsp),%rax
	cmpq	$0,(%rax)
	je	.Lj5223
	movq	64(%rsp),%rax
	movq	(%rax),%rax
	movq	(%rax),%rax
	jmp	.Lj5224
	.p2align 4,,10
	.p2align 3
.Lj5223:
	xorl	%eax,%eax
	jmp	.Lj5220 // (<-- this jump is inserted as part of a separate peephole optimisation)
.Lj5224:
	...

See #40165 (closed) for more information.

Outside of the main bug, a couple of undetected, potential code generation errors under -O4 were also fixed. In the Unzip unit's unzReadCurrentFile function - before:

	...
	movq	56(%rsp),%rax
	cmpl	364(%rax),%edi
	cmovgq	56(%rsp),%rax // <-- Reference gets changed
	cmovgl	364(%rax),%edi
	...

By long-winded logic, this is actually safe because %rax never actually changes value even if the condition is met (higher optimization levels will remove what becomes cmovgq 56(%rsp),%rax completely), but in some (valid) cases, such instructions aren't changed to a CMOV. After:

	...
	movq	56(%rsp),%rax
	cmpl	364(%rax),%edi
	jng	.Lj339
	movq	56(%rsp),%rax
	movl	364(%rax),%edi
.Lj339:
	...

Similarly in the jdphuff unit's start_pass_phuff_decoder procedure - before:

	...
	leal	1(%esi),%eax
	addl	$1,%esi
	xorl	%r12d,%r12d
	cmpl	$0,(%rdi,%rax,4)
	cmovnll	%esi,%eax
	cmovnll	(%rdi,%rax,4),%r12d
	...

After:

	...
	leal	1(%esi),%eax
	addl	$1,%esi
	cmpl	$0,(%rdi,%rax,4)
	jnl	.Lj37
	xorl	%r12d,%r12d
	jmp	.Lj38
	.p2align 4,,10
	.p2align 3
.Lj37:
	movl	%esi,%eax
	movl	(%rdi,%rax,4),%r12d
.Lj38:
	...

Once again, by roundabout logic, this was safe because while %esi and %eax differed only by 1, %eax wasn't modified if the conditions weren't met, but until and unless a watertight way is developed to ensure it is always safe and not lead to situations like with TObject.GetInterfaceByStr above, this is best left unoptimised.

Edited by J. Gareth "Kit" Moreton

Merge request reports