
[x86 / Refactor] Proper allocation of FLAGS register

Summary

This merge request makes some minor changes to the code generator under x86 to ensure the FLAGS register is properly allocated when it's used in conditional branches and the like. This helps with peephole optimisation and will permit better optimisations in future.

A couple of peephole optimisations that previously relied on the incorrect tracking have been adjusted and now work far better.

System

  • Processor architecture: i386, x86_64, i8086

What is the current bug behavior?

Sometimes in the peephole optimizer stage, the FLAGS register is not tracked properly. This limits the optimisations that can be made and in very rare cases may cause bad machine code to be generated.

What is the behavior after applying this patch?

The FLAGS register is now properly tracked (hopefully) around conditional statements.
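As a rough illustration of what "tracked" means here, the sketch below models an instruction stream in which allocation and deallocation markers bracket the span where FLAGS holds a live value. The tuple layout, the marker names and `flags_in_use_after` are all hypothetical and are not the compiler's actual data structures; the point is only that a peephole pass can answer "is FLAGS live here?" reliably when the markers are present.

```python
# Toy model only: the stream layout and marker names are assumptions,
# not the compiler's real representation.

def flags_in_use_after(stream, index):
    """Return True if FLAGS is marked as allocated once stream[index] is processed."""
    in_use = False
    for kind, arg in stream[: index + 1]:
        if kind == "alloc" and arg == "FLAGS":
            in_use = True
        elif kind == "dealloc" and arg == "FLAGS":
            in_use = False
    return in_use


# Stream with markers around the span where FLAGS carries a live value.
tracked = [
    ("alloc", "FLAGS"),
    ("instr", "testl %eax,%eax"),
    ("instr", "je .Lj781"),
    ("instr", "cmpl $4,%eax"),
    ("instr", ".Lj781:"),
    ("instr", "seteb %bl"),
    ("dealloc", "FLAGS"),
]

# The same instructions with the markers missing.
untracked = [item for item in tracked if item[0] == "instr"]

if __name__ == "__main__":
    # With markers, FLAGS is reported live after the CMP (index 3).
    print(flags_in_use_after(tracked, 3))
    # Without markers, the same point (index 2) looks free, which is wrong.
    print(flags_in_use_after(untracked, 2))
```

In this model, a pass that consults the untracked stream would wrongly conclude that FLAGS is free between the comparison and the `seteb`, which is the kind of situation that could lead to bad code or force optimisations to be overly cautious.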

Relevant logs and/or screenshots

This is a refactor, so the generated code should not change in most circumstances. Across the RTL and the compiler, though, one change appears in the ncal unit under -O4 on x86_64-win64. Before:

	...
.Lj779:
	movq	152(%rcx),%rax
	movzbl	113(%rax),%eax
	testl	%eax,%eax
	je	.Lj781
	cmpl	$4,%eax
.Lj781:
	seteb	%bl
	...

In this situation, proper FLAGS tracking and an adjustment to the "CMP/JE/CMP/@Lbl/SETE -> CMP/SETE/CMP/SETE/OR" optimisation allow it to be applied in a place where it was missed in the trunk. After:

	...
.Lj779:
	movq	152(%rcx),%rax
	movzbl	113(%rax),%eax
	testl	%eax,%eax
	seteb	%dl
	cmpl	$4,%eax
	seteb	%bl
	orb	%dl,%bl
	...
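To check that the two listings above compute the same result, here is a minimal Python model of each sequence. It assumes only the zero flag matters and that the loaded value is a byte (because of the `movzbl`); the function names are made up for this illustration.

```python
def before_branch(value):
    """Model of the trunk output: TEST, JE over the CMP, SETE at the label."""
    eax = value & 0xFF              # movzbl 113(%rax),%eax
    zf = (eax & eax) == 0           # testl %eax,%eax
    if not zf:                      # je .Lj781 skips the compare when ZF is set
        zf = (eax - 4) == 0         # cmpl $4,%eax
    return int(zf)                  # .Lj781: seteb %bl


def after_branchless(value):
    """Model of the patched output: two SETE results combined with OR."""
    eax = value & 0xFF              # movzbl 113(%rax),%eax
    dl = int((eax & eax) == 0)      # testl %eax,%eax ; seteb %dl
    bl = int((eax - 4) == 0)        # cmpl $4,%eax ; seteb %bl
    return bl | dl                  # orb %dl,%bl


def mismatches():
    """List every byte value for which the two models disagree."""
    return [v for v in range(256) if before_branch(v) != after_branchless(v)]


if __name__ == "__main__":
    print(mismatches())
```

Both versions set the result to 1 exactly when the byte is 0 or 4, so the list of mismatches is empty. The branchless form needs a scratch register (`%dl` in the listing) to hold the first result, since the second comparison overwrites FLAGS.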
