[x86 / Refactor] Proper allocation of FLAGS register
Summary
This merge request makes some minor changes to the x86 code generator to ensure the FLAGS register is properly allocated when it's used in conditional branches and the like. This helps with peephole optimisation and will permit better optimisations in future.
A couple of peephole optimisations that previously relied on the incorrect tracking have been adjusted, and now work far better.
System
- Processor architecture: i386, x86_64, i8086
What is the current bug behavior?
Sometimes in the peephole optimizer stage, the FLAGS register is not tracked properly. This limits the optimisations that can be made and in very rare cases may cause bad machine code to be generated.
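To illustrate why tracking matters, here is a toy model of FLAGS liveness over straight-line code. It is only a sketch: `Instr` and `flags_live_after` are hypothetical names invented for this example, it ignores jumps and labels entirely, and it does not reflect the compiler's actual data structures. The point is simply that a peephole pass can only move or insert a FLAGS-clobbering instruction at a position where FLAGS is known to be dead.

```python
from typing import List, NamedTuple


class Instr(NamedTuple):
    """Hypothetical instruction record for this sketch only."""
    text: str
    writes_flags: bool
    reads_flags: bool


def flags_live_after(code: List[Instr]) -> List[bool]:
    """For each instruction, report whether FLAGS is still needed after it.

    Walks the block backwards: a read makes FLAGS live above it,
    a write ends the live range (unless the same instruction also reads).
    """
    live = False  # assumption: FLAGS is dead at the end of the block
    result = [False] * len(code)
    for i in range(len(code) - 1, -1, -1):
        result[i] = live
        if code[i].writes_flags:
            live = False
        if code[i].reads_flags:
            live = True
    return result


code = [
    Instr("cmpl $4,%eax", writes_flags=True, reads_flags=False),
    Instr("movl %ecx,%edx", writes_flags=False, reads_flags=False),
    Instr("seteb %bl", writes_flags=False, reads_flags=True),
]
print(flags_live_after(code))  # [True, True, False]
```

In this model FLAGS is live after both the `cmpl` and the `movl`, so nothing that overwrites FLAGS may be placed between the compare and the `seteb`. If that live range is not recorded, an optimisation either has to be skipped or risks clobbering a value that is still needed.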
What is the behavior after applying this patch?
The FLAGS register is now properly tracked (hopefully) around conditional statements.
Relevant logs and/or screenshots
This is a refactor, so the generated code should not change in most circumstances. Across the RTL and the compiler, though, one change appears in ncal under -O4, x86_64-win64. Before:
...
.Lj779:
movq 152(%rcx),%rax
movzbl 113(%rax),%eax
testl %eax,%eax
je .Lj781
cmpl $4,%eax
.Lj781:
seteb %bl
...
In this situation, proper FLAGS tracking and an adjustment to the "CMP/JE/CMP/@Lbl/SETE -> CMP/SETE/CMP/SETE/OR" optimisation allow it to be applied in a place where it was missed in trunk. After:
...
.Lj779:
movq 152(%rcx),%rax
movzbl 113(%rax),%eax
testl %eax,%eax
seteb %dl
cmpl $4,%eax
seteb %bl
orb %dl,%bl
...
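As a sanity check that the two listings compute the same result, the flag behaviour can be modelled in a few lines of Python. This is a sketch only: `before` and `after` are hypothetical helper names, and only the zero flag is modelled, since that is all `je` and `seteb` look at here.

```python
def before(eax: int) -> int:
    """Model of the branching sequence: TEST/JE/CMP/label/SETE."""
    zf = (eax & eax) == 0        # testl %eax,%eax sets ZF when eax is zero
    if not zf:                   # je .Lj781 is not taken
        zf = eax == 4            # cmpl $4,%eax sets ZF when eax equals 4
    return int(zf)               # seteb %bl


def after(eax: int) -> int:
    """Model of the branch-free sequence: TEST/SETE/CMP/SETE/OR."""
    dl = int((eax & eax) == 0)   # testl %eax,%eax ; seteb %dl
    bl = int(eax == 4)           # cmpl $4,%eax ; seteb %bl
    return bl | dl               # orb %dl,%bl


# movzbl leaves a zero-extended byte in %eax, so 0..255 covers every input.
assert all(before(v) == after(v) for v in range(256))
```

Both versions leave 1 in %bl exactly when the loaded byte is 0 or 4. The new sequence removes the conditional jump at the cost of one extra byte register (%dl), which is presumably why the optimisation needs accurate register and FLAGS usage information at that point.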