[x86] Bug Fix: Incorrect optimisation in i39918 fixed
Summary
This merge request fixes a problem first identified in #39918 (closed) where an incorrect peephole optimization was performed.
System
- Processor architecture: i386, x86_64
What is the current bug behavior?
In some situations, the "AddMov2Mov" optimisation is performed when one of the registers gets modified between the two original instructions, causing an incorrect address to result in the final instruction, either triggering SIGSEGV or causing unpredictable results.
What is the behavior after applying this patch?
The optimisation is now performed in such a way that the modified register does not affect the outcome.
Relevant logs and/or screenshots
Buggy assembly dump in the new tests/webtbs/tw39918.pp test (trunk);
...
# [41] with ObjectBArray[BIndex], ObjectCArray[ObjectAArray[0].Index] do
movq U_$P$TW39918_$$_OBJECTBARRAY(%rip),%rax
# Register rdx allocated
movswq TC_$P$TW39918_$$_BINDEX(%rip),%rdx
imulq $264,%rdx,%rdx
# Peephole Optimization: Lea2AddBase done
# Peephole Optimization: AddMov2Mov done
movq U_$P$TW39918_$$_OBJECTCARRAY(%rip),%rdx
movq (%rax,%rdx),%rax ; <-- This instruction is incorrect
movl (%rax),%eax
...
With the initial bug fix (AddMov2Mov
is not performed):
...
# [41] with ObjectBArray[BIndex], ObjectCArray[ObjectAArray[0].Index] do
movq U_$P$TW39918_$$_OBJECTBARRAY(%rip),%rax
# Register rdx allocated
movswq TC_$P$TW39918_$$_BINDEX(%rip),%rdx
imulq $264,%rdx,%rdx
# Peephole Optimization: Lea2AddBase done
addq %rdx,%rax
# Register rdx released
# Register rdx allocated
movq U_$P$TW39918_$$_OBJECTCARRAY(%rip),%rdx ; <-- Second register gets modified here, so addq %rdx,%rax; movq (%rax),%rax cannot be merged.
movq (%rax),%rax
movl (%rax),%eax
...
With the adjusted optimisation:
...
# [41] with ObjectBArray[BIndex], ObjectCArray[ObjectAArray[0].Index] do
movq U_$P$TW39918_$$_OBJECTBARRAY(%rip),%rax
# Register rdx allocated
movswq TC_$P$TW39918_$$_BINDEX(%rip),%rdx
imulq $264,%rdx,%rdx
# Peephole Optimization: Lea2AddBase done
# Peephole Optimization: AddMov2Mov done (instruction moved)
movq (%rax,%rdx),%rax ; <-- MOV instruction is translocated.
# Register rdx released
# Register rdx allocated
movq U_$P$TW39918_$$_OBJECTCARRAY(%rip),%rdx
movl (%rax),%eax
...
Additional Notes
There are three commits with this merge request:
- The first adds the new test.
- The second fixes the bug so the optimisation is not performed if the second register is modified.
- The third modifies the fix so the optimisation is performed when the second register is modified, but does so by translocating the MOV instruction to appear right after the ADD instruction, so a saving can still be made (CPU port allocation might not be as efficient though).