Crash with 3.3.1: Bad code generation by optimizer: reg-var + peephole / Intel 64bit - using uninitialized register as source
Summary
The optimizer generates incorrect asm code in the user app.
With local vars kept in registers, the peephole optimizer moves a variable to a different register, but later code still reads the original register, which is never initialized.
System Information
- Operating system: Windows 10 / 64 bit
- Processor architecture: x86, Intel 64 bit
- Compiler version: 3.3.1 d3ac07ad
Steps to reproduce
Compile the example.
Look at the asm code for procedure TLazFixedRoundBufferListMemBase.MoveBytesToOther(AFromByteOffset, AToByteOffset, AByteCount, AByteCap: Integer; AnOther: PLazFixedRoundBufferListMemBase);
This requires -O3 or -O4.
Register r10 is read without ever having been assigned a value.
r10 is never initialized in that code, but in
# Var OtherMPtr located in register r15
# Var PTarget located in register rdx
# Register r10 released
# [1728] DstHigh := PTarget + AByteCap;
cltq
leaq 16(%rax,%r10),%rdi
it is used as the value of `OtherMPtr`.
- `rax` is `AByteCap`
- The line is merged with `PTarget := @OtherMPtr^.Data;`, and `Data` is at `@(OtherMPtr+16)^`
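To make the merged address computation concrete, here is a minimal sketch of the two source statements. The record `TMemSketch` and its 16-byte `Header` field are hypothetical stand-ins (they are not the real types from LazListClasses); they only assume, as stated above, that `Data` sits at offset 16. Under that assumption, `DstHigh` equals `OtherMPtr + 16 + AByteCap`, which is what a single `leaq 16(%rax,<OtherMPtr register>),%rdi` computes, so the base register must be the one that actually holds `OtherMPtr`.

```pascal
program MergedLeaSketch;
{$mode objfpc}

type
  // Hypothetical layout: 16 header bytes so that Data is at offset 16.
  TMemSketch = record
    Header: array[0..15] of Byte;
    Data: array[0..63] of Byte;
  end;
  PMemSketch = ^TMemSketch;

var
  Mem: TMemSketch;
  OtherMPtr: PMemSketch;
  PTarget, DstHigh: PByte;
  AByteCap: Integer;
begin
  OtherMPtr := @Mem;
  AByteCap := 32;

  // The two statements from source lines 1727 and 1728
  PTarget := @OtherMPtr^.Data;
  DstHigh := PTarget + AByteCap;

  // The merged form: base pointer + 16 + AByteCap in one address computation
  WriteLn(DstHigh = PByte(OtherMPtr) + 16 + AByteCap);
end.
```

If the base register in the merged instruction is r10 while `OtherMPtr` was placed in r15, the result is computed from whatever r10 happens to contain.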
Comparing asm (snippets) with/without peephole
Block 1 - At the start (getting the value for OtherMPtr)
- Good code with peephole off
# Register r9 allocated
# [1720] OtherMPtr := AnOther^.FMem.FMem;
movq 56(%rbp),%r9
# Var AnOther located in register r9
# Var OtherMPtr located in register r10
# Register r10 allocated
movq (%r9),%r10
# Register r9 released
- Bad code with peephole on
# [1720] OtherMPtr := AnOther^.FMem.FMem;
movq 56(%rbp),%r9
# Var AnOther located in register r9
# Var OtherMPtr located in register r10
# Register r10,r15,r15 allocated
movq (%r9),%r15
# Register r9 released
- Note:
  - With peephole, `OtherMPtr` will be in r15
  - `Var OtherMPtr located in register r10` is wrong
  - `# Register r10,r15,r15 allocated`: r10 is still marked allocated / r15 is marked allocated twice
Block 2 - The code that is wrong above
Below is the code without peephole. This is the correct counterpart of the first asm snippet (the one that reads the uninitialized r10).
- Here `r10` has the value of `OtherMPtr`
- `PTarget` is correctly loaded (the asm for that line is not merged with the next line)
- The line that is faulty in the peephole version then uses `PTarget` in rdx
# Register r15 allocated
# [1727] PTarget := @OtherMPtr^.Data;
movq %r10,%r15
# Register r10 released
# Var OtherMPtr located in register r15
# Register rdx allocated
leaq 16(%r15),%rdx
# Var PTarget located in register rdx
# [1728] DstHigh := PTarget + AByteCap;
movslq %eax,%rax
leaq (%rax,%rdx),%rax
Note that without the peephole, `OtherMPtr` is in r10, but it gets moved to r15 before `PTarget` is read.
- With the peephole, `OtherMPtr` goes straight into `r15` in "block 1" (so that is good, it saves the register move)
- But the peephole then apparently (I guess) gets it wrong when combining the asm from the 2 lines.
  - Instead of using `r15` it uses `r10`, ignoring that it already changed that register.
Example Project
Attached crash_opt_reg.zip
- Compile with `fpc.exe -O3 -alr project1.lpr`
- Enable `{$Optimization noPEEPHOLE}` in LazListClasses to get working code