Crash with 3.3.1: Bad code generation by optimizer: reg-var + peephole / Intel 64bit - using uninitialized register as source

Summary

The optimizer produces bad asm code in the user app.

When local variables are held in registers, the peephole optimizer swaps some registers but later still uses the original register, which is then uninitialized.

System Information

  • Operating system: Windows 10 / 64 bit
  • Processor architecture: x86, Intel 64 bit
  • Compiler version: 3.3.1 d3ac07ad

Steps to reproduce

Compile the example.
Look at the asm code for procedure TLazFixedRoundBufferListMemBase.MoveBytesToOther(AFromByteOffset, AToByteOffset, AByteCount, AByteCap: Integer; AnOther: PLazFixedRoundBufferListMemBase);

The optimization level must be -O3 or -O4.

Register r10 is read without ever having been assigned a value.

r10 is never initialized in that procedure, but in

# Var OtherMPtr located in register r15
# Var PTarget located in register rdx
	# Register r10 released
# [1728] DstHigh := PTarget + AByteCap;
	cltq
	leaq	16(%rax,%r10),%rdi

it is used as the value of OtherMPtr.

  • rax is AByteCap
  • The line is merged with PTarget := @OtherMPtr^.Data; and Data is at @(OtherMPtr+16)^
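
To make the merged instruction easier to follow, here is a minimal sketch of the two source lines in isolation. It is not the code from LazListClasses: the record TMemHeader, its field names FieldA and FieldB, and the buffer Buf are hypothetical, chosen only so that Data lands at offset 16 on a 64-bit target as described above. It shows that lines 1727 and 1728 together compute OtherMPtr + 16 + AByteCap, which is what a single lea with displacement 16 and two registers expresses, so the base register must hold OtherMPtr.

```pascal
program lea_sketch;
{$mode objfpc}

type
  // Hypothetical layout (assumption): two pointer-sized header fields
  // put Data at offset 16 on a 64-bit target.
  TMemHeader = record
    FieldA: PtrUInt;              // placeholder name
    FieldB: PtrUInt;              // placeholder name
    Data: array[0..0] of Byte;
  end;
  PMemHeader = ^TMemHeader;

var
  Buf: array[0..63] of Byte;      // backing storage, only used for its address
  OtherMPtr: PMemHeader;
  PTarget, DstHigh: PByte;
  AByteCap: Integer;
begin
  OtherMPtr := PMemHeader(@Buf[0]);
  AByteCap := 32;

  PTarget := @OtherMPtr^.Data;    // corresponds to line 1727 in the report
  DstHigh := PTarget + AByteCap;  // corresponds to line 1728 in the report

  // The same address computed in one step, as the merged lea does:
  // base (OtherMPtr) + offset of Data + AByteCap.
  WriteLn(DstHigh = PByte(OtherMPtr) + 2 * SizeOf(PtrUInt) + AByteCap);
end.
```

The sketch only computes addresses and never dereferences them. It is not expected to reproduce the bug, which depends on the register allocation in the original procedure.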

Comparing asm (snippets) with/without peephole

Block 1 - At the start (getting the value for OtherMPtr)
  • Good code with peephole off
	# Register r9 allocated
# [1720] OtherMPtr := AnOther^.FMem.FMem;
	movq	56(%rbp),%r9
# Var AnOther located in register r9
# Var OtherMPtr located in register r10
	# Register r10 allocated
	movq	(%r9),%r10
	# Register r9 released
  • Bad code with peephole on
# [1720] OtherMPtr := AnOther^.FMem.FMem;
	movq	56(%rbp),%r9
# Var AnOther located in register r9
# Var OtherMPtr located in register r10
	# Register r10,r15,r15 allocated
	movq	(%r9),%r15
	# Register r9 released
  • Note:
    • With the peephole, OtherMPtr will be in r15
    • The comment "# Var OtherMPtr located in register r10" is wrong
    • In "# Register r10,r15,r15 allocated", r10 is still marked allocated and r15 is marked allocated twice
Block 2 - The code that is wrong above

Below is the code without the peephole. This is the correct version of the first asm snippet (the one with the uninitialized r10 usage).

  • Here r10 has the value of OtherMPtr
  • PTarget is correctly loaded (the asm for that line is not merged with the next line)
  • The line that is faulty with the peephole ([1728]) then uses PTarget in rdx
	# Register r15 allocated
# [1727] PTarget := @OtherMPtr^.Data;
	movq	%r10,%r15
	# Register r10 released
# Var OtherMPtr located in register r15
	# Register rdx allocated
	leaq	16(%r15),%rdx
# Var PTarget located in register rdx
# [1728] DstHigh := PTarget + AByteCap;
	movslq	%eax,%rax
	leaq	(%rax,%rdx),%rax

Note that without the peephole, OtherMPtr is in r10, but it gets moved to r15 before PTarget is read.

  • With the peephole, OtherMPtr goes straight into r15 in "block 1" (that is good, it saves the register move)
  • But the peephole then apparently (I guess) gets it wrong when combining the asm for the 2 lines.
    • Instead of using r15 it uses r10, ignoring that it already changed that register.

Example Project

Attached crash_opt_reg.zip

  • Compile with fpc.exe -O3 -alr project1.lpr
  • Enable {$Optimization noPEEPHOLE} in LazListClasses to get working code