Inefficient register allocation leads to redundant moves
The register allocator emits moves that it should be able to eliminate. The disassembly of fx+ shows the problem:
> (disassemble fx+)
Disassembly for #<procedure fx+ .akku/lib/loko/libs/fixnums.loko.sls:3875>
entry:
2024F0 83F8F0 (cmp eax #xFFFFFFF0)
2024F3 0F858FEBFFFF (jnz (+ rip #x-1471))
2024F9 488BCE (mov rcx rsi)
2024FC 488BD3 (mov rdx rbx)
2024FF 488BF5 (mov rsi rbp)
202502 488BC7 (mov rax rdi)
L3:
202505 488BF9 (mov rdi rcx)
202508 480BF8 (or rdi rax)
; (unless (fixnum? dil) (goto L0))
20250B 40F6C707 (test dil #x7)
20250F 0F851A000000 (jnz L0)
L2:
202515 4803C8 (add rcx rax)
202518 0F800A000000 (jo L1)
20251E 488BC1 (mov rax rcx)
202521 488BDA (mov rbx rdx)
202524 488BEE (mov rbp rsi)
202527 C3 (ret)
L1:
; (raise (make-assertion-violation))
202528 0F0B (ud2)
20252A E9E6FFFFFF (jmp L2)
L0:
; (raise (make-assertion-violation))
20252F 0F0B (ud2)
202531 E9CFFFFFFF (jmp L3)
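The code that moves rbx, rbp, etc. is completely redundant: the entry block copies rbx to rdx and rbp to rsi (2024FC, 2024FF), and the return path copies them straight back (202521, 202524) even though neither value is modified in between. The pseudo registers involved should have been assigned the same physical register as the values they copy.
The problem is probably in the coalescing phase. Briggs' thesis is the place to read up on it.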
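For reference, here is a minimal sketch of conservative coalescing in the style Briggs describes, written in Python for brevity. It is not Loko's allocator: the function names, the graph representation, and the node names (`save`, `x`, `y`, `tmp`) are all hypothetical, and real precolored registers need more care than shown here. The idea is that two move-related nodes that do not interfere can be merged when the merged node has fewer than k neighbours of significant degree (degree >= k), so the merge cannot make the graph harder to colour.

```python
def briggs_ok(graph, a, b, k):
    """Briggs-style conservative test for merging nodes a and b."""
    merged = (graph[a] | graph[b]) - {a, b}
    significant = 0
    for n in merged:
        # A neighbour adjacent to both a and b loses one edge after the merge.
        degree = len(graph[n]) - (1 if a in graph[n] and b in graph[n] else 0)
        if degree >= k:
            significant += 1
    return significant < k


def coalesce(graph, moves, k):
    """Merge move-related nodes; return (alias map, moves that remain)."""
    graph = {n: set(adj) for n, adj in graph.items()}  # work on a copy
    alias = {}

    def find(n):
        while n in alias:
            n = alias[n]
        return n

    remaining = []
    for dst, src in moves:
        a, b = find(dst), find(src)
        if a == b:
            continue  # both ends already share a node: the move disappears
        if b in graph[a] or not briggs_ok(graph, a, b, k):
            remaining.append((dst, src))  # interfering or unsafe: keep the move
            continue
        # Merge b into a and redirect b's edges.
        for n in graph.pop(b):
            graph[n].discard(b)
            graph[n].add(a)
            graph[a].add(n)
        alias[b] = a
    return {n: find(n) for n in alias}, remaining


# Hypothetical model of the fx+ case: "save" holds the incoming rbx value
# across the body, while x, y and tmp are live at the same time.
graph = {
    "rbx": set(),
    "save": {"x", "y", "tmp"},
    "x": {"save", "y", "tmp"},
    "y": {"save", "x", "tmp"},
    "tmp": {"save", "x", "y"},
}
moves = [("save", "rbx"), ("rbx", "save")]
alias, remaining = coalesce(graph, moves, k=4)
```

With k=4 both moves are eliminated because `save` and `rbx` end up as one node. With k=3 the conservative test rejects the merge and both moves stay, which is one way an allocator can end up with copies like the ones in the listing.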
Edited by Gwen Weinholt