[AVR] optimization LdsMov2Lds
Currently the compiler does not optimize the instruction sequence generated for a call through a procedural variable. Consider the following example:
program test;

var
  p: procedure(b: byte);

procedure pb(b: byte); assembler; nostackframe;
asm
end;

begin
  p := @pb;
  p($01);
end.
Compiling with ppcrossavr -Tembedded -Wpatmega328p -O1 test.lpr
generates the following instructions:
# [12] p($01);
ldi r24,1
lds r18,(U_sPsTEST_ss_P)
lds r2,(U_sPsTEST_ss_P+1)
mov r30,r18
mov r31,r2
icall
Note that the value of variable p is first loaded into registers r18 and r2, and then moved into the Z register pair (r31:r30), from which icall takes its target address.
This peephole optimization detects this pattern and loads the value directly into the Z register pair, shortening the sequence to:
# [12] p($01);
ldi r24,1
lds r30,(U_sPsTEST_ss_P)
lds r31,(U_sPsTEST_ss_P+1)
icall
Note: identifying the relationship between lds r18,(U_sPsTEST_ss_P) and mov r30,r18, which are separated by the instruction lds r2,(U_sPsTEST_ss_P+1), requires relaxing the limitation that GetNextInstructionUsingReg only checks one instruction ahead unless -O3 is active. This relaxation also allows more optimizations to be identified in other situations.
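To illustrate why the lookahead distance matters, here is a minimal stand-alone sketch of the pattern match on a toy instruction list. It is not the code from this patch: TToyInstr, NextUsingReg, ApplyLdsMov2Lds and the MaxAhead parameter are hypothetical names invented for the example, and the safety checks are much simpler than what the real peephole optimizer has to perform.

```pascal
program LdsMov2LdsSketch;
{$mode objfpc}{$H+}
{ Toy model only. TToyInstr, NextUsingReg and ApplyLdsMov2Lds are hypothetical
  names for illustration; they are not the compiler's internal types or API. }
type
  TToyOp = (opLdi, opLds, opMov, opIcall, opNop);
  TToyInstr = record
    Op: TToyOp;
    Dst, Src: Integer;  // register numbers, -1 when unused
    Operand: string;    // immediate or memory operand as plain text
  end;
var
  Code: array[0..5] of TToyInstr;

function MakeInstr(AOp: TToyOp; ADst, ASrc: Integer; const AOperand: string): TToyInstr;
begin
  Result.Op := AOp;
  Result.Dst := ADst;
  Result.Src := ASrc;
  Result.Operand := AOperand;
end;

function UsesReg(const Instr: TToyInstr; Reg: Integer): Boolean;
begin
  Result := (Instr.Op <> opNop) and ((Instr.Dst = Reg) or (Instr.Src = Reg));
end;

{ True if any instruction in Code[First..Last] mentions Reg. }
function RegMentioned(First, Last, Reg: Integer): Boolean;
var
  i: Integer;
begin
  Result := False;
  for i := First to Last do
    if UsesReg(Code[i], Reg) then Exit(True);
end;

{ Index of the next instruction after Start that mentions Reg, looking at most
  MaxAhead instructions ahead and stopping at a call; -1 if none is found. }
function NextUsingReg(Start, Reg, MaxAhead: Integer): Integer;
var
  i: Integer;
begin
  Result := -1;
  for i := Start + 1 to High(Code) do
  begin
    if UsesReg(Code[i], Reg) then Exit(i);
    if (Code[i].Op = opIcall) or (i - Start >= MaxAhead) then Exit;
  end;
end;

procedure ApplyLdsMov2Lds(MaxAhead: Integer);
var
  i, j, Tmp: Integer;
begin
  for i := Low(Code) to High(Code) do
  begin
    if Code[i].Op <> opLds then Continue;
    Tmp := Code[i].Dst;
    j := NextUsingReg(i, Tmp, MaxAhead);
    if (j < 0) or (Code[j].Op <> opMov) or (Code[j].Src <> Tmp) then Continue;
    // Simplified safety checks: the final register must be untouched between
    // the two instructions, and the temporary must not be mentioned afterwards.
    if RegMentioned(i + 1, j - 1, Code[j].Dst) or
       RegMentioned(j + 1, High(Code), Tmp) then Continue;
    Code[i].Dst := Code[j].Dst;  // load straight into the final register
    Code[j].Op := opNop;         // the mov is now redundant
  end;
end;

procedure PrintCode;
var
  i: Integer;
begin
  for i := Low(Code) to High(Code) do
    case Code[i].Op of
      opLdi:   WriteLn('ldi r', Code[i].Dst, ',', Code[i].Operand);
      opLds:   WriteLn('lds r', Code[i].Dst, ',', Code[i].Operand);
      opMov:   WriteLn('mov r', Code[i].Dst, ',r', Code[i].Src);
      opIcall: WriteLn('icall');
      opNop:   ;  // removed instruction, print nothing
    end;
end;

begin
  Code[0] := MakeInstr(opLdi, 24, -1, '1');
  Code[1] := MakeInstr(opLds, 18, -1, '(U_sPsTEST_ss_P)');
  Code[2] := MakeInstr(opLds, 2, -1, '(U_sPsTEST_ss_P+1)');
  Code[3] := MakeInstr(opMov, 30, 18, '');
  Code[4] := MakeInstr(opMov, 31, 2, '');
  Code[5] := MakeInstr(opIcall, -1, -1, '');
  ApplyLdsMov2Lds(2);  // a lookahead of 1 would miss both lds/mov pairs
  PrintCode;
end.
```

With a lookahead of 2 the sketch prints the shortened four-instruction sequence shown above (ldi r24, lds r30, lds r31, icall). Calling ApplyLdsMov2Lds(1) instead leaves the list unchanged, because the instruction directly following each lds does not mention the temporary register.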