[AVR] optimization LdsMov2Lds

ccrause requested to merge ccrause/fpc-source:2023-11-19 into main

Currently the compiler does not optimize the instruction sequence generated for a call through a procedural variable. Consider the following example:

program test;

var
  p: procedure(b: byte);

procedure pb(b: byte); assembler; nostackframe;
asm
end;

begin
  p := @pb;
  p($01);
end.

Compiling with ppcrossavr -Tembedded -Wpatmega328p -O1 test.lpr generates the following instructions:

# [12] p($01);
	ldi	r24,1
	lds	r18,(U_sPsTEST_ss_P)
	lds	r2,(U_sPsTEST_ss_P+1)
	mov	r30,r18
	mov	r31,r2
	icall

Note that the value of variable p is first loaded into registers r18 and r2 and only then moved into the Z register pair (r31:r30), which icall uses as the call target.

This peephole optimization identifies this situation and loads the value directly into the Z register pair, shortening the sequence to:

# [12] p($01);
	ldi	r24,1
	lds	r30,(U_sPsTEST_ss_P)
	lds	r31,(U_sPsTEST_ss_P+1)
	icall

Note: lds r18,(U_sPsTEST_ss_P) and mov r30,r18 are separated by the instruction lds r2,(U_sPsTEST_ss_P+1). Identifying the relationship between them requires relaxing the restriction that GetNextInstructionUsingReg checks only one instruction ahead unless -O3 is active. Relaxing this restriction also lets the optimizer identify more optimizations in other situations.
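
To illustrate why the look-ahead window matters, here is a minimal, self-contained sketch of the lds/mov folding on a toy instruction list. It is not the code from this merge request: TInstr, Touches, Untouched, NextUsingReg and the MaxAhead parameter are hypothetical stand-ins, the real optimizer works on the compiler's own instruction and register allocation data, and the check that the temporary register is unused afterwards is a deliberately conservative assumption.

```pascal
program LdsMov2LdsSketch;
{$mode objfpc}{$H+}
uses SysUtils;

type
  TOp = (opLdi, opLds, opMov, opIcall);
  TInstr = record
    Op: TOp;
    Dst, Src: Integer;  // register numbers, -1 when unused
    Arg: string;        // immediate or symbol text
  end;
  TInstrList = array of TInstr;

function Ins(Op: TOp; Dst, Src: Integer; const Arg: string): TInstr;
begin
  Result.Op := Op; Result.Dst := Dst; Result.Src := Src; Result.Arg := Arg;
end;

// True if the instruction reads or writes register Reg.
function Touches(const I: TInstr; Reg: Integer): Boolean;
begin
  case I.Op of
    opIcall: Result := (Reg = 30) or (Reg = 31);  // icall jumps via Z (r31:r30)
    opMov:   Result := (I.Dst = Reg) or (I.Src = Reg);
    else     Result := I.Dst = Reg;               // ldi and lds write Dst
  end;
end;

// True if no instruction in L[First..Last] touches Reg.
function Untouched(const L: TInstrList; First, Last, Reg: Integer): Boolean;
var K: Integer;
begin
  for K := First to Last do
    if Touches(L[K], Reg) then Exit(False);
  Result := True;
end;

// Index of the first instruction after Start that touches Reg, looking at
// most MaxAhead instructions ahead; -1 if there is none in that window.
function NextUsingReg(const L: TInstrList; Start, Reg, MaxAhead: Integer): Integer;
var K, Last: Integer;
begin
  Last := Start + MaxAhead;
  if Last > High(L) then Last := High(L);
  for K := Start + 1 to Last do
    if Touches(L[K], Reg) then Exit(K);
  Result := -1;
end;

procedure Optimize(var L: TInstrList; MaxAhead: Integer);
var I, J, K: Integer;
begin
  I := 0;
  while I <= High(L) do
  begin
    J := -1;
    if L[I].Op = opLds then
      J := NextUsingReg(L, I, L[I].Dst, MaxAhead);
    // Fold only if mov copies the loaded register, the mov target is free in
    // between, and the temporary is not touched afterwards (conservative).
    if (J >= 0) and (L[J].Op = opMov) and (L[J].Src = L[I].Dst)
       and Untouched(L, I + 1, J - 1, L[J].Dst)
       and Untouched(L, J + 1, High(L), L[I].Dst) then
    begin
      L[I].Dst := L[J].Dst;         // load straight into the final register
      for K := J to High(L) - 1 do  // remove the now redundant mov
        L[K] := L[K + 1];
      SetLength(L, Length(L) - 1);
    end;
    Inc(I);
  end;
end;

function ToText(const I: TInstr): string;
begin
  case I.Op of
    opLdi: Result := Format('ldi r%d,%s', [I.Dst, I.Arg]);
    opLds: Result := Format('lds r%d,(%s)', [I.Dst, I.Arg]);
    opMov: Result := Format('mov r%d,r%d', [I.Dst, I.Src]);
    else   Result := 'icall';
  end;
end;

var
  Code: TInstrList;
  I: Integer;
begin
  SetLength(Code, 6);
  Code[0] := Ins(opLdi, 24, -1, '1');
  Code[1] := Ins(opLds, 18, -1, 'U_sPsTEST_ss_P');
  Code[2] := Ins(opLds, 2, -1, 'U_sPsTEST_ss_P+1');
  Code[3] := Ins(opMov, 30, 18, '');
  Code[4] := Ins(opMov, 31, 2, '');
  Code[5] := Ins(opIcall, -1, -1, '');
  Optimize(Code, 2);  // with MaxAhead = 1 this sequence stays unchanged
  for I := 0 to High(Code) do
    WriteLn(ToText(Code[I]));
end.
```

With a window of two instructions the sketch folds both lds/mov pairs and prints the four-instruction sequence shown above. With a window of one, the first lds only sees the second lds and the second lds only sees mov r30,r18, so neither pair is matched.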
