Optimizing private values

Summary

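In this code, Test1 is a simple loop, and the compiler keeps I in a register. Test2 does the same work through inlined record methods, but there the compiler apparently does not keep I in a register. Either it assumes the field I may be changed elsewhere, or it optimizes each method separately and then inlines the result, instead of inlining first and then optimizing the combined code.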

System Information

Lazarus 2.3.0 (rev main-2_3-2341-g686b056fcf) FPC 3.3.1 x86_64-win64-win32/win64

Example Project

program project1;

{$MODE Delphi}

uses
  SysUtils;

  procedure Test1;
  var
    I: Int64;
    A: array of Int64;
    T: QWord;
  begin
    SetLength(A, 100 * 1000 * 1000);
    I := 0;

    T := GetTickCount64;
    while I < Length(A) do
    begin
      if A[I] = 1 then
        Break;
      I += 1;
    end;
    WriteLn(GetTickCount64 - T);
  end;

type
  TRecord = record
  strict private
    I: Int64;
    A: array of Int64;
  public
    procedure Init;
    procedure Next; inline;
    function IsEnd: Boolean; inline;
    function Current: Int64; inline;
  end;

  procedure TRecord.Init;
  begin
    SetLength(A, 100 * 1000 * 1000);
    I := 0;
  end;

  procedure TRecord.Next;
  begin
    Inc(I);
  end;

  function TRecord.IsEnd: Boolean;
  begin
    Result := I >= Length(A);
  end;

  function TRecord.Current: Int64;
  begin
    Result := A[I];
  end;

  procedure Test2;
  var
    R: TRecord;
    T: QWord;
  begin
    R.Init;
    T := GetTickCount64;
    while not R.IsEnd do
    begin
      if R.Current = 1 then
        Break;
      R.Next;
    end;
    WriteLn(GetTickCount64 - T);
  end;

begin
  Test1; // 62 ms
  Test2; // 187 ms
  ReadLn;
end.

What is the current bug behavior?

The compiler does not keep the record field I in a register, even though no other code touches it.

What is the expected (correct) behavior?

After inlining, the compiler should keep I in a register, as it does for the local variable I in Test1.
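
To make the expectation concrete, here is a hand-written sketch of what Test2 could reduce to once the methods are inlined and the field is cached in a local variable. It is only an illustration, not what the compiler actually emits: Test2Manual and LocalI are hypothetical names, and the fields are left public (unlike the strict private fields in the example project) so the loop can reach them directly.

```pascal
program expected_sketch;

{$MODE Delphi}

uses
  SysUtils;

type
  // Same layout as TRecord in the example project, but with public
  // fields (an assumption made only so this sketch can access them).
  TRecord = record
    I: Int64;
    A: array of Int64;
  end;

  // Hypothetical hand-inlined version of Test2.
  procedure Test2Manual;
  var
    R: TRecord;
    LocalI: Int64; // local copy of R.I that can live in a register
    T: QWord;
  begin
    SetLength(R.A, 100 * 1000 * 1000);
    R.I := 0;

    T := GetTickCount64;
    LocalI := R.I;             // read the field once
    while LocalI < Length(R.A) do
    begin
      if R.A[LocalI] = 1 then
        Break;
      Inc(LocalI);
    end;
    R.I := LocalI;             // write the field back once
    WriteLn(GetTickCount64 - T);
  end;

begin
  Test2Manual;
end.
```

The loop body here has the same shape as the one in Test1, so comparing its timing with Test2 may help show whether the field access is what costs the extra time.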

Relevant issue

This is similar to #39167 and #39725, but I don't know whether they are the same issue.

Background

I need this behavior because I like to use abstract records that manage their own internal data, for example something like a TStream, but without calling Read for every item. That way the loop can run fast while it walks memory that is already loaded, and when it reaches the end, the source (A in this sample) can be refilled and the loop continues as usual.
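
A minimal sketch of the kind of record described above, under stated assumptions: TChunkReader, Refill, and ChunksLeft are hypothetical names, and the "source" is simulated by filling a four-element array a fixed number of times. It is not taken from real code; it only shows why a fast, register-friendly inner loop matters for this pattern.

```pascal
program refill_sketch;

{$MODE Delphi}

type
  // Hypothetical record that walks a buffer and refills it at the end.
  TChunkReader = record
  strict private
    I: Int64;
    A: array of Int64;
    ChunksLeft: Integer;       // simulated source: number of refills left
    function Refill: Boolean;  // slow path, deliberately not inlined
  public
    procedure Init(AChunks: Integer);
    procedure Next; inline;
    function IsEnd: Boolean; inline;
    function Current: Int64; inline;
  end;

  procedure TChunkReader.Init(AChunks: Integer);
  begin
    ChunksLeft := AChunks;
    SetLength(A, 0);
    I := 0;
  end;

  function TChunkReader.Refill: Boolean;
  var
    K: SizeInt;
  begin
    Result := ChunksLeft > 0;
    if not Result then
      Exit;
    Dec(ChunksLeft);
    SetLength(A, 4);           // stand-in for reading the next block
    for K := 0 to High(A) do
      A[K] := K;
    I := 0;
  end;

  procedure TChunkReader.Next;
  begin
    Inc(I);
  end;

  function TChunkReader.IsEnd: Boolean;
  begin
    // Fast path: a plain index comparison. Refill is only called when
    // the buffer is exhausted (short-circuit evaluation, the default).
    Result := (I >= Length(A)) and not Refill();
  end;

  function TChunkReader.Current: Int64;
  begin
    Result := A[I];
  end;

var
  R: TChunkReader;
  Sum: Int64;
begin
  R.Init(3);
  Sum := 0;
  while not R.IsEnd do
  begin
    Sum := Sum + R.Current;
    R.Next;
  end;
  WriteLn(Sum);
end.
```

The calling loop has the same shape as Test2, so the same optimization question applies: most iterations only touch I and A, and the refill happens rarely.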