TEncoding.GetBytes rejects valid arguments in one of its overloads

Summary

The method

function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer; const Bytes: TBytes; ByteIndex: Integer): Integer;

in file sysencoding.inc validates its parameters incorrectly. It rejects any call where CharCount + CharIndex exceeds Length(S), which includes CharIndex = 1 with CharCount = Length(S). As a result, this overload cannot convert a full string to bytes.

System Information

  • Operating system: Tested on Windows and macOS (M1), but it should affect any OS.
  • Processor architecture: Tested on x86-64 and ARM, but it should affect any architecture.
  • Compiler version:
  • Device: Computer

Steps to reproduce

Run the following program:

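  program project1;
  {$mode objfpc}{$H+}
  uses
    SysUtils;
  var
    Data: UnicodeString;
    pData: TBytes;
  begin
    SetLength(pData, 200);
    Data := 'MyTest';
    // CharIndex = 1 is the first character; asking for all 6 characters raises EEncodingError
    TEncoding.Unicode.GetBytes(Data, 1, Length(Data), pData, 0);
  end.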

What is the current bug behavior?

The program crashes with an exception:

Project project1 raised exception class '0000000eEEncodingError' with message: invalid count [6]

What is the expected (correct) behavior?

The call should succeed, write the encoded data into pData, and return the number of bytes written.
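For reference, TEncoding.Unicode is UTF-16 little-endian, so the six ASCII characters of 'MyTest' should encode to 12 bytes (4D 00 79 00 54 00 65 00 73 00 74 00). The sketch below prints the bytes produced by the whole-string overload GetBytes(const S: UnicodeString): TBytes, assuming that overload is not affected by this bug; the program and variable names are only illustrative.

```pascal
program ExpectedBytes;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  Data: UnicodeString;
  Encoded: TBytes;
  I: Integer;
begin
  Data := 'MyTest';
  // Whole-string overload: allocates and returns the byte array itself,
  // so no CharIndex/CharCount validation is involved.
  Encoded := TEncoding.Unicode.GetBytes(Data);
  WriteLn('byte count: ', Length(Encoded));
  // Dump each byte as two hex digits.
  for I := 0 to High(Encoded) do
    Write(IntToHex(Encoded[I], 2), ' ');
  WriteLn;
end.
```

The indexed overload from the reproduction should leave the same 12 bytes at the start of pData and return 12.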

Possible fixes

The current method assumes a 0-based CharIndex in some places and a 1-based one in others. It looks like it was copied from the overload that takes const Chars: TUnicodeCharArray, where Chars starts at 0. Strings, however, start at 1.

The current code is:

function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer;
  const Bytes: TBytes; ByteIndex: Integer): Integer;
var
  ByteLen: Integer;
begin
  ByteLen := Length(Bytes);
  if (ByteLen = 0) and (CharCount > 0) then
    raise EEncodingError.Create(SInvalidDestinationArray);
  if (ByteIndex < 0) or (ByteLen < ByteIndex) then
    raise EEncodingError.CreateFmt(SInvalidDestinationIndex, [ByteIndex]);
  if (CharCount < 0) or (Length(S) < CharCount + CharIndex) then
    raise EEncodingError.CreateFmt(SInvalidCount, [CharCount]);
  if (CharIndex < 1) then
    raise EEncodingError.CreateFmt(SCharacterIndexOutOfBounds, [CharIndex]);
  Result := GetBytes(@S[CharIndex], CharCount, @Bytes[ByteIndex], ByteLen - ByteIndex);
end;

I suggest it should be:

function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer;
  const Bytes: TBytes; ByteIndex: Integer): Integer;
var
  ByteLen: Integer;
begin
  ByteLen := Length(Bytes);
  if (ByteLen = 0) and (CharCount > 0) then
    raise EEncodingError.Create(SInvalidDestinationArray);
  if (ByteIndex < 0) or (ByteLen < ByteIndex) then
    raise EEncodingError.CreateFmt(SInvalidDestinationIndex, [ByteIndex]);
  if (CharIndex < 1) then
    raise EEncodingError.CreateFmt(SCharacterIndexOutOfBounds, [CharIndex]);
  if (CharCount < 0) or (Length(S) < CharCount + CharIndex - 1) then
    raise EEncodingError.CreateFmt(SInvalidCount, [CharCount]);
  Result := GetBytes(@S[CharIndex], CharCount, @Bytes[ByteIndex], ByteLen - ByteIndex);
end;

I moved the CharIndex < 1 check ahead of the range check and added a "- 1" to the range check, so that it now reads Length(S) < CharCount + CharIndex - 1.

The only strictly necessary change is to compare against CharCount + CharIndex - 1 instead of CharCount + CharIndex. CharIndex is 1-based (if it were not, the following GetBytes(@S[CharIndex], ...) call would fail), so the last character read is at position CharIndex + CharCount - 1. I moved the CharIndex < 1 test above the range check so that the range check never has to deal with a zero or negative CharIndex.
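The effect of the "- 1" can be checked in isolation. The following is a minimal sketch, not RTL code: OldRangeOk and NewRangeOk are hypothetical helpers that reproduce only the range tests from the two versions above, applied to a 6-character string such as 'MyTest'.

```pascal
program CheckRange;
{$mode objfpc}{$H+}

// Hypothetical helpers (not part of the RTL) that isolate the two range tests.

// Range test as currently written: treats CharIndex as if it were 0-based.
function OldRangeOk(Len, CharIndex, CharCount: Integer): Boolean;
begin
  Result := not ((CharCount < 0) or (Len < CharCount + CharIndex));
end;

// Range test as proposed: CharIndex is 1-based, so the last character
// touched is at position CharIndex + CharCount - 1.
function NewRangeOk(Len, CharIndex, CharCount: Integer): Boolean;
begin
  Result := (CharIndex >= 1) and (CharCount >= 0)
    and (Len >= CharCount + CharIndex - 1);
end;

begin
  // Converting all 6 characters starts at index 1.
  WriteLn('old, full string:  ', OldRangeOk(6, 1, 6));  // FALSE
  WriteLn('new, full string:  ', NewRangeOk(6, 1, 6));  // TRUE
  // Starting at index 2 leaves only 5 characters, so 6 is too many.
  WriteLn('new, one too many: ', NewRangeOk(6, 2, 6));  // FALSE
  // Reading just the last character is valid.
  WriteLn('new, last char:    ', NewRangeOk(6, 6, 1));  // TRUE
end.
```

The old test also rejects (6, 6, 1), so with the current code the last character of a string can never be included in the conversion.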
