TEncoding.GetBytes doesn't work correctly on one of its overloads
Summary
The method: function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer; const Bytes: TBytes; ByteIndex: Integer): Integer;
in file sysencoding.inc checks the parameters incorrectly. It is not possible to correctly convert a full string to bytes with this method.
System Information
- Operating system: Tested on Windows and macOS (M1), but it should affect any OS.
- Processor architecture: Tested on x86-64 and ARM, but it should affect any architecture.
- Compiler version:
- Device: Computer
Steps to reproduce
Run the following program:
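program project1;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  data: UnicodeString;
  pdata: TBytes;
begin
  SetLength(pdata, 200);
  data := 'MyTest';
  // Convert the whole string: 1-based start index, full length.
  TEncoding.Unicode.GetBytes(data, 1, Length(data), pdata, 0);
end.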
What is the current bug behavior?
The program crashes with an exception:
Project project1 raised exception class '0000000eEEncodingError' with message: invalid count [6]
What is the expected (correct) behavior?
The call should succeed and return the encoded data in pData.
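For reference, the expected result can be sketched with a different overload. This is only an illustration and rests on an assumption I have not verified in the RTL source: that the single-argument GetBytes(const S: UnicodeString) overload does not go through the faulty range check. TEncoding.Unicode is UTF-16LE, so 'MyTest' should encode to 12 bytes: 4D 00 79 00 54 00 65 00 73 00 74 00.

```pascal
program ExpectedBytes;

{$mode objfpc}{$H+}

uses
  SysUtils;

var
  Data: UnicodeString;
  Encoded: TBytes;
  I: Integer;
begin
  Data := 'MyTest';
  // Assumption: this overload is not affected by the range check described in this report.
  Encoded := TEncoding.Unicode.GetBytes(Data);
  // UTF-16LE uses two bytes for each of these characters.
  WriteLn('byte count: ', Length(Encoded));
  // Dump the bytes in hex so they can be compared with the values listed above.
  for I := 0 to High(Encoded) do
    Write(IntToHex(Encoded[I], 2), ' ');
  WriteLn;
end.
```

The five-argument overload from the reproduction program should write the same 12 bytes into the start of pData and return 12.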
Possible fixes
The current method assumes a 0-based CharIndex in some places and a 1-based CharIndex in others. It looks like it was copied from the overload that takes const Chars: TUnicodeCharArray, where Chars starts at 0. Strings, however, start at 1.
The current code is:
function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer;
  const Bytes: TBytes; ByteIndex: Integer): Integer;
var
  ByteLen: Integer;
begin
  ByteLen := Length(Bytes);
  if (ByteLen = 0) and (CharCount > 0) then
    raise EEncodingError.Create(SInvalidDestinationArray);
  if (ByteIndex < 0) or (ByteLen < ByteIndex) then
    raise EEncodingError.CreateFmt(SInvalidDestinationIndex, [ByteIndex]);
  if (CharCount < 0) or (Length(S) < CharCount + CharIndex) then
    raise EEncodingError.CreateFmt(SInvalidCount, [CharCount]);
  if (CharIndex < 1) then
    raise EEncodingError.CreateFmt(SCharacterIndexOutOfBounds, [CharIndex]);
  Result := GetBytes(@S[CharIndex], CharCount, @Bytes[ByteIndex], ByteLen - ByteIndex);
end;
I suggest it should be:
function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer;
  const Bytes: TBytes; ByteIndex: Integer): Integer;
var
  ByteLen: Integer;
begin
  ByteLen := Length(Bytes);
  if (ByteLen = 0) and (CharCount > 0) then
    raise EEncodingError.Create(SInvalidDestinationArray);
  if (ByteIndex < 0) or (ByteLen < ByteIndex) then
    raise EEncodingError.CreateFmt(SInvalidDestinationIndex, [ByteIndex]);
  if (CharIndex < 1) then
    raise EEncodingError.CreateFmt(SCharacterIndexOutOfBounds, [CharIndex]);
  if (CharCount < 0) or (Length(S) < CharCount + CharIndex - 1) then
    raise EEncodingError.CreateFmt(SInvalidCount, [CharCount]);
  Result := GetBytes(@S[CharIndex], CharCount, @Bytes[ByteIndex], ByteLen - ByteIndex);
end;
I moved the check for CharIndex < 1 before the check for Length(S) < CharCount + CharIndex and added a "- 1" to the latter, giving Length(S) < CharCount + CharIndex - 1.
The only change that is strictly needed is to use CharIndex - 1 instead of CharIndex in the length check, since CharIndex starts at 1 (if it didn't, the next line, GetBytes(@S[CharIndex], ...), would fail). I moved the test for CharIndex < 1 above it so that the length check never sees a negative CharIndex - 1 when CharIndex = 0.
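The off-by-one can be checked in isolation with a small sketch. CurrentCheckRejects and ProposedCheckRejects are hypothetical helpers written for this report; they mirror only the range tests discussed here and are not part of the RTL.

```pascal
program BoundsCheckDemo;

{$mode objfpc}{$H+}

// Hypothetical helper: the length test as it appears in the current code.
function CurrentCheckRejects(StrLen, CharIndex, CharCount: Integer): Boolean;
begin
  Result := (CharCount < 0) or (StrLen < CharCount + CharIndex);
end;

// Hypothetical helper: the index test followed by the adjusted length test.
function ProposedCheckRejects(StrLen, CharIndex, CharCount: Integer): Boolean;
begin
  Result := (CharIndex < 1) or (CharCount < 0)
    or (StrLen < CharCount + CharIndex - 1);
end;

begin
  // 'MyTest' has 6 characters; converting all of it means CharIndex = 1, CharCount = 6.
  WriteLn('current rejects full string:  ', CurrentCheckRejects(6, 1, 6));
  WriteLn('proposed rejects full string: ', ProposedCheckRejects(6, 1, 6));
  // Starting at index 2 with 6 characters runs past the end and must still be rejected.
  WriteLn('proposed rejects overrun:     ', ProposedCheckRejects(6, 2, 6));
end.
```

For the full string, the current test evaluates 6 < 6 + 1, which is true, so the call is rejected. The proposed test evaluates 6 < 6 + 1 - 1, which is false, so the call goes through. A real overrun, such as CharIndex = 2 with CharCount = 6, gives 6 < 7 and is still rejected.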