TEncoding.GetBytes doesn't work correctly on one of the overloads
## Summary

The method

`function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer; const Bytes: TBytes; ByteIndex: Integer): Integer;`

in file `sysencoding.inc` checks its parameters incorrectly. It is not possible to convert a full string to bytes with this overload.

## System Information

- **Operating system:** Tested on Windows and macOS (M1), but it should affect any OS.
- **Processor architecture:** Tested on x86-64 and ARM, but it should affect any architecture.
- **Compiler version:** trunk, 41dbedfe2275bb536a9eff35bb1054ce1f64b2b3
- **Device:** Computer

## Steps to reproduce

Run the following program:

```pascal
program project1;
{$mode objfpc}{$H+}
uses
  SysUtils; // TEncoding, TBytes
var
  Data: UnicodeString;
  pData: TBytes;
begin
  SetLength(pData, 200);
  Data := 'MyTest';
  // CharIndex = 1 and CharCount = Length(Data) request the whole string.
  TEncoding.Unicode.GetBytes(Data, 1, Length(Data), pData, 0);
end.
```

## What is the current bug behavior?

The program crashes with an exception:

Project project1 raised exception class '0000000eEEncodingError' with message: invalid count [6]

## What is the expected (correct) behavior?

The call should succeed and return the encoded data in `pData`.

## Possible fixes

The current method assumes a 0-based `CharIndex` in some places and a 1-based one in others. It looks like it was copied from the overload `TEncoding.GetBytes(const Chars: TUnicodeCharArray`, where `Chars` starts at 0. But strings start at 1.

The current code is:

```pascal
function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer;
  const Bytes: TBytes; ByteIndex: Integer): Integer;
var
  ByteLen: Integer;
begin
  ByteLen := Length(Bytes);
  if (ByteLen = 0) and (CharCount > 0) then
    raise EEncodingError.Create(SInvalidDestinationArray);
  if (ByteIndex < 0) or (ByteLen < ByteIndex) then
    raise EEncodingError.CreateFmt(SInvalidDestinationIndex, [ByteIndex]);
  if (CharCount < 0) or (Length(S) < CharCount + CharIndex) then
    raise EEncodingError.CreateFmt(SInvalidCount, [CharCount]);
  if (CharIndex < 1) then
    raise EEncodingError.CreateFmt(SCharacterIndexOutOfBounds, [CharIndex]);
  Result := GetBytes(@S[CharIndex], CharCount, @Bytes[ByteIndex], ByteLen - ByteIndex);
end;
```

I suggest it should be:

```pascal
function TEncoding.GetBytes(const S: UnicodeString; CharIndex, CharCount: Integer;
  const Bytes: TBytes; ByteIndex: Integer): Integer;
var
  ByteLen: Integer;
begin
  ByteLen := Length(Bytes);
  if (ByteLen = 0) and (CharCount > 0) then
    raise EEncodingError.Create(SInvalidDestinationArray);
  if (ByteIndex < 0) or (ByteLen < ByteIndex) then
    raise EEncodingError.CreateFmt(SInvalidDestinationIndex, [ByteIndex]);
  // Moved up: validate CharIndex before it is used in the range check.
  if (CharIndex < 1) then
    raise EEncodingError.CreateFmt(SCharacterIndexOutOfBounds, [CharIndex]);
  // Added "- 1": CharIndex is 1-based.
  if (CharCount < 0) or (Length(S) < CharCount + CharIndex - 1) then
    raise EEncodingError.CreateFmt(SInvalidCount, [CharCount]);
  Result := GetBytes(@S[CharIndex], CharCount, @Bytes[ByteIndex], ByteLen - ByteIndex);
end;
```

I moved the check for `CharIndex < 1` before the check for `Length(S) < CharCount + CharIndex` and added a `- 1`, so the range check becomes `Length(S) < CharCount + CharIndex - 1`.

The only change that is really needed is the `- 1`. `CharIndex` starts at 1 (if it did not, the next line, `GetBytes(@S[CharIndex]`, would fail), so the last character read is at position `CharIndex + CharCount - 1`. I moved the test for `CharIndex < 1` above the range check so that the range check never has to deal with a zero or negative `CharIndex`.
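
To make the off-by-one easier to see without touching the RTL, here is a minimal sketch that copies only the two range checks into standalone functions and evaluates them for the `'MyTest'` case. The names `CurrentCheckPasses`, `ProposedCheckPasses` and `SampleLen` are hypothetical helpers invented for this illustration; they do not exist in `sysencoding.inc`.

```pascal
program CheckRanges;
{$mode objfpc}{$H+}

// Hypothetical helper: mirrors the range checks as currently written,
// where the count check effectively treats CharIndex as 0-based.
function CurrentCheckPasses(Len, CharIndex, CharCount: Integer): Boolean;
begin
  if (CharCount < 0) or (Len < CharCount + CharIndex) then
    Exit(False);
  Result := CharIndex >= 1;
end;

// Hypothetical helper: mirrors the proposed checks, where CharIndex is
// validated first and then treated as 1-based.
function ProposedCheckPasses(Len, CharIndex, CharCount: Integer): Boolean;
begin
  if CharIndex < 1 then
    Exit(False);
  Result := not ((CharCount < 0) or (Len < CharCount + CharIndex - 1));
end;

const
  SampleLen = 6; // Length('MyTest')

begin
  // Whole string: CharIndex = 1, CharCount = 6.
  WriteLn('current,  index 1 count 6: ', CurrentCheckPasses(SampleLen, 1, SampleLen));
  WriteLn('proposed, index 1 count 6: ', ProposedCheckPasses(SampleLen, 1, SampleLen));
  // Reading 6 characters starting at index 2 runs past the end
  // and must still be rejected.
  WriteLn('proposed, index 2 count 6: ', ProposedCheckPasses(SampleLen, 2, SampleLen));
end.
```

With these inputs the current check rejects the whole-string request (`6 < 6 + 1`), while the proposed check accepts it (`6 < 6 + 1 - 1` is false) and still rejects a range that runs one character past the end. The program should print `FALSE`, `TRUE`, `FALSE` for the three lines.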