
cwstring: string conversions might fail

Summary

Sometimes cwstring fails to convert between AnsiString and UnicodeString, in either direction.

System Information

  • Operating system: macOS Ventura 13.6.1
  • Processor architecture: x86-64
  • Compiler version: trunk

Steps to reproduce

Running the example project fails intermittently. For me it failed regularly when run under the debugger from the Lazarus IDE, but always worked when run from a terminal.

See also forum thread.

Example Project

program TestSetCodePage;
 
{$mode objfpc}{$H+}
 
uses
  {$IFDEF UNIX}
  cthreads,
  cwstring,
  {$ENDIF}
  Classes,
  SysUtils;
 
const
  TextToWrite: string = 'äöüß';
var
  rbs: RawByteString;
  i: Integer;
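begin
  rbs:= TextToWrite;
  // Dump code page, length and raw bytes of the original string.
  WriteLn('Original CodePage = ', StringCodePage(rbs), ' Length = ', Length(rbs));
  for i:= 1 to Length(rbs) do
    Write(IntToHex(Ord(rbs[i])), ' ');
  WriteLn();
  // The conversion to UnicodeString goes through cwstring (iconv);
  // four characters are expected.
  if Length(UnicodeString(rbs)) <> 4 then
    Halt(1);
  // Convert to Windows-1252, where each of the four characters is one byte.
  SetCodePage(rbs, 1252);
  WriteLn('Converted CodePage = ', StringCodePage(rbs), ' Length = ', Length(rbs));
  for i:= 1 to Length(rbs) do
    Write(IntToHex(Ord(rbs[i])), ' ');
  WriteLn();
  if Length(rbs) <> 4 then
    Halt(1);
end.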

What is the current bug behavior?

Converting rbs to UnicodeString returns a string with Length() = 8.

The bug was introduced with commit bf3ced76 by @mvancanneyt.

Previously, RawByteStrings were used and cast to PAnsiChar when calling iconv_open, which ensured the strings were null-terminated.

Now ShortStrings are used, and the cast to PAnsiChar is applied to a pointer to the first string element. This no longer guarantees null termination, so iconv_open may fail and return -1 whenever the byte following the string data does not happen to be zero.
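
The difference between the two casts can be illustrated in isolation. The following is only a minimal sketch, not code from cwstring: the program name, variable names and the charset strings are made up for the demonstration, and the stale byte is provoked on purpose by first storing a longer value in the same buffer.

```pascal
program ShortStrTerminator;

{$mode objfpc}{$H+}

var
  Managed: RawByteString;
  Short: ShortString;
begin
  Managed := 'UTF-8';
  // Managed strings are stored with a trailing #0, so the byte just past
  // the last character is zero and PAnsiChar(Managed) is a valid C string.
  WriteLn('Byte after RawByteString data: ',
    Ord(PAnsiChar(Managed)[Length(Managed)]));

  // Store a longer value first, then a shorter one. Only the length byte
  // and the first Length(Short) characters are meaningful afterwards.
  Short := 'UTF-8//IGNORE';
  Short := 'UTF-8';
  // The byte after the data is whatever was left in the buffer, so C code
  // reading from @Short[1] may run past the intended end of the name.
  WriteLn('Byte after ShortString data:   ',
    Ord(PAnsiChar(@Short[1])[Length(Short)]));
end.
```

If the second value printed is nonzero, a C function such as iconv_open would see a longer, probably unknown charset name. Whether that happens in the real code depends on what the stack or heap contained beforehand, which would be consistent with the failure appearing under the debugger but not in the terminal.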

What is the expected (correct) behavior?

Converting rbs to UnicodeString returns a string with Length() = 4.

Possible fixes

Either return to using RawByteString or manually append #0 to both strings. The attached patch does the latter: fpcsrc-16-32-25.patch
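
For illustration, the "append #0" approach might look roughly like the sketch below. This is not the content of the attached patch; WithTerminator and CharsetName are hypothetical names used only here.

```pascal
program AppendTerminator;

{$mode objfpc}{$H+}

// Hypothetical helper, not taken from the patch: returns a copy of S with
// an explicit #0 appended, so that @Result[1] can be passed to C code.
// Caveat: a ShortString holds at most 255 characters, so a 255-character
// input would lose the terminator to truncation.
function WithTerminator(const S: ShortString): ShortString;
begin
  Result := S + #0;
end;

var
  CharsetName: ShortString;
begin
  // Leave stale bytes in the buffer, as in a reused stack slot.
  CharsetName := 'UTF-8//IGNORE';
  CharsetName := 'UTF-8';

  CharsetName := WithTerminator(CharsetName);
  // The terminator now counts towards Length, but C code stops at it.
  WriteLn('Length including terminator: ', Length(CharsetName));
  WriteLn('As read by C code: ', PAnsiChar(@CharsetName[1]));
end.
```

The trade-off compared with going back to RawByteString is that the terminator becomes part of the string's Pascal length, so any code that later compares or concatenates these names has to account for the extra character.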

Edited by modersohn