FFI and non-ASCII characters
These are three related issues, discovered while developing some handwritten FFI bindings.
base-char corruption across FFI boundary with :cstring
I have this straightforward binding:
(ffi:def-function ("_DrawText" draw-text-raw)
    ((text :cstring)
     (pos-x :int)
     (pos-y :int)
     (font-size :int)
     (color (* color-raw)))
  :returning :void)

(declaim (ftype (function (string fixnum fixnum fixnum color) null) draw-text))
(defun draw-text (text pos-x pos-y font-size color)
  (draw-text-raw text pos-x pos-y font-size (color-pointer color)))
It calls the following C wrapper:
void _DrawText(const char *text, int posX, int posY, int fontSize,
               Color *color) {
  Color stack = *color;
  DrawText(text, posX, posY, fontSize, stack);
}
where the inner DrawText comes from Raylib. Under ECL, passing the string Café corrupts the é; printing the received text to standard output from the C side shows:
Caf\351
If I print out each character and its associated int code separately, we see:
C: 67
a: 97
f: 102
\351: -23
\0: 0
Note that the char-code of #\é is 233, and 233 - 256 = -23. So -23 looks suspiciously like the single byte 0xE9 being read back through an 8-bit signed char, i.e. the character code was copied as one raw byte rather than encoded as UTF-8. If you instead create a string literal "Café" in C and print its contents in the same way, you see:
C: 67
a: 97
f: 102
\303: -61
\251: -87
\0: 0
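So it seems that :cstring only works for characters in the ASCII range (which covers every standard-char), where the raw character code and the UTF-8 encoding coincide. Other base-char values such as #\é arrive as a single raw byte instead of the two-byte UTF-8 sequence that C expects. If this is intended, then at least it should be documented.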
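As a minimal sketch of the arithmetic behind the two listings above, the following C reproduces both results for code point 233. The helper names are hypothetical and are not part of ECL or Raylib; this only illustrates the difference between copying a character code as one byte and encoding it as UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: keep only the low 8 bits of a code point and view
 * them through a signed char, as a byte-for-byte copy would.
 * Converting 233 to signed char is implementation-defined before C23, but
 * yields -23 on the usual two's-complement platforms. */
signed char low_byte_as_signed_char(uint32_t code_point) {
    return (signed char)(unsigned char)(code_point & 0xFFu);
}

/* Hypothetical helper: encode a code point in U+0080..U+07FF as the two
 * UTF-8 bytes found in a UTF-8 encoded C string literal. */
void utf8_encode_two_byte(uint32_t code_point, unsigned char out[2]) {
    assert(out != NULL);
    assert(code_point >= 0x80u && code_point <= 0x7FFu);
    out[0] = (unsigned char)(0xC0u | (code_point >> 6));    /* 110xxxxx */
    out[1] = (unsigned char)(0x80u | (code_point & 0x3Fu)); /* 10xxxxxx */
}
```

For code point 233 the first helper gives -23, matching what arrives through :cstring. The second gives the bytes 0xC3 0xA9, which read as signed chars are -61 and -87, matching the C literal.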
Condition signalled when passing CJK characters
While debugging all this, I reviewed CL's character and string types in full.
For ECL we see that #\a, #\é, and #\涅 fall into separate character categories: #\a is a standard-char, #\é is a base-char but not a standard-char, and #\涅 is neither. If we pass 涅槃 instead of Café to the code above, an interesting condition is signalled:
Cannot coerce string 涅槃 to a base-string
This implies that the FFI logic at least expects a string of base-char, although we've already demonstrated that true base-char values get corrupted.
Potentially distantly related: https://github.com/clasp-developers/clasp/issues/1595
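For reference, here is a hedged sketch of the conversion that would be needed before handing such a string to a C function that expects UTF-8. It is not how ECL implements :cstring; the function names are hypothetical, and the input is assumed to be an array of Unicode code points (涅 is U+6D85, which needs three bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Writes the UTF-8 encoding of one code point to out (room for 4 bytes).
 * Returns the number of bytes written, or 0 for an invalid code point. */
size_t utf8_encode(uint32_t cp, unsigned char *out) {
    assert(out != NULL);
    if (cp < 0x80u) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800u) {
        out[0] = (unsigned char)(0xC0u | (cp >> 6));
        out[1] = (unsigned char)(0x80u | (cp & 0x3Fu));
        return 2;
    }
    if (cp < 0x10000u) {
        if (cp >= 0xD800u && cp <= 0xDFFFu) return 0; /* surrogates */
        out[0] = (unsigned char)(0xE0u | (cp >> 12));
        out[1] = (unsigned char)(0x80u | ((cp >> 6) & 0x3Fu));
        out[2] = (unsigned char)(0x80u | (cp & 0x3Fu));
        return 3;
    }
    if (cp < 0x110000u) {
        out[0] = (unsigned char)(0xF0u | (cp >> 18));
        out[1] = (unsigned char)(0x80u | ((cp >> 12) & 0x3Fu));
        out[2] = (unsigned char)(0x80u | ((cp >> 6) & 0x3Fu));
        out[3] = (unsigned char)(0x80u | (cp & 0x3Fu));
        return 4;
    }
    return 0; /* beyond U+10FFFF */
}

/* Converts an array of code points to a freshly malloc'd, NUL-terminated
 * UTF-8 string. Returns NULL on allocation failure or invalid input.
 * The caller frees the result. */
char *code_points_to_utf8(const uint32_t *cps, size_t count) {
    assert(cps != NULL || count == 0);
    unsigned char *buf = malloc(count * 4 + 1); /* worst case: 4 bytes each */
    if (buf == NULL) return NULL;
    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        size_t n = utf8_encode(cps[i], buf + used);
        if (n == 0) {
            free(buf);
            return NULL;
        }
        used += n;
    }
    buf[used] = '\0';
    return (char *)buf;
}
```

Calling code_points_to_utf8 on the two code points U+6D85 and U+69C3 yields six bytes plus the terminator, which shows why a one-byte-per-character base-string cannot represent the text.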
ffi:with-foreign-string oddly truncates
A related set of bindings:
(ffi:def-function ("_DrawTextEx" draw-text-ex-raw)
    ((font (* font-raw))
     (text (* :char))
     (position (* vector2-raw))
     (font-size :float)
     (spacing :float)
     (tint (* color-raw)))
  :returning :void)

(declaim (ftype (function (font string vector2 real real color)) draw-text-ex))
(defun draw-text-ex (font text position font-size spacing tint)
  "Draw text using a `font' and additional parameters."
  (ffi:with-foreign-string (ctext text)
    (draw-text-ex-raw (font-pointer font)
                      ctext
                      (vector2-pointer position)
                      (float font-size)
                      (float spacing)
                      (color-pointer tint))))
Here I use (* :char) instead and convert the Lisp string manually via ffi:with-foreign-string. However, the C side only receives:
C: 67
\0: 0
It seems that it only respects the first character of the original string. This occurs whether the original string was all standard-char or not.
Here's the relevant function within ECL:
(defun convert-to-foreign-string (string-designator)
  "Syntax: (convert-to-foreign-string string-designator)
Converts a Lisp string to a foreign string. Memory should be freed
with free-foreign-object."
  (let ((lisp-string (string string-designator))
        (foreign-type '(* :char)))
    (c-inline (lisp-string foreign-type) (t t) t
      "{
        cl_object lisp_string = #0;
        cl_index size = lisp_string->base_string.fillp;
        cl_object output = ecl_allocate_foreign_data(#1, size+1);
        memcpy(output->foreign.data, lisp_string->base_string.self, size);
        output->foreign.data[size] = '\\0';
        @(return) = output;
      }"
      :one-liner nil
      :side-effects t)
    ))
Looking at the embedded C, it might be a mistake to rely on base_string.fillp (the fill pointer, I assume) when the string is a non-adjustable simple-string with a NIL fill pointer. Note that this particular line was written in 2006, so this would be quite an old bug.
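Another possible explanation, offered only as a hypothesis: the embedded C reads its argument through the base_string view without checking the string's actual type. If the Lisp string is stored with 4-byte characters (an assumption about this :UNICODE build, which I have not verified), then copying size bytes instead of size characters would produce exactly the observed result on a little-endian machine. The sketch below mimics that copy; the function name is hypothetical and this is not ECL code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the copy in convert-to-foreign-string, applied
 * to a buffer of 32-bit characters. It copies `length` BYTES, not `length`
 * characters, then NUL-terminates. The caller frees the result. */
char *copy_chars_as_bytes(const uint32_t *wide, size_t length) {
    assert(wide != NULL);
    char *out = malloc(length + 1);
    if (out == NULL) return NULL;
    memcpy(out, wide, length); /* only the first length/4 characters fit */
    out[length] = '\0';
    return out;
}
```

On a little-endian machine, passing the four code points of Café with a length of 4 copies the bytes 'C', 0, 0, 0, so the C side sees a one-character string followed by a terminator, which matches the output above. That would also fit the observation that the truncation happens regardless of whether the string contains non-ASCII characters.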
Thank you for reading this report and for any guidance you can offer.
VERSION "24.5.10"
VCS-ID "c0720610ddd12d709508c99f25ac56bf8c73e0a2"
OS "Linux"
OS-VERSION "6.15.6-arch1-1"
MACHINE-TYPE "x86_64"
FEATURES (:SLYNK :SERVE-EVENT :ASDF-PACKAGE-SYSTEM :ASDF3.1 :ASDF3 :ASDF2
:ASDF :OS-UNIX :NON-BASE-CHARS-EXIST-P :ASDF-UNICODE :WALKER
:CDR-6 :GRAY-STREAMS-MODULE :CDR-1 :CDR-5 :LINUX :FORMATTER
:CDR-7 :ECL-WEAK-HASH :LITTLE-ENDIAN :ECL-READ-WRITE-LOCK
:LONG-LONG :UINT64-T :UINT32-T :UINT16-T :COMPLEX-FLOAT
:LONG-FLOAT :UNICODE :CLOS-STREAMS :CMU-FORMAT :UNIX :ECL-PDE
:DLOPEN :CLOS :THREADS :BOEHM-GC :ANSI-CL :COMMON-LISP
:FLOATING-POINT-EXCEPTIONS :IEEE-FLOATING-POINT
:PACKAGE-LOCAL-NICKNAMES :CDR-14 :PREFIXED-API :FFI :X86_64
:COMMON :ECL)
