varargs undefined behavior bug

Many parts of ECL's C codebase call varargs functions through function pointers of an incompatible type. This is undefined behavior, and it manifests as a bug on Apple Silicon. The basic problem is most simply illustrated by the following example program:

#include <stdio.h>

void testfun(int z, int* x1, ...) {
    printf("z = %i\nx1 = %i\n", z, *x1);
}

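int main () {
    /* The pointer type omits the named parameter x1, so the call below
       goes through an incompatible function type: undefined behavior. */
    void (*f)(int z, ...) = (void (*)(int z, ...))&testfun;
    int x1 = 3;
    f(9, &x1);
}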

When compiled with clang for the new "Apple Silicon"/M1 chip, this prints a garbage value for x1, whereas when compiled for x86_64 it prints 3, as expected.
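As far as I understand, the difference comes from the calling convention: on Apple's arm64 ABI, variadic arguments are passed on the stack while named arguments go in registers, so a call through `(int z, ...)` puts `&x1` where `testfun` does not look for it. For comparison, here is a minimal sketch of a well-defined variant, in which the definition has exactly the type of the pointer it is called through. The name `testfun_va` is made up for illustration, and it returns the value rather than printing it so the result is easy to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variant of testfun: only z is a named parameter, so the
   definition has exactly the type int (*)(int, ...). */
int testfun_va(int z, ...) {
    va_list args;
    va_start(args, z);
    int *x1 = va_arg(args, int *);  /* read from the variadic arguments */
    va_end(args);
    assert(x1 != NULL);
    return *x1;
}
```

Calling this as `f(9, &x1)` through `int (*f)(int z, ...) = testfun_va;` needs no cast and should yield 3 on both x86_64 and arm64, since caller and callee now agree on which arguments are variadic.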

In ECL, there are many examples of this problem because of (1) the definition

typedef cl_object (*cl_objectfn)(cl_narg narg, ...);

and (2) the fact that many objects treated as type cl_objectfn actually have a more specific signature. For example, consider string_equal in src/c/string.d, which gets compiled to cl_string_equal with signature

cl_object cl_string_equal(cl_narg narg, cl_object string1, cl_object string2, ...)

Indeed, the use of this function blocks the compilation of ECL on the Apple Silicon platform: ecl_min segfaults unless the line here in test_compare is changed to

  return ((cl_object (*)(cl_narg narg, cl_object obj1, cl_object obj2, ...))t->test_fn)(2, t->item_compared, x) != ECL_NIL;

although there are other failures in ECL associated with this bug (and of course this cast cannot be the fix, since in general t->test_fn may actually be of type cl_objectfn).

I am happy to try to fix this, but is there a recommended approach? One reasonable approach appears to be changing every function that is used as a cl_objectfn to have exactly that signature (and updating the argument parsing in those functions accordingly), but if there is another suggestion please let me know.
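To make the proposal concrete, here is a rough sketch of what such a change might look like. Everything in it is a hypothetical stand-in rather than ECL code: `obj_t`, `narg_t` and `objfn_t` play the roles of cl_object, cl_narg and cl_objectfn, and `sketch_equal` uses a plain `strcmp` in place of the real comparison. The point is only that the required arguments are pulled out of the `va_list` instead of being named parameters, so the definition matches the pointer type it is called through.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

/* Hypothetical stand-ins for cl_object, cl_narg and cl_objectfn. */
typedef const char *obj_t;
typedef int narg_t;
typedef obj_t (*objfn_t)(narg_t narg, ...);

#define OBJ_NIL ((obj_t)0)
static const char OBJ_TRUE[] = "T";

/* Declared with exactly the objfn_t signature: the two required
   arguments are read from the va_list rather than named. */
obj_t sketch_equal(narg_t narg, ...) {
    assert(narg >= 2);  /* stand-in for a wrong-number-of-arguments error */
    va_list args;
    va_start(args, narg);
    obj_t string1 = va_arg(args, obj_t);
    obj_t string2 = va_arg(args, obj_t);
    /* optional or keyword arguments would be read here with more va_arg calls */
    va_end(args);
    return strcmp(string1, string2) == 0 ? OBJ_TRUE : OBJ_NIL;
}

/* Caller in the style of test_compare: no cast is needed, because the
   pointer type and the definition agree. */
int sketch_test_compare(objfn_t test_fn, obj_t item, obj_t x) {
    return test_fn(2, item, x) != OBJ_NIL;
}
```

The cost of this approach is that every affected function has to do its own required-argument extraction, which is the "update the argument parsing" part mentioned above.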

Edited by Yuri Lensky