[YottaDB/DBMS/YDBOcto#205] Move 1Mb size xecute_buffer variable from stack to heap (works around SIG-11 with valgrind)

Narayanan Iyer requested to merge nars1/YDB:valgrind into master

Background

  • See YottaDB/DBMS/YDBOcto!1054 (comment 874182919) and preceding discussion for more details.

  • While testing YDBOcto with valgrind, we encountered test failures with the following symptom.

    ==20659== Can't extend stack to 0x1ffeeb8f58 during signal delivery for thread 1:
    ==20659==   too small or bad protection modes
    ==20659==
    ==20659== Process terminating with default action of signal 11 (SIGSEGV): dumping core
    ==20659==  Access not within mapped region at address 0x1FFEEB8F58
    ==20659==    at 0x4849FD8: strncpy (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==20659==    by 0x489883B: UnknownInlinedFun (string_fortified.h:95)
    ==20659==    by 0x489883B: cli_present (cli_parse.c:889)
    ==20659==    by 0x4897533: cli_get_str (cli.c:283)
    ==20659==    by 0x4BBA0F3: trigger_parse (trigger_parse.c:1426)
    ==20659==    by 0x4B046B7: trigger_update_rec (trigger_update.c:1386)
    ==20659==    by 0x4BD7456: trigger_update_rec_helper.constprop.0 (trigger_update.c:2171)
    ==20659==    by 0x4B7990A: UnknownInlinedFun (trigger_update.c:2224)
    ==20659==    by 0x4B7990A: op_fnztrigger (op_fnztrigger.c:248)
    ==20659==    by 0x80517B4: _ydboctoplanhelpers (in YDBOcto/build/src/utf8/_ydbocto.so)
    ==20659==    by 0x9D06DEF: ???
    ==20659==    by 0x1FFEFFB0CF: ???
    ==20659==    by 0x1FFEFFB105: ???
    ==20659==    by 0x1FFEFFB113: ???
    ==20659==  If you believe this happened as a result of a stack
    ==20659==  overflow in your program's main thread (unlikely but
    ==20659==  possible), you can try to increase the size of the
    ==20659==  main thread stack using the --main-stacksize= flag.
    ==20659==  The main thread stack size used in this run was 268435456.
  • I believe we use only about 1.25Mb of stack, so it is not clear why a 16Mb stack was not enough (the run above already used a 256Mb main thread stack). I posted a question at https://sourceforge.net/p/valgrind/mailman/message/37625144 to see if this is a valgrind issue. An excerpt from that post follows.

    Originally I got a failure with the --main-stacksize set to 16Mb so I bumped it to 256Mb. And I still keep getting this failure at different tests. I also set the ulimit for stacksize to 256Mb just in case and I still see the failures.

Work around

  • In general, it is not a good idea to allocate huge variables on the stack. Of the 1.25Mb or so of stack space that I expect us to be using in the above stack trace, 1Mb is occupied by the xecute_buffer variable in the trigger_update_rec() function in sr_unix/trigger_update.c.

  • Therefore, this commit moves that variable from the stack to the heap to see if that avoids valgrind failures like the one above. It does, although it is not clear why.

Changes

  • We now malloc() and free() the xecute_buffer variable. Because it is now a pointer rather than an array of characters, SIZEOF() would yield the size of the pointer instead of the size of the buffer, so it can no longer be used. Those uses are replaced with the macro expression that represents the allocated array size.
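
The change described above can be sketched roughly as follows. This is a minimal illustration and not the actual code in sr_unix/trigger_update.c: the macro name XECUTE_BUF_LEN, its 1Mb value, and both function names are hypothetical, and plain malloc()/free() and sizeof stand in for whatever allocation wrappers and SIZEOF() macro the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the macro expression that gives the buffer size. */
#define XECUTE_BUF_LEN (1024 * 1024)

/* Before: the 1Mb buffer is an array on the stack, so sizeof() is its capacity. */
size_t copy_with_stack_buffer(const char *src)
{
	char	xecute_buffer[XECUTE_BUF_LEN];
	size_t	cap = sizeof(xecute_buffer);	/* == XECUTE_BUF_LEN */

	strncpy(xecute_buffer, src, cap - 1);
	xecute_buffer[cap - 1] = '\0';
	return strlen(xecute_buffer);
}

/* After: the buffer lives on the heap and is freed before returning. */
size_t copy_with_heap_buffer(const char *src)
{
	char	*xecute_buffer;
	size_t	len;

	xecute_buffer = malloc(XECUTE_BUF_LEN);
	if (NULL == xecute_buffer)
		return (size_t)-1;	/* allocation failed */
	/* sizeof() on a pointer is the pointer size, not the buffer size ... */
	assert(sizeof(xecute_buffer) == sizeof(char *));
	/* ... so the size macro is used wherever sizeof() was used before. */
	strncpy(xecute_buffer, src, XECUTE_BUF_LEN - 1);
	xecute_buffer[XECUTE_BUF_LEN - 1] = '\0';
	len = strlen(xecute_buffer);
	free(xecute_buffer);
	return len;
}
```

Both functions return the same result for the same input. The only difference is where the 1Mb lives, which in the heap version keeps the function's stack frame small.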
