Skip to content

[#869] Fix SIG-11 (instead of NUMOFLOW error) in boolean expressions when huge numerics (e.g. 1E47) are used

Narayanan Iyer requested to merge nars1/YDB:ydb869 into master

Background

  • Below is pasted from #869 (closed)

  • This is an issue that was found by fuzz testing. Various expressions end up in SIG-11 in Release builds and Assert failures in Debug builds. This happens only in YottaDB (and from r1.24 release onwards). Not in GT.M.

  • In GT.M though, I expect to see a NUMOFLOW error but see the expression evaluated fine which makes me think there is a correctness issue there. Not sure yet.

  • Below are the failure details.

  1. Original test case that failed during fuzz testing

    Debug build

    YDB>write $select(mmfncd'='"77777777777777777777777777777777777777777777777:("rite>"_mmfncd),1:"no got it")
    %YDB-E-COLON, Colon (:) expected in this context
            write $select(mmfncd'='"77777777777777777777777777777777777777777777777:("rite>"_mmfncd),1:"no got it")
                                                                                      ^-----
    YDB>write $select(mmfncd'='"777777777777777777777777777777777777777777777777:("rite>"_mmfncd),1:"no got it")
    %YDB-F-ASSERT, Assert failed in sr_port/ex_tail.c line 82 for expression ((OC_COMVAL == t2->opcode) || (OC_COMINT == t2->opcode))

    Release build

    YDB>write $select(mmfncd'='"77777777777777777777777777777777777777777777777:("rite>"_mmfncd),1:"no got it")
    %YDB-E-COLON, Colon (:) expected in this context
            write $select(mmfncd'='"77777777777777777777777777777777777777777777777:("rite>"_mmfncd),1:"no got it")
                                                                                      ^-----
    YDB>write $select(mmfncd'='"777777777777777777777777777777777777777777777777:("rite>"_mmfncd),1:"no got it")
    %YDB-F-KILLBYSIGSINFO1, YottaDB process 6822 has been killed by a signal 11 at address 0x00007F0C4E7502F5 (vaddr 0x0000000000000060)
    %YDB-F-SIGMAPERR, Signal was caused by an address not mapped to an object
  2. Simpler test case that shows the same issue

    Debug build

    YDB>write x='"1E47"
    %YDB-F-ASSERT, Assert failed in sr_port/ex_tail.c line 82 for expression ((OC_COMVAL == t2->opcode) || (OC_COMINT == t2->opcode))

    Release build

    YDB>write x='"1E47"
    %YDB-F-KILLBYSIGSINFO1, YottaDB process 82048 has been killed by a signal 11 at address 0x00007FF7D3FA54AC (vaddr 0x0000000000000060)
    %YDB-F-SIGMAPERR, Signal was caused by an address not mapped to an object
  3. Another test case with different symptom. This fails only in Debug builds. Runs fine in Release builds.

    Debug build

    YDB>write 0='"1E47"
    %YDB-F-ASSERT, Assert failed in sr_port/bx_boollit.c line 206 for expression ((OC_NEG != ref0->opcode) && (OC_FORCENUM != ref0->opcode) && (OC_COM != ref0->opcode))
  4. A few more test cases that failed.

    Debug build

    YDB>set x=(1!$s($$^truZQUIT='"1E47":0,1:1))
    %YDB-F-ASSERT, Assert failed in sr_port/ex_tail.c line 160 for expression ((OC_COMVAL == t2->opcode) || (OC_COMINT == t2->opcode))
    
    YDB>set (z,$iv)=(1!$s($$^truZQUIT='"1E47":"ok",1:1))
    %YDB-F-ASSERT, Assert failed in sr_port/ex_tail.c line 82 for expression ((OC_COMVAL == t2->opcode) || (OC_COMINT == t2->opcode))
  5. More tests that fail

    Debug build

    YDB>i i>1,$zbitfind(b2,'+"1E47"=i w !,"oops"
    %YDB-F-ASSERT, Assert failed in sr_port/ex_tail.c line 82 for expression ((OC_COMVAL == t2->opcode) || (OC_COMINT == t2->opcode))
    
    YDB>w '+"1E47"=1
    %YDB-F-ASSERT, Assert failed in sr_port/bx_boollit.c line 206 for expression ((OC_NEG != ref0->opcode) && (OC_FORCENUM != ref0->opcode) && (OC_COM != ref0->opcode))
  • Test cases (1), (2), (3) and (4) started failing from YottaDB r1.24 only. That is, they worked fine in YottaDB r1.22.

  • I suspect the cause of the regression is the below commit.

    • 26173873; 2018-10-03 15:42:47 -0400; Narayanan Iyer; [#363 (closed)] Compiler returns incorrect results in cases where it should issue NUMOFLOW
  • Test case (5) though fails even in YottaDB r1.22. But only in Debug builds. Below is the output in that release.

    Debug build

    YDB>i i>1,$zbitfind(b2,'+"1E47"=i w !,"oops"
    %YDB-F-ASSERT, Assert failed in sr_port/coerce.c line 47 for expression (MV_NM & ref->operand[0].oprval.tref->operand[0].oprval.mlit->v.mvtype)

    Release build

    YDB>i i>1,$zbitfind(b2,'+"1E47"=i w !,"oops"
    %YDB-E-RPARENMISSING, Right parenthesis expected
            i i>1,$zbitfind(b2,'+"1E47"=i w !,"oops"
                                         ^-----

Issue

  • In case of an expression like write x='"1E47", there would be an OC_LIT triple corresponding to the literal "1E47". There would then be an OC_COM triple to implement the complement operator (') on the OC_LIT triple. And there would then be a OC_COMVAL triple to convert the boolean result into an mval for later use in the OC_EQU triple (that implements the = operator).

  • The "1E47" literal cannot be computed at compile time as it has a NUMOFLOW error which will be issued at runtime. Therefore, literal optimization should not continue if it encounters literals that would end up with a NUMOFLOW error. But this was not the case in the code.

  • So the literal optimization code in sr_port/bx_boollit.c incorrectly assumed the '"1E47" computation can be done at compile time and removed the unnecessary OC_COMVAL triple (using a dqdel()) which later confused the logic in ex_tail.c and failed an assert where it was expecting a OC_COMVAL or OC_COMINT triple.

Fix

  • In sr_port/bx_boollit.c, we now check if a literal is a number that evaluates to a NUMOFLOW error and if the unary operation that is being performed on it is a OC_COM (unary ' operator), OC_NEG (unary - operator) or OC_FORCENUM (unary + operator). If so, we return after issuing a NUMOFLOW error as literal optimization cannot be performed. This avoids the dqdel() (which happens later in the same function) from happening thereby avoiding the assert failure.

  • Because of the newly added check (in the previous bullet), a pre-existing similar check a few lines later (right after a s2n() call) is now replaced with an assert since we would never reach here in case of a NUMOFLOW error (we would have returned from the newly added check a few lines above).

  • An unrelated change to sr_port/bx_boollit.c was to remove the line neg = num = 0; as I did not see it do anything (neg and num are set to a value later anyways). This fixed 3 warnings from clang-tidy and so the following reference files were updated to remove those 3 warning lines.

    • ci/tidy_warnings_release.ref
    • ci/tidy_warnings_debug.ref
Edited by Narayanan Iyer

Merge request reports