Skip to content

[#860] Fix SIG-11 when boolean expressions occur inside extended reference using [] syntax

Narayanan Iyer requested to merge nars1/YDB:ydb860 into master

Background

  • This is pasted from #860 (comment 1079650087).

  • This comment is to track a longstanding issue identified by ongoing fuzz testing. This is an issue present even in the upstream GT.M versions.

  • Below is a simple test case demonstrating the issue.

    Release build

    YDB>lock +[(0!^|"x"|a)]x
    %YDB-F-KILLBYSIGSINFO1, YottaDB process 31691 has been killed by a signal 11 at address 0x00007FEB3DDC7E15 (vaddr 0x0000000000000008)
    %YDB-F-SIGMAPERR, Signal was caused by an address not mapped to an object

    Debug build

    YDB>lock +[(0!^|"x"|a)]x
    %YDB-F-ASSERT, Assert failed in sr_port/gvn.c line 188 for expression (NULL != TREF(expr_start))

Issue

  • Below is the stack trace from the assert failure

    (gdb) where
    .
    .
    #7  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:99
    #8  gvn () at sr_port/gvn.c:188
    #9  glvn (a=0x7ffde466d940) at sr_port/glvn.c:38
    #10 expratom (a=0x7ffde466d940) at sr_port/expratom.c:27
    #11 eval_expr (a=0x7ffde466dc00) at sr_port/eval_expr.c:248
    #12 expritem (a=0x7ffde466dc00) at sr_port/expritem.c:551
    #13 expratom (a=0x7ffde466dc00) at sr_port/expratom.c:29
    #14 expratom_coerce_mval (a=0x7ffde466dc00) at sr_port/expratom_coerce_mval.c:34
    #15 lkglvn (gblvn=0) at sr_port/lkglvn.c:63
    #16 nref () at sr_port/nref.c:40
    #17 m_lock () at sr_port/m_lock.c:93
    #18 cmd () at sr_port/cmd.c:312
    #19 linetail () at sr_port/linetail.c:35
    #20 op_commarg (v=0x5603a5cfe598, argcode=19 '\023') at sr_port/op_commarg.c:84
    #21 op_dmode () at sr_port/op_dmode.c:159
    
    (gdb) f 8
    #8  gvn () at sr_port/gvn.c:188
    188                             assert(NULL != TREF(expr_start));
  • The issue is that in frame number 8, we saw TREF(shift_side_effects) to be TRUE at line 55.

    sr_port/gvn.c

       55         if (shifting = (TREF(shift_side_effects) && (!TREF(saw_side_effect) || (YDB_BOOL == TREF(ydb_fullbool)
  • This caused the shifting variable to be set to TRUE.

  • And at the end of that function, we had to insert a OC_GVSAVTARG triple but found that TREF(expr_start) was NULL.

  • The issue is that TREF(expr_start) and TREF(shift_side_effects) were out of sync.

  • If TREF(shift_side_effects) was non-zero, then TREF(expr_start) should also have been non-NULL.

  • TREF(shift_side_effects) was set to 1 by frame number 11 in the below line.

    sr_port/eval_expr.c

      104                                 TREF(shift_side_effects) = TRUE;
  • And TREF(expr_start) was also set to a non-NULL value around then.

    sr_port/eval_expr.c

       95                                 TREF(expr_start) = TREF(expr_start_orig) = ref;
  • But the issue was that frame 11 gvn() invoke expr()

    sr_port/gvn.c

       69                         parse_status = expr(sb1++, MUMPS_EXPR);

    And that in turn did the following.

    sr_port/expr.c

       29         INCREMENT_EXPR_DEPTH;

    And this macro found TREF(expr_depth) set to 0 and therefore cleared TREF(expr_depth)

    sr_port/compiler.h

      420 #define INCREMENT_EXPR_DEPTH
      424         if (!(TREF(expr_depth))++)
      425                 TREF(expr_start) = TREF(expr_start_orig) = NULL;
  • Therefore, TREF(expr_start) was non-NULL when we entered frame 11 gvn() but was NULL towards the end of that function and that is the issue.

  • The real issue is that TREF(expr_depth) was 0 even though we were already evaluating a boolean expression (and doing shifting operations for global references).

  • And the cause of this is that there are 3 callers of eval_expr().

    • sr_port/bool_expr.c
    • sr_port/expr.c
    • sr_port/expritem.c
  • The first 2 of the above callers do a INCREMENT_EXPR_DEPTH before calling eval_expr().

  • But the 3rd caller does not. And that is where the issue lies.

  • It is not clear to me why this inconsistency was there all this while. I suspect it is an oversight instead of being intentional.

Fix

  • The fix is very simple and that is to call INCREMENT_EXPR_DEPTH (and DECREMENT_EXPR_DEPTH) in the 3rd caller sr_port/expritem.c before calling eval_expr(). This ensures TREF(expr_depth) stays a non-zero value in case TREF(expr_start) gets set to a non-NULL value inside eval_expr().

  • Additionally, I also added an assert in sr_port/eval_expr.c that if ever we set TREF(expr_start) to a non-NULL value, the TREF(expr_depth) global variable better be greater than 0.

Merge request reports