1. 06 Jul, 2020 1 commit
    • Joshua Nelson's avatar
      [#595] Implement yottadb -version · 500e33a8
      Joshua Nelson authored
      Example output:
      
      ```
      $ ./yottadb -version
      YottaDB release:         r1.29
      Upstream base version:   GT.M V6.3-008
      Platform:                Linux x86_64
      Build date/time:         2020-07-03 14:14
      Build commit SHA:        1110c868 (dirty)
      ```
      
      - Add delimiter between date components so that output has the format
      YYYY-MM-DD
      500e33a8
  2. 03 Jul, 2020 1 commit
  3. 29 Jun, 2020 2 commits
    • Narayanan Iyer's avatar
      [#596] Maintain $ZTDELIM (if -delim/-zdelim was specified) for... · 11cffd3a
      Narayanan Iyer authored
      [#596] Maintain $ZTDELIM (if -delim/-zdelim was specified) for KILL/ZKILL/ZTRIG trigger commands too
      
      * The main change is to ensure $ZTDELIM is maintained even for KILL/ZKILL/ZTRIG trigger commands.
      
      * In addition, `-PIECES` was previously tied to whether `-DELIM` and/or `-ZDELIM` was also specified.
        Instead, it is now tied to whether `SET` trigger command type is also specified. And an error is
        issued otherwise.
      11cffd3a
    • Joshua Nelson's avatar
      Fix pre-commit hook to not require a dependency on env 8.30 · 260e4dac
      Joshua Nelson authored
      Previously, the pre-commit script would give an error on Ubuntu 18.04
      and any other OS which did not have an extremely recent version of env:
      
      ```
      /usr/bin/env: invalid option -- 'S'
      Try '/usr/bin/env --help' for more information.
      ```
      
      The -S parameter was introduced to allow passing arguments to tcsh while
      still allowing it to be anywhere in $PATH. However the only argument
      being passed is `-f`, which only speeds up the start time slightly and
      does not affect the behavior. So it can be removed without breaking
      anything.
      260e4dac
  4. 26 Jun, 2020 2 commits
    • Joshua Nelson's avatar
      [#590] Add git commit to CMake release stamp · 694e26f2
      Joshua Nelson authored
      This will be updated whenever `make` is run and the git commit has
      changed. Example output:
      
      ```
      $ ./yottadb -dir <<< 'zwr $ZRELDATE'
      
      YDB>
      $ZRELDATE="20200624 12:02 b03d4520 (dirty)"
      ```
      
      - Add git to dockerfile
      - Don't rename release_name.h
      
      Since the preprocessed file is output to a different directory, there is
      no need to change the name. This makes rebasing over upstream GT.M
      changes much easier.
      
      - Update example release stamp
      - Update dependencies in CMake
      - Use different file for YDB_RELEASE_STAMP to work around dependency issues
      
      Although CMake will regenerate the files properly if they are named the
      same (for instance: `sr_linux/release_name.h` and `build/release_name.h`),
      they will not be handled correctly by the preprocessor. Trying to put
      `#define YDB_RELEASE_STAMP      "@[email protected]"` directly in
      `sr_linux/release_name.h` led to the build file being regenerated every
      time the commit changed, but also lead to `YDB_RELEASE_STAMP` being
      defined as the literal string `@[email protected]`. My belief is that
      the preprocessor was using the file in the same directory as the source
      files instead of the file in the build directory.
      
      To work around this, a new file is introduced which is not present in
      the source directory. This forces the preprocessor to look in the build
      directory instead.
      
      - Use `--short` for git commit
      - Add git to debian install
      - Use cmake3 on centos
      
      This avoids strange dependency issues when generating the git commit.
      
      - Use find_program instead of find_package
      
      This avoids the following CentOS-only issue:
      
      ```
      CMake Error at CMakeLists.txt:416 (target_link_libraries):
        IMPORTED library can only be used with the INTERFACE keyword of
        target_link_libraries
      Call Stack (most recent call first):
        /usr/share/cmake3/Modules/FindGit.cmake:90 (add_executable)
        cmake/git-watcher.cmake:71 (find_package)
        CMakeLists.txt:1006 (include)
      ```
      
      - Don't set `GIT_NOTFOUND` in $ZRELDATE if git wasn't found
      
      Before:
      ```
      YDB>zwr $ZRELDATE
      $ZRELDATE="20200625 20:50 GIT-NOTFOUND"
      ```
      
      After:
      ```
      YDB>zwr $ZRELDATE
      $ZRELDATE="20200625 21:02  "
      ```
      
      - Add a warning with debug information if git failed to report status
      - Make check_git_repository a dependency of libyottadb and libmumps directly
      
      Previously it was a dependency of the yottadb executable, which meant
      code duplication and also prevented building `libyottadb` without also
      building the executable.
      
      I also tried making it a dependency of libyottadb only, but that led to
      the following intermittent dependency issue:
      
      ```
      In file included from /tmp/yottadb-src/sr_x86_64/obj_filesp.c:56:
      /tmp/yottadb-src/sr_linux/release_name.h:29:11: fatal error: release_commit.h: No such file or directory
       # include "release_commit.h"
                 ^~~~~~~~~~~~~~~~~~
      ```
      
      - Add instructions for installing cmake3
      694e26f2
    • Joshua Nelson's avatar
      Improve pre-commit hook · df8ff69b
      Joshua Nelson authored
      - Don't require tcsh to be in /usr/local/bin (on my machine it was in
      /usr/bin)
      - Document how to set up the pre-commit hook
      - Mention that you have to be in the top-level directory for the example
      setup
      df8ff69b
  5. 25 Jun, 2020 1 commit
    • Narayanan Iyer's avatar
      [#594] Fix ydb_incr_s() to buffer lvn value in stringpool as caller buffer is... · 623dc00d
      Narayanan Iyer authored
      [#594] Fix ydb_incr_s() to buffer lvn value in stringpool as caller buffer is not guaranteed to persist
      
      lvn tree should not have any nodes with values pointing to SimpleAPI caller buffers as they are not
      guaranteed to persist. If those buffers are later freed, the lvn tree will be pointing to garbage
      memory resulting in SIG-11 or incorrect results.
      
      We noticed the SIG-11 in the `go/randomWalk` subtest (very rare as it happened once in hundreds of
      test runs).  The incorrect result part is easily reproducible with a `ydb_dbglvl` env var of `0x40`
      and is exercised by the `r130/ydb594` subtest.
      623dc00d
  6. 24 Jun, 2020 2 commits
    • Narayanan Iyer's avatar
      Add RTLD_GLOBAL flag to dlopen() of a YottaDB-plugin (avoids "undefined symbol" errors) · bfd73f11
      Narayanan Iyer authored
      Description
      ------------
      This is an issue that was encountered by Stefan Traby and the fix was provided by him.
      
      Below is his description of the issue.
      
      1) I'm working on a perl-plugin for YottaDB.
      2) I got the proof-of-concept up and running quite fast by using the YDBPosix-plugin as build-template.
      3) What I didn't get right without a YottaDB-patch is the perl "Dynaloader" which loads .so-libs at runtime.
         It issued the following error.
      
         ```sh
         /usr/local/lib/yottadb/r129/yottadb: symbol lookup error: /usr/lib/x86_64-linux-gnu/perl/5.28/auto/Fcntl/Fcntl.so: undefined symbol: Perl_xs_handshake
         ```
      
      Fix
      ----
      The fix is to add `RTLD_GLOBAL` in the `dlopen()` in `fgn_getpak()` function (defined in `sr_unix/fgn_getinfo.c`).
      
      The `dlopen()` has been that way since day one and I don't see any reason why `RTLD_GLOBAL` would make it
      worse. Given that it helps in at least one known case now, it is considered okay to add it.
      
      Pasted below are the `dlopen()` in the entire code base BEFORE this commit. All of them already have
      `RTLD_GLOBAL` except for 2. One of them is being fixed in this commit. The other is in
      `sr_linux/hugetlbfs_overrides.c`. In that case, we are interested in only 2 symbols and we do `dlsym()`
      calls for them right away so there is no need of `RTLD_GLOBAL` seen at this point.
      
      ```c
      ./sr_linux/hugetlbfs_overrides.c:	handle = dlopen(lpath, RTLD_NOW);
      ./sr_unix/fgn_getinfo.c:	ret_handle = dlopen(lpath, RTLD_LAZY);
      ./sr_unix/gtm_zlib.c:	handle = dlopen( lpath, ZLIB_LIBFLAGS | RTLD_MEMBER);
      ./sr_unix/gtmcrypt_util.h:	handle = dlopen(NULL, (RTLD_NOW | RTLD_GLOBAL));								\
      ./sr_unix/gtm_tls_loadlibrary.c:	if (NULL == (handle = dlopen(libpath, RTLD_GLOBAL | RTLD_NOW)))
      ./sr_unix/gtmcrypt_entry.c:	handle = dlopen(&resolved_libpath[0], RTLD_NOW | RTLD_GLOBAL);
      ./sr_unix/dlopen_libyottadb.c:	handle = dlopen(&libyottadb_realpath[0], (RTLD_NOW | RTLD_GLOBAL));
      ```
      bfd73f11
    • Narayanan Iyer's avatar
      [DEBUG] Remove a rarely incorrect assert in COMPSWAP_UNLOCK macro · 39e5c98c
      Narayanan Iyer authored
      * The below M program spawns off around 1000 child processes each of which immediately quit.
        This uses `com/job.m` from the YDBTest repo which does M lock operations in the parent
        and child processes.
      
        ```m
        $ cat x.m
        x       ;
                do ^job("child^x",500+$random(1000),"""""")
                quit
        child   ;
                quit
        ```
      
        When running this M program around a 100 times or so, some of the child processes were found
        to terminate abnormally with an assert failure around 5% of the test runs.
      
        ```
        %YDB-F-ASSERT, Assert failed in sr_port/mlk_garbage_collect.h line 59 for expression ((&ctl->lock_gc_in_progress)->u.parts.latch_pid == old_gc)
        ```
      
        With the following C-stack trace.
      
        ```c
        (gdb) where
        #0  pthread_kill () from /usr/lib/libpthread.so.0
        #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:74
        #2  gtm_fork_n_core () at sr_unix/gtm_fork_n_core.c:163
        #3  ch_cond_core () at sr_unix/ch_cond_core.c:80
        #4  rts_error_va (csa=0x0, argcnt=7, var=0x7ffed96b4e80) at sr_unix/rts_error.c:192
        #5  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:99
        #6  prepare_for_gc (pctl=0x5558188b1340) at sr_port/mlk_garbage_collect.h:59
        #7  mlk_lock (p=0x5558188b1340, auxown=0, new=0 '\000') at sr_port/mlk_lock.c:114
        #8  op_lock2_common (timeout=9223372036854775800, laflag=0 '\000') at sr_port/op_lock2.c:383
        #9  op_lock2 (timeout=0x7fdc779f7070, laflag=0 '\000') at sr_port/op_lock2.c:158
        #10 op_lock (timeout=0x7fdc779f7070) at sr_port/op_lock.c:36
      
        (gdb) f 6
        #6  0x00007f8ee778b1a8 in prepare_for_gc (pctl=0x55b034213c40) at /Distrib/YottaDB/V999_R129/sr_port/mlk_garbage_collect.h:59
        59                              COMPSWAP_UNLOCK(&ctl->lock_gc_in_progress, old_gc, LOCK_AVAILABLE);
      
        (gdb) p ctl->lock_gc_in_progress->u.parts.latch_pid
        0
      
        (gdb) p old_gc
        $2 = 37247
      
        (gdb) p process_id
        $3 = 37019
        ```
      
        The assert failure is because the process in `prepare_for_gc()` P1 found a pid P2 to be holding
        the latch and later found P2 exited. So P1 invoked the `COMPSWAP_UNLOCK` macro on the latch using
        P1 as the old value of the latch. But it turns out that the latch had already been released.
        This is because P2 had released it as part of its exit processing.
      
        The assert in COMPSWAP_UNLOCK is incorrect in this case because this is a situation where the latch
        holder is not releasing it but a completely unrelated process is trying to salvage the latch thinking
        it is being held by a dead process.
      
        This assert was added to the COMPSWAP_UNLOCK macro as part of #61 (commit @26391b73). While this
        assert was true at that time, since then a few cases were added in concurrent GT.M releases
        (e.g. `jnl_write_attempt()`, `prepare_for_gc()`) where lock salvaging code used COMPSWAP_UNLOCK.
      
        In pro code, the COMSWAP_UNLOCK macro does the right thing by releasing the latch/lock only if CMPVAL
        passed in is identical to the latch value.
      
        Therefore there is no real issue. Since the assert is not accurate in rare cases, and since the pro
        code already does the right thing, that portion of the assert is now removed.
      
        With the changes, the test was run 100 times again and this time around there were no assert failures.
      39e5c98c
  7. 22 Jun, 2020 1 commit
    • Narayanan Iyer's avatar
      [#592] Use EXECVPE instead of EXECVE to handle executable names without absolute path name · d8bcad99
      Narayanan Iyer authored
      * The YDBOcto pipeline invoked `yottadb` executable without using an absolute path name (i.e. no
        `$ydb_dist/yottadb` but just `yottadb` where `$ydb_dist` is defined in the PATH env var).
        In this scenario, the changes to the JOB command in a prior commit (3701885d) resulted in a
        pipeline failure with the following diff.
      
        ```
        %YDB-E-JOBFAIL, JOB command failure
        %YDB-I-TEXT, Exec error in Job
        %SYSTEM-E-ENO2, No such file or directory
        ```
      
        This is because the `execve()` call in `ojstartchild.c` was being done with a relative path
        in this case (just `yottadb` instead of `$ydb_dist/yottadb`) and `execve()` does not look into
        the PATH env var to find the absolute path of `yottadb` and so fails with an ENOENT/ENO2 error.
      
        The call that does look at PATH is `execvpe()`. So that is now used.
      
      * This use case has also been added as an automated test in the `r130/ydb592` subtest as part of
        [email protected].
      d8bcad99
  8. 21 Jun, 2020 1 commit
    • Narayanan Iyer's avatar
      [#592] ps should identify JOB'd process with the same name as the parent · 3701885d
      Narayanan Iyer authored
      The main change is to `sr_unix/ojstartchild.c` to check if the base program `invocation_mode`
      is `MUMPS_RUN` or `MUMPS_DIRECT`. If so, we use the original executable name (based on a new
      global variable `invocation_exe_str` if it is non-NULL) as the executable for the jobbed child.
      
      There was one issue with this as `argv[0]` gets changed by `dlopen_libyottadb()` before invoking
      `gtm_main()`. The invocation happens with symlinks resolved which defeats the requirement for #592
      to track the command line argument specified by the user as is (e.g. "mumps" or "yottadb").
      
      For this purpose, `sr_unix/dlopen_libyottadb.c` was changed to not tamper with `argv[0]` in case
      the invocation is for `gtm_main()`. But this tampering is needed for a later call to `ydb_chk_dist()`
      and so that call was changed to use the symlink-expanded full path name (obtained from `realpath()`)
      after having noted down the untampered `argv[0]` in a new global variable `invocation_exe_str`.
      3701885d
  9. 15 Jun, 2020 1 commit
    • Narayanan Iyer's avatar
      [#589] Flush buffered IO writes inside external call when YottaDB process output is piped · 7bcf75be
      Narayanan Iyer authored
      Background
      -----------
      This issue was noticed as part of writing a test case for
      YottaDB/DBMS/YDBOcto!664.
      
      YottaDB has different behavior when stdout is piped and when it isn't! When stdout is not piped,
      the `Hello, world!` string shows up. But when stdout is piped, the string does not show up as can be
      seen below.
      
      ```sh
      $ cc -c -fPIC -I$ydb_dist helloworld.c
      $ cc -o helloworld.so -shared helloworld.o
      $ yottadb helloworld.m
      $ LD_LIBRARY_PATH=".:$LD_LIBRARY_PATH" ydb_xc_c=TC01.xc ydb_routines=". $ydb_routines" yottadb -run helloworld
      Hello, world!
      $ LD_LIBRARY_PATH=".:$LD_LIBRARY_PATH" ydb_xc_c=TC01.xc ydb_routines=". $ydb_routines" yottadb -run helloworld | tee output.txt
      $
      ```
      
      Below are the contents of the relevant external call file and the corresponding C and M programs
      needed to reproduce the above behavior.
      
      ```m
      $ cat helloworld.m
       DO &c.helloworld
       QUIT
      
      $ cat helloworld.xc
      libhelloworld.so
      helloworld: void hello_world()
      ```
      
      ```c
      $ cat helloworld.c
      
      void hello_world() {
          puts("Hello, world!");
      }
      ```
      
      Fix
      ----
      The reason for no output in the piped case is that stdout is opened by YottaDB as file descriptor
      1 (stdout), using unbuffered IO, at process startup but the external call writes to the same file
      descriptor using `puts()` call which does buffered IO to it. There is a mixing of unbuffered and
      buffered IO writes to the same file descriptor by the M and C sections of the same process. And when
      YottaDB is about to terminate, it closes all open IO devices without realizing the `puts()` done by
      the external call is not yet flushed. Later as part of process exit handling, the operating system
      tries to do the flush but it notices the file descriptor 1 has already been closed and the flush
      fails with an `EBADF` error code.
      
      While in general it is not a good idea to mix unbuffered and buffered IO, in this particular case,
      YottaDB could have been more user friendly by doing a `fflush(NULL);` before it started closing any
      of its open file descriptors. That would ensure any writes done by non-YottaDB code (e.g. external
      calls etc.) in the same process do get flushed before any file descriptors tied to that buffered IO
      structure get closed by YottaDB.
      
      In YottaDB code, the convention is to use `FFLUSH` whenever `fflush` usage is desired. Hence the
      change is to add a `FFLUSH(NULL);` call to `io_rundown()` before closing any open file descriptors.
      7bcf75be
  10. 12 Jun, 2020 4 commits
    • Narayanan Iyer's avatar
      [#488] Various fixes (keep fast path unaffected, coding guidelines etc.) · c3aec54a
      Narayanan Iyer authored
      * Ensure fast path (when gbldir xlate is not needed due to env var not defined) has no extra processing
        Affected modules are dpgbldir.c and dpgbdlir_sysops.c
      * Remove debug changes.
      * Decide to document $VIEW("GBLDIRXLATE").
      * Coding guidelines related cosmetic fixes
      c3aec54a
    • Tomas Morstein's avatar
      [#488] Init-time gbldir/env xlate for all YDB utilities · 468b7755
      Tomas Morstein authored
      * Global directory and environment translation to happen at init
        time for all the YDB binaries (MUMPS/MUPIP/DSE/LKE) consistently
      
      * Implement `$VIEW("GBLDIRXLATE")` so that symbolic values are
        correctly recognised and translated even in GDE
      
      * `$R[eference]` to have an untraslated global directory value
        (This is more for consistency. Should not have any performance
        inpact since the translate would be called anyway.)
      468b7755
    • Narayanan Iyer's avatar
      [#586] SYSCALL error in rare cases when MUPIP INTRPT (SIGTERM/SIG-15) is sent to a YottaDB process · 0287e8c0
      Narayanan Iyer authored
      * It is possible a timer interrupt comes in while we are canceling the timer in `sys_canc_timer()`
        (invoked in `generic_signal_handler()`). This can cause problems since we might end up trying to
        start a posix system timer on a non-existing timer id (as shown by the below C-stack we saw in
        a test failure).
      
        ```gdb
        (gdb) where
        #0  __pthread_kill (threadid=<optimized out>, signo=3) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
        #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:74
        #2  gtm_fork_n_core () at sr_unix/gtm_fork_n_core.c:163
        #3  ch_cond_core () at sr_unix/ch_cond_core.c:80
        #4  rts_error_va (csa=0x0, argcnt=7, var=...) at sr_unix/rts_error.c:192
        #5  rts_error_csa (csa=0x0, argcnt=7) at sr_unix/rts_error.c:99
        #6  sys_settimer (tid=1978083808, time_to_expir=0x7eeeb62c) at sr_unix/gt_timers.c:564
        #7  start_first_timer (curr_time=0x7eeeb6fc) at sr_unix/gt_timers.c:633
        #8  timer_handler (why=14, info=0x76bad060 <stapi_signal_handler_oscontext+8808>, context=0x76bad0e0 <stapi_signal_handler_oscontext+8936>) at sr_unix/gt_timers.c:853
        #9  <signal handler called>
        #10 timer_delete (timerid=0x823a38) at ../sysdeps/unix/sysv/linux/timer_delete.c:38
        #11 sys_canc_timer () at sr_unix/gt_timers.c:1041
        #12 generic_signal_handler (sig=15, info=0x76babbc0 <stapi_signal_handler_oscontext+3528>, context=0x76babc40 <stapi_signal_handler_oscontext+3656>) at sr_unix/generic_signal_handler.c:401
        #13 <signal handler called>
        #14 write () at ../sysdeps/unix/syscall-template.S:84
        #15 iorm_wteol (x=1, iod=0x83c420) at sr_unix/iorm_wteol.c:226
        #16 write_text_newline_and_flush_pio (text=0x7eeec298) at sr_port/flush_pio.c:128
        #17 util_out_print_vaparm (message=0x76ab1864 "Blocks coalesced    : !SL ", flush=1, var=..., faocnt=2147483647) at sr_unix/util_output.c:872
        #18 util_out_print (message=0x76ab1864 "Blocks coalesced    : !SL ", flush=1) at sr_unix/util_output.c:913
        #19 reorg_finish (dest_blk_id=6003, blks_processed=1, blks_killed=0, blks_reused=0, file_extended=0, lvls_reduced=0, blks_coalesced=0, blks_split=0, blks_swapped=0) at sr_port/mu_reorg.c:720
        #20 mu_reorg (gl_ptr=0x10bdca0, exclude_glist_ptr=0x7eeed5a8, resume=0x7eeed4c4, index_fill_factor=100, data_fill_factor=100, reorg_op=0) at sr_port/mu_reorg.c:556
        #21 mupip_reorg () at sr_port/mupip_reorg.c:283
        #22 mupip_main (argc=2, argv=0x7eef7914, envp=0x7eef7920) at sr_unix/mupip_main.c:122
        #23 dlopen_libyottadb (argc=2, argv=0x7eef7914, envp=0x7eef7920, main_func=0x115f4 "mupip_main") at sr_unix/dlopen_libyottadb.c:148
        #24 main (argc=2, argv=0x7eef7914, envp=0x7eef7920) at sr_unix/mupip.c:22
      
        (gdb) f 6
        #6  0x75df09f0 in sys_settimer (tid=1978083808, time_to_expir=0x7eeeb62c) at sr_unix/gt_timers.c:564
        564                     assert(WBTEST_ENABLED(WBTEST_SETITIMER_ERROR));
        (gdb) list
        559             assert(sys_timer.it_value.tv_sec || sys_timer.it_value.tv_nsec);
        560             sys_timer.it_interval.tv_sec = sys_timer.it_interval.tv_nsec = 0;
        561             if ((-1 == timer_settime(posix_timer_id, 0, &sys_timer, &old_sys_timer)) || WBTEST_ENABLED(WBTEST_SETITIMER_ERROR))
        562             {
        563                     save_errno = errno;
        564                     assert(WBTEST_ENABLED(WBTEST_SETITIMER_ERROR));
        565                     WBTEST_ONLY(WBTEST_SETITIMER_ERROR,
        566                             save_errno = EINVAL;
        567                     );
        568                     rts_error_csa(CSA_ARG(NULL) VARLSTCNT(8)
        569                                             ERR_SYSCALL, 5, RTS_ERROR_LITERAL("timer_settime()"), CALLFROM, save_errno);
      
        (gdb) p save_errno
        $1 = 22
        ```
      
        The fix is to remove the `sys_canc_timer()` call in `generic_signal_handler()` as it is not clear to me what
        purpose it serves. Later in exit handling (in `gtm_exit_handler()` etc.), we anyways do a call to
        `CANCEL_TIMERS` to cancel any active unsafe timers. This is a safer way of doing the `sys_canc_timer()`
        (as it blocks SIGALRM).
      
      * That said, as part of the code review @estess indicated that he remembered this as being necessary for some
        reason when we were about to dump a core due to a fatal signal (e.g. assert etc.). Therefore, I have
        added code to block SIGALRM only in that code path even though similar code also exists and would be invoked
        a little later in `sr_unix/gtm_fork_n_core.c`.
      0287e8c0
    • Narayanan Iyer's avatar
      Fix issue where trigger options that were not specified could be incorrectly loaded · 596f7e67
      Narayanan Iyer authored
      * In internal testing, we noticed the triggers/mupiptrigger subtest fail (only once) with the
        following diff.
      
        ```diff
        24c24
        < +^a(:2;5:) -commands=S -options=I -delim="~`" -pieces=1 -xecute="do ^tkilla"
        ---
        > +^a(:2;5:) -commands=S -options=I,C -delim="~`" -pieces=1 -xecute="do ^tkilla"
        ```
      
        The trigger definition that was provided as input was the following. It had only `isolation` in it.
      
        ```
        +^a(:2;5:) -command=s -xecute="do ^tkilla" -delim=$zchar(126)_$char(96) -piece=1 -options=isolation
        ```
      
        But after a `mupip trigger -triggerfile=` command loaded the above triggers, a `mupip trigger -select`
        showed the options as `I,C` (`I` stands for `isolation` and `C` stands for `consistency`) when we only
        expected to see `I`.
      
      * This is because of a day-one bug in the `process_options()` function in `sr_unix/trigger_parse.c` where
        we were not null terminating the user input string before passing it to `strtok_r()`. This meant
        `strtok_r()` could process more than what the user specified depending on whatever garbage memory
        existed at the end of the string.
      
      * This is actually a user-visible issue in the sense one can see the options specified did not get loaded
        correctly. But this is not a user-visible issue in the sense none of those options are currently
        implemented in the trigger logic. They are just placeholders for future implementation. Therefore no
        user should be really affected by this bug.
      
        Hence the decision to not create an issue for this on gitlab. Note that this issue exists even in the
        upstream repo (GT.M).
      596f7e67
  11. 11 Jun, 2020 2 commits
    • Brad Westhafer's avatar
      [#390] Fix sig-11 in $ZYHASH on ARM due to unaligned accesses · 47d3879f
      Brad Westhafer authored
      In our automated testing, the r130/ydb390 test has occassionally sig-11ed on 32 bit ARM machines. The sig-11 happened in the MurmurHash3_x64_128() call when it attempted to access a valid subscript of string->str.addr. This failure was likely to due to an unaligned access that isn't fully supported on ARM.  This change fixes this failure by defining UNALIGNED_ACCESS_FULLY_SUPPORTED when the architecture fully supports unaligned access but not on 32 bit ARM machines.
      47d3879f
    • Ganesh Mahesh's avatar
      [#569] New MUPIP LOAD option to ignore chset · 0f19703c
      Ganesh Mahesh authored
      * New option -ignorechset is added for MUPIP LOAD
      
      * When the option is used error check performed for CHSET missmatch is no longer done
      
      * CHSET missmatch check, for binary formatted file mupip load is removed in this change
      
      * GO formatted file mupip load originally did not have a mode missmatch check so nothing is done in this regard
      
      * Unused utf8 function parameter in go_load is removed
      0f19703c
  12. 07 Jun, 2020 1 commit
  13. 03 Jun, 2020 1 commit
    • Narayanan Iyer's avatar
      [#584] Errors in OPEN command should also identify the file name that encountered the error · 4e7a0338
      Narayanan Iyer authored
      * All error code paths in `sr_unix/io_open_try.c` (the central module that is invoked by an OPEN command)
        now issue a `DEVOPENFAIL` as the primary error. This message identifies the file name involved.
        A secondary error is additionally issued with the error detail (e.g. `Permission denied` etc.).
        Previously only the secondary error used to be issued which did not have the context that this is
        an error in the OPEN command.
      
      * A consequence of the above change is that in the case where one attempts an OPEN command on a directory,
        a `DEVOPENFAIL` primary error is issued followed by a `GTMEISDIR` secondary error both of which identify
        the file name (this was the only error code path when previously the file name used to be identified
        in `$ZSTATUS`). `GTMEISDIR` used to be the primary error before but is now the secondary error. This
        is considered an acceptable change even though it might seem backwards incompatible since this is an
        error scenario and it seems more consistent to identify all OPEN command errors with a DEVOPENFAIL error.
      4e7a0338
  14. 02 Jun, 2020 1 commit
  15. 28 May, 2020 2 commits
    • Narayanan Iyer's avatar
      Work around ipcs utility regression in 2.35.1 by filtering out duplicate lines · 51eff1cb
      Narayanan Iyer authored
      * Internal testing revealed a failure in the `rundown_argless/gtm7010` subtest where a `MUJPOOLRNDWNFL`
        message showed up lot more times than expected. Turns out that the system where this failure happened
        was running ipcs version 2.35.1 where we noticed duplicate lines in the `ipcs -m` or `ipcs -s` output.
        The number of lines per unique ipc varied from 1 to many dozens. This was first noticed early May 2020
        and continues to stay the same way end May 2020 so not sure when this will get fixed.
      
        In the meanwhile this is worked around by doing a `sort -u` on top of the output that ipcs returns.
      
        This fix was needed in the following places.
          - sr_unix/mu_rndwn_file.h
          - sr_unix/ydbenv.mpt
      51eff1cb
    • Narayanan Iyer's avatar
      Fix -Wreturn-local-addr and -Wstringop-overflow= linker warnings in YottaDB build using gcc 10.1.0 · 4a32c9a9
      Narayanan Iyer authored
      * Below are the warnings that showed up on gcc 10.1.0 at link time (not compile time).
      
        ```c
        sr_port/gvsub2str.c: In function 'gvsub2str':
        sr_port/gvsub2str.c:209:9: warning: function may return address of local variable [-Wreturn-local-addr]
          209 |  return targ;
              |         ^
        sr_port/gvsub2str.c:57:16: note: declared here
           57 |  unsigned char buf[MAX_KEY_SZ + 1], buf1[MAX_KEY_SZ + 1], ch, *ptr, trail_ch, *str, *targ, *targ_end;
              |                ^
        sr_port/mval2subsc.c: In function 'mval2subsc':
        sr_port/mval2subsc.c:267:13: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
          267 |  *out_ptr++ = cvt_table[mx];
              |             ^
        sr_port/gdsfhead.h:2869:16: note: at offset 1 to object 'base' with size 1 declared here
         2869 |  unsigned char base[1]; /* Base of the key */
              |                ^
        sr_port/mval2subsc.c:183:15: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
          183 |    *out_ptr++ = cvt_table[mt];
              |               ^
        sr_port/gdsfhead.h:2869:16: note: at offset 1 to object 'base' with size 1 declared here
         2869 |  unsigned char base[1]; /* Base of the key */
              |                ^
        sr_port/mval2subsc.c:177:15: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
          177 |    *out_ptr++ = cvt_table[mt * 10];
              |               ^
        sr_port/gdsfhead.h:2869:16: note: at offset 1 to object 'base' with size 1 declared here
         2869 |  unsigned char base[1]; /* Base of the key */
              |                ^
        sr_port/mval2subsc.c:183:15: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
          183 |    *out_ptr++ = cvt_table[mt];
              |               ^
        sr_port/gdsfhead.h:2869:16: note: at offset 1 to object 'base' with size 1 declared here
         2869 |  unsigned char base[1]; /* Base of the key */
              |                ^
        sr_port/mval2subsc.c:177:15: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
          177 |    *out_ptr++ = cvt_table[mt * 10];
              |               ^
        sr_port/gdsfhead.h:2869:16: note: at offset 1 to object 'base' with size 1 declared here
         2869 |  unsigned char base[1]; /* Base of the key */
              |                ^
        sr_port/dse_getki.c: In function 'dse_getki.constprop':
        sr_port/dse_getki.c:138:12: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
          138 |     *ptr++ = KEY_DELIMITER;
              |            ^
        sr_port/gdsfhead.h:2869:16: note: at offset 3 to object 'base' with size 1 declared here
         2869 |  unsigned char base[1]; /* Base of the key */
              |                ^
        sr_port/dse_getki.c:139:10: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
          139 |     *ptr = KEY_DELIMITER;
              |          ^
        sr_port/gdsfhead.h:2869:16: note: at offset 4 to object 'base' with size 1 declared here
         2869 |  unsigned char base[1]; /* Base of the key */
              |                ^
        ```
      
      * After some review of the code, I don't see any issues there. So this seems to be a case of
        gcc/ld/gold triggering an over-cautious warning when none was necessary.
      
      * It does not seem easy to fix the code to avoid the warning so these warning options are disabled
        for now. We might need to revisit this at a later point in time if it is found that these warnings
        no longer show up in a later version of gcc/ld/gold.
      4a32c9a9
  16. 23 May, 2020 2 commits
    • Brad Westhafer's avatar
      [#566] Add support for comments in Call-In table and External Calls · 1748dc2d
      Brad Westhafer authored
      This commit adds support for single line C-style comments in Call-In tables and External Calls tables. Anything that appears in a Call-In table or External Call table line after // is now treated as a comment.
      1748dc2d
    • K.S. Bhaskar's avatar
      [#579] Standard procstuckexec command · b6a5ef98
      K.S. Bhaskar authored
      Set ydb_procstuckexec to "$ydb_dist/yottadb -run %YDBPROCSTUCKEXEC" to use
      Creates output file in $ydb_log, $gtm_log, $ydb_tmp, $gtm_tmp, which ever is defined, in that order else /tmp
      containing:
      - command line of blocking pid
      - environment of blocking pid
      - current value of /proc/sys/kernel/yama/ptrace_scope
      - gdb snapshot of blocking pid
      Then it sends a USR1 to blocdking pid.
      Sourcing ydb_env_set sets ydb_procstuckexec and gtm_procstuckexec to $ydb_dist/yottadb -run %YDBPROCSTUCKEXEC
      b6a5ef98
  17. 22 May, 2020 2 commits
    • Narayanan Iyer's avatar
      [#492] Fix SIG-11 from $TRANSLATE using multi-byte string literals in UTF-8... · 724c73bc
      Narayanan Iyer authored
      [#492] Fix SIG-11 from $TRANSLATE using multi-byte string literals in UTF-8 mode if executed from shared library
      
      * The issue was that `activate_hashtab_in_buffer_int4()` (invoked from `op_fntranslate.c`) modified the
        header structure part of the hash table stored in the M object code as a literal (generated by
        `f_translate.c`). This is needed to set up pointers to make the hash table usable at runtime.
      
        While this is okay to do in M object code that is in process private read-write memory, it is not
        allowed if the M object code comes from a shared library (the process only has read-only access to it).
      
        Since a similar issue exists for M object code shared in autorelink shared memory, it was decided that
        it is better to keep the logic simple and separate out the hash table header structure into process
        private memory and use the hash table (minus the hash table header) from the M object code in all
        cases (including the case when M object code points to process private memory).
      
        This meant an interface change in the `activate_hashtab_in_buffer_int4()` function. It previously had
        an unused second parameter (which was passed in as NULL) so that was removed and instead a pointer to
        the separate hash table header structure is passed in from the caller.
      
        Associated interface changes were needed to other related (but otherwise undefined/unused) functions
        (e.g. `activate_hashtab_in_buffer_int8()` etc.).
      
        The function `ACTIVATE_HASHTAB_IN_BUFFER()` in `sr_port/hashtab_implementation.h` was changed to
        implement the new interface and fill in the second parameter with the hash table header structure.
      724c73bc
    • Joshua Nelson's avatar
      [DEBUG] Fix DBG_VERIFY_SOCK_IS_BLOCKING macro to check error return code from fcntl · 046b8af7
      Joshua Nelson authored
      Before, YDB (in debug mode only) would give an assertion failure when calling `DBG_VERIFY_SOCK_IS_BLOCKING` on a closed socket. Now, YDB instead will print an error message if the `fcntl` call fails.
      
      Previous behavior:
      ```
      rocto: gtm_tls_impl.c:1856: gtm_tls_socket_close: Assertion `0 == (O_NONBLOCK & flags)' failed.
      %YDB-F-KILLBYSIGSINFO1, YottaDB process 423 has been killed by a signal 6 at address
      0x00000000060CAE97 (vaddr 0x000003E8000001A7)
      ```
      
      New behavior:
      ```
      %YDB-E-GETSOCKOPTERR, fcntl on SOCKFD failed: Bad file descriptor
      ```
      046b8af7
  18. 21 May, 2020 3 commits
    • Narayanan Iyer's avatar
      [#578] $ZSTATUS after a LVUNDEF error should contain double quotes around string subscripts · fa70913c
      Narayanan Iyer authored
      * The function `format_key_mvals()` was copying mvals (corresponding to subscripts) as is without
        checking for numeric vs string vs $zysqlnull vs control characters. All this handling is already
        taken care of by `op_fnzwrite(()` so that is now invoked to do the conversion. This takes care
        of printing the lvn with proper subscripts in a LVUNDEF error.
      
        For example, below is a new LVUNDEF output.
      
        ```m
        YDB>write x(1,"abcd",$zysqlnull,$c(2,3))
        %YDB-E-LVUNDEF, Undefined local variable: x(1,"abcd",$ZYSQLNULL,$C(2,3))
        ```
      fa70913c
    • Narayanan Iyer's avatar
      [#580] timer_id parameter needs to be of type intptr_t (and not int) in... · d5166362
      Narayanan Iyer authored
      [#580] timer_id parameter needs to be of type intptr_t (and not int) in ydb_timer_cancel()/ydb_timer_cancel_t()/ydb_timer_start_t()
      
      * #428 (closed in r1.26) took care of fixing the `timer_id` variable type in `ydb_timer_start()` to be `intptr_t`.
        But similar changes are necessary for related timer functions `ydb_timer_cancel()`, `ydb_timer_cancel_t()` and
        `ydb_timer_start_t()`.
      d5166362
    • Narayanan Iyer's avatar
      Fix build failure with newer gold linkers due to duplicate definition of various global variables · 78b03d7d
      Narayanan Iyer authored
      * Various function prototypes in `sr_unix/ydb_tls_interface.h` and `sr_unix/ydbcrypt_interface.h`
        were being included by lots of C files. But each of these function prototypes expanded to use
        the corresponding function pointers.
      
        For example, the below function prototype
      
        ```c
        gtm_status_t	gtmcrypt_init(gtm_int_t flags);
        ```
      
        got expanded to the following prototype by the preprocessor
      
        ```c
        gtm_status_t	(*gtmcrypt_init_fnptr)(gtm_int_t flags);
        ```
      
        This ended up causing the function pointer variable `gtmcrypt_init_fnptr` to be defined as a global
        variable in the C file that included one of the above mentioned 2 header files. This resulted in
        multiple definitions of those function pointers.
      
        This is now fixed by prefixing the function prototype with a GBLREF which effectively translates to
        `extern`. That declares (but does not define) the global variable thereby avoiding the multiple
        definition issue.
      
      * Various global variables were defined using `GBLDEF` (global definition) in multiple programs. That
        is now fixed by converting all but one of those into a `GBLREF` (global declaration). In one case
        the global variable was unused so the definition was completely removed.
      78b03d7d
  19. 18 May, 2020 1 commit
    • Narayanan Iyer's avatar
      [#550] Fix an incorrect assert related to user callback function returning YDB_TP_RESTART · 38a7a8ae
      Narayanan Iyer authored
      * In case the user callback function from a nested TP transaction returns `YDB_TP_RESTART`,
        the TP depth (STored in the variable `dollar_tlevel`) could be greater than 1 when we
        eventually get to restarting the TP transaction. In this case, an assert that was added
        in a recent commit (164054b1) failed as it was incorrect.
      
        This is fixed by moving the assert into the code block where we know the return code
        is not `YDB_TP_RESTART`.
      38a7a8ae
  20. 15 May, 2020 1 commit
  21. 08 May, 2020 1 commit
    • Brad Westhafer's avatar
      [#520] Fix bug introduced that previous commit where setting $ztrap doesn't set $ztrap · 5f6e449d
      Brad Westhafer authored
      The commit merged by !720 introduced a new bug where setting $ztrap didn't set $ztrap if the memory address where the mval $ztrap is supposed to be set to does not overlap in memory with the frame_pointer. The fix undoes the original #520 commit and replaces it with a new one that stores a copy of the string that $ztrap is to be set to in a buffer on the stack and then copies the buffer into the string pool before setting $ztrap correctly.
      5f6e449d
  22. 04 May, 2020 1 commit
    • Narayanan Iyer's avatar
      [DEBUG-ONLY] Bump initial condition handler stack size to avoid nested malloc issues · cd9a5a21
      Narayanan Iyer authored
      * We had an in-house test failure on an ARMV6L box with the following diff.
      
        ```diff
         > ideminter_rolrec_0/mupipstop_rollback_or_recover/impjob_imptp0.mje5
         > %YDB-F-ASSERT, Assert failed in sr_port/gtm_malloc_src.h line 695 for expression (FALSE)
         ```
      
        Below is the C-stack at the time of the assert failure.
      
        ```gdb
        #0  __pthread_kill () at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
        #1  gtm_dump_core () at sr_unix/gtm_dump_core.c:74
        #2  ch_cond_core () at sr_unix/ch_cond_core.c:77
        #3  rts_error_va () at sr_unix/rts_error.c:192
        #4  rts_error_csa () at sr_unix/rts_error.c:99
        #5  gtm_malloc () at sr_port/gtm_malloc_src.h:695
        #6  condstk_expand () at sr_unix/condstk_expand.c:53
        #7  ydb_stm_invoke_deferred_signal_handler () at sr_unix/ydb_stm_invoke_deferred_signal_handler.c:59
        #8  deferred_signal_handler () at sr_port/deferred_signal_handler.c:57
        #9  gtm_malloc () at sr_port/gtm_malloc_src.h:748
        #10 iorm_use () at sr_unix/iorm_use.c:988
        #11 iorm_open () at sr_unix/iorm_open.c:254
        #12 io_open_try () at sr_unix/io_open_try.c:616
        #13 op_open () at sr_port/op_open.c:160
        #14 open_source_file () at sr_unix/source_file.c:253
        #15 compiler_startup () at sr_port/compiler_startup.c:130
        #16 compile_source_file () at sr_unix/source_file.c:173
        #17 op_zcompile () at sr_port/op_zcompile.c:57
        #18 gtm_trigger_complink () at sr_unix/gtm_trigger.c:451
        #19 gtm_trigger () at sr_unix/gtm_trigger.c:551
        #20 gvtr_match_n_invoke () at sr_unix/gv_trigger.c:1683
        #21 gvcst_put2 () at sr_port/gvcst_put.c:2806
        #22 gvcst_put () at sr_port/gvcst_put.c:299
        #23 op_gvput () at sr_port/op_gvput.c:79
        #24 ydb_set_s () at sr_unix/ydb_set_s.c:137
        #25 ydb_set_st () at sr_unix/ydb_set_st.c:42
        #26 _cgo_d187034042ca_Cfunc_ydb_set_st () at cgo-gcc-prolog:170
        #27 runtime.asmcgocall () at /usr/lib/go-1.11/src/runtime/asm_arm.s:617
        ```
      
      * The cause of the assert failure is a nested call to `gtm_malloc()` (frames 9 and 5 above).
        And the reason that nested call happened is because the initial allocation of the condition handler
        stack size of 5 was not enough when `sr_unix/ydb_stm_invoke_deferred_signal_handler.c` tried to
        do an ESTABLISH and add one more condition handler (at frame number 7). This is because the
        condition handler stack was already used up with the following handlers.
      
        ```gdb
        (gdb) p chnd[0].ch
        $14 = (void (*)()) 0xb62f4f70 <stop_image_conditional_core>
        (gdb) p chnd[1].ch
        $15 = (void (*)()) 0xb63b0f10 <ydb_simpleapi_ch>
        (gdb) p chnd[2].ch
        $16 = (void (*)()) 0xb67f1e0c <gtm_trigger_complink_ch>
        (gdb) p chnd[3].ch
        $17 = (void (*)()) 0xb69c45e8 <source_ch>
        (gdb) p chnd[4].ch
        $18 = (void (*)()) 0xb6ce211c <compiler_ch>
        ```
      
      * The initial condition handler stack size (controlled by the `CONDSTK_INITIAL_INCR` macro) is currently
        set to 5 (last changed from 2 to 5 as part of GT.M V6.3-000) for DEBUG builds and set to 8 for
        PRO/Release builds.
      
      * Due to YottaDB's use of SimpleAPI, this limit of 5 is clearly not enough (as shown by the above failure)
        so it is now being bumped to 8 for DEBUG and to 16 for PRO/Release builds (just to be safe).
      cd9a5a21
  23. 02 May, 2020 1 commit
    • Narayanan Iyer's avatar
      Fix ICUSYMNOTFOUND error message if ydb_icu_version env var is set to a long string (SS_LOG2LONG) · 7aef327f
      Narayanan Iyer authored
      * The `v53003/D9I10002703` subtest in the YDBTest repo failed in in-house testing with the following
        diff when it was testing the `ydb_icu_version` env var set to a huge 32Kb sized string.
      
        ```diff
        30c30
        < %YDB-E-ICUSYMNOTFOUND, ... ICU needs to be built with symbol-renaming disabled or ydb_icu_version environment variable needs to be properly specified
        ---
        > %YDB-E-ICUSYMNOTFOUND, ... ICU needs to be built with symbol-renaming disabled or ydb_icu_version/gtm_icu_version environment variable needs to be properly specified
        ```
      
        The test was expecting the ICUSYMNOTFOUND error to indicate just the `ydb_icu_version` env var name
        as that was the env var which had been set to an incorrect value.
      
        But instead the actual output was `ydb_icu_verison/gtm_icu_version` which was incorrect.
      
      * A previous commit (df110a89) fixed the ICUSYMNOTFOUND error message
        to be deterministic in case the `ydb_icu_version` and `gtm_icu_version` env vars were both not
        defined (i.e. SS_NOLOGNAM return from trans_log_name()).
      
        But if one of more of these env vars were set to a very huge value (i.e. SS_LOG2LONG return from
        trans_log_name()), the prior commit did not handle that case correctly. It still considered neither
        of these env vars as defined and gave an ICUSYMNOTFOUND message listing both the env var names
        whereas only one of the env var names should have been printed in that case.
      
        For example, if `ydb_icu_version` env var was set to a huge string (32Kb in length), then the
        ICUSYMNOTFOUND error message would print `ydb_icu_version/gtm_icu_version` instead of just
        `ydb_icu_version`. This is incorrect.
      
        This is now fixed by setting the `is_ydb_env_match_usable` env var to TRUE as long as the return
        from trans_log_name() is not SS_NOLOGNAM (previously it was incorrectly set to TRUE only for the
        SS_NORMAL return case).
      
        With these fixes, the `v53003/D9I10002703` subtest now passes.
      7aef327f
  24. 01 May, 2020 1 commit
    • Narayanan Iyer's avatar
      ICUSYMNOTFOUND error message displays deterministic message if... · df110a89
      Narayanan Iyer authored
      ICUSYMNOTFOUND error message displays deterministic message if ydb_icu_version/gtm_icu_version env vars are not defined
      
      * Previously, the ICUSYMNOTFOUND error message would randomly display `ydb_icu_version` or
        `gtm_icu_version` as the env var name in the error message text in case neither of those
        two env var names were defined and this error is about to be issued. This is because the
        check that decides which env var name to use was relying on a variable `is_ydb_env_match`
        that was not initialized in the case neither env vars were set by the user. This is now
        fixed by a new variable `is_ydb_env_match_usable` that is set to FALSE in the case neither
        env vars are defined. In this case, the ICUSYMNOTFOUND now prints BOTH env var names.
      
      * Below is the message when `ydb_icu_version` env var is defined.
      
        ```
        %YDB-E-ICUSYMNOTFOUND, Symbol u_getVersion not found in the ICU libraries. ICU needs to be built with symbol-renaming disabled or ydb_icu_version environment variable needs to be properly specified
        ```
      
      * Below is the message when `ydb_icu_version` env var is not defined but `gtm_icu_version`
        env var is defined.
      
        ```
        %YDB-E-ICUSYMNOTFOUND, Symbol u_getVersion not found in the ICU libraries. ICU needs to be built with symbol-renaming disabled or gtm_icu_version environment variable needs to be properly specified
        ```
      
      * And below is the message when neither `ydb_icu_version` nor `gtm_icu_version` env vars
        are defined.
      
        ```
        %YDB-E-ICUSYMNOTFOUND, Symbol u_getVersion not found in the ICU libraries. ICU needs to be built with symbol-renaming disabled or ydb_icu_version/gtm_icu_version environment variable needs to be properly specified
        ```
      df110a89
  25. 30 Apr, 2020 1 commit
    • Narayanan Iyer's avatar
      [#576] $PIECE in a database trigger returns incorrect results when invoked... · aa63e2f7
      Narayanan Iyer authored
      [#576] $PIECE in a database trigger returns incorrect results when invoked from SimpleAPI (e.g. ydb_set_s)
      
      Issue description
      ------------------
      This is an issue that was noticed Octo (see YottaDB/DBMS/YDBOcto!583 (comment 333919051) for details
      on the failed pipeline jobs). Octo uses the SimpleAPI to do database updates and has triggers installed
      for those updates. A $PIECE invocation inside the M trigger code returned incorrect results. Below
      are the details of that particular failure.
      
      The M code that was invoked was the following.
      
      ```m
      SET newValue=$PIECE($ZTVALUE,$ZTDELIM,$ZTUPDATE)
      ```
      
      And below is the value of the involved variables at the time of the above $PIECE invocation.
      
      ```m
      $ZTVALUE="LOTSOFCOLS|2200|16388|0|16385|0|16386|0|0|0|0|16389|1|0|p|r|3|0|0|1|0|0|0|0|0|1|d|0|571|1||||803051"
      $ZTDELIM="|"
      $ZTUPDATE=1
      ```
      
      We therefore expect newValue to be set to LOTSOFCOLS. But instead it is set to LOTSOF (this in turn
      results in an incorrect update to the ^%ydboctoxref global and in turn results in an incorrect output
      of a pg catalog query and the TC013 subtest validation routine catches this and fails the test).
      
      This failure is seen only in Release builds. Never in Debug builds. Even with Release builds, this
      failure happens only rarely and only on some servers so it is most likely an uninitialized variable
      issue in the YDB code base.
      
      Fix description
      ----------------
      The issue turned out to be an uninitialized `fnpc_indx` field of the `mval` structure that `ydb_set_s()`
      passes as the value to set in the global variable node. This could end up having an arbitrary value and
      if it happens to be a non-zero value, then $PIECE could return incorrect results.
      
      This failure is not easy to reproduce since it requires the uninitialized value to be set to a non-zero
      value which seems to be an exception rather than the rule (we did not see this in internal testing on
      lots of platforms). Therefore it is not possible to come up with a simple automated test for this issue
      other than the Octo test case that detected this issue. If that Octo test case passes with this fix, it is
      considered enough confirmation of this code fix.
      aa63e2f7
  26. 28 Apr, 2020 2 commits
    • Narayanan Iyer's avatar
      [#567] MUPIP INTRPT in the middle of a HANG in a FOR loop causes fails... · 81f90d26
      Narayanan Iyer authored
      [#567] MUPIP INTRPT in the middle of a HANG in a FOR loop causes fails debug-build-only assert in sr_port/op_hang.c line 154
      
      * The primary issue was that if the hang timeout coming into `op_hang()` was less than 1 millisecond, we
        would invoke `rel_quant()` to do a yield and return (no actual sleep for that small timeout). But in that
        case we would still check the global `outofband` (which is TRUE if say a MUPIP INTRPT got delivered
        before the `rel_quant()`) and if that is TRUE we will push an entry into the M stack to say the HANG
        has to be redone for the remaining time (which there is none in the `rel_quant()` case) which is 0 msec.
        When op_hang() gets invoked again after the outofband action gets handled, if the M code had a random hang
        time specified, the new hang invocation would come in with a `num` value that is different from the time
        stored in the M stack (from the previous op_hang() invocation). That would fail the following assert.
      
        ```c
        assert(ms == mv_zintcmd->mv_st_cont.mvs_zintcmd.ms);
        ```
      
      * An automated test case for the assert failure will be added to the `r130/ydb567` subtest.
      
      * While the above assert is the only user-visible issue that we encountered, I realize there is a potential
        issue even in the pro/release/non-debug builds where an interrupted HANG 0 can result in an arbitrary hang.
        This is because in the `0 == ms` case, if `outofband` is noticed at the end of `op_hang()` we would note
        down the `end_or_remain` time in the M-stack from the `end_time` local variable. But this variable is
        uninitialized in the `0 == ms` case. So we could end up with an arbitrary hang when the next invocation
        of `op_hang()` retrieves the `end_or_remain` time from the M-stack and uses that to complete the
        interrupted hang. But I was not able to reproduce this failure by trying lots of HANG 0 and interrupting
        it with a MUPIP INTRPT. It requires a much more heavyweight application doing this than a simple test
        case. Am going to record this possibility in the issue release note even though I have not been able to
        reproduce it with a simple test case.
      81f90d26
    • Narayanan Iyer's avatar
      [#550] Correctly handle YDB_TP_ROLLBACK user callback function return value... · 164054b1
      Narayanan Iyer authored
      [#550] Correctly handle YDB_TP_ROLLBACK user callback function return value from nested transactions in ydb_tp_s()/ydb_tp_st()
      
      Note: Whenever ydb_tp_s() is referenced below, both ydb_tp_s() and/or ydb_tp_st() are implied.
      
      * If a user callback function invoked from ydb_tp_s() returned YDB_TP_ROLLBACK, the C API took care of
        rolling back the entire TP transaction if the callback function was invoked from the outermost
        ydb_tp_s() call. But for nested TP calls (i.e. an inner call to ydb_tp_s() while we are already
        inside an outer ydb_tp_s() call), the C API did not do a rollback of the nested TP transaction
        in case of a YDB_TP_ROLLBACK return from the user callback function. This confused the $TLEVEL
        maintenance logic inside YottaDB that resulted in the process being inside a transaction even after
        the outermost ydb_tp_s() call returned. That is, none of the updates of any outer level ydb_tp_s()
        calls got committed either (since $TLEVEL was still > 1) as long as at least one inner level
        ydb_tp_s() call returned YDB_TP_ROLLBACK. This is the issue.
      
        The fix is to ensure a rollback is done (using `OP_TROLLBACK(-1)`) of the current TP level
        in case of a YDB_TP_ROLLBACK return.
      
        In addition, the rollback is now done even for a return value that is not YDB_OK or YDB_TP_RESTART
        or YDB_TP_ROLLBACK.
      
        These changes are in sr_unix/ydb_tp_s_common.c (for the SimpleAPI/ydb_tp_s() case) and
        sr_unix/ydb_tp_st.c (for the SimpleThreadAPI/ydb_tp_st() case).
      
      * sr_unix/ydb_tp_s.c : Cosmetic cleanup (removed unused variables).
      164054b1
  27. 27 Apr, 2020 1 commit
    • Narayanan Iyer's avatar
      [#388] Fix ydb_lock_s() and ydb_lock_incr_s() regression where they would... · ffc82358
      Narayanan Iyer authored
      [#388] Fix ydb_lock_s() and ydb_lock_incr_s() regression where they would return premature YDB_LOCK_TIMEOUT for timed locks
      
      * In a prior #388 commit 1ccb6b3e, the code in the function
        `op_lock2()` was moved to `op_lock2_common()` so it can be called from M (the new `op_lock2()`)
        and the SimpleAPI (`ydb_lock_s()` and `ydb_lock_incr_s()`). But setting the global variable
        `out_of_time` to FALSE was still kept in the new `op_lock2()` instead of being moved over to
        `op_lock2_common()`. This means that `ydb_lock_s()` or `ydb_lock_incr_s()` would no longer set
        `out_of_time` to FALSE at the start like it used to before the above mentioned commit. And that
        means that if `out_of_time` is set to TRUE when we come into `ydb_lock_s()` or `ydb_lock_incr_s(),
        they would incorrectly return a `YDB_LOCK_TIMEOUT` status for a timed lock even though the timeout
        has not expired because most of the timeout detection code in `op_lock2_common()` relies on the
        `out_of_time` global variable.
      
      * This issue was noticed while analyzing a failure in the under construction `test_createtable/TC013`
        bats test case as part of fixing YottaDB/DBMS/YDBOcto#439. Below is the C-stack that resulted in
        setting the global variable `out_of_time` to TRUE (reading an M source file as part of compiling
        a trigger) and in turn exposed the premature `YDB_LOCK_TIMEOUT` bug described in the previous bullet.
      
        ```
        #0  iorm_readfl () at sr_unix/iorm_readfl.c:507
        #1  op_readfl () at sr_port/op_readfl.c:80
        #2  read_source_file () at sr_unix/source_file.c:290
        #3  compiler_startup () at sr_port/compiler_startup.c:160
        #4  compile_source_file () at sr_unix/source_file.c:173
        #5  op_zcompile () at sr_port/op_zcompile.c:57
        #6  gtm_trigger_complink () at sr_unix/gtm_trigger.c:451
        #7  gtm_trigger () at sr_unix/gtm_trigger.c:551
        #8  gvtr_match_n_invoke () at sr_unix/gv_trigger.c:1683
        #9  gvcst_kill2 () at sr_port/gvcst_kill.c:422
        #10 gvcst_kill () at sr_port/gvcst_kill.c:154
        #11 op_gvzwithdraw () at sr_port/op_gvzwithdraw.c:78
        #12 ydb_delete_s () at sr_unix/ydb_delete_s.c:152
        #13 delete_table_from_pg_class () at src/delete_table_from_pg_class.c:63
        #14 run_query () at src/run_query.c:320
        #15 main () at src/octo.c:78
        ```
      
      * A comment is added to the `YDB_LITERAL_TO_BUFFER` macro to better describe how it is different from
        the almost similar `YDB_STRING_TO_BUFFER` macro (difference is for literals with null bytes embedded
        in them).
      ffc82358