TPRESTART syslog messages identify correct global name and message type
Final Release Note
Type 0 and type 3 TPRESTART messages in the syslog (enabled by turning on TP restart logging) correctly report the global variable causing the restart. Furthermore, type 3 messages correctly identify themselves as type 3 messages. Previously, these messages could report an incorrect global variable reference, and type 3 messages were sometimes incorrectly identified as type 2. (#207)
Description
Below is a test case that demonstrates the issue.
> setenv ydb_gbldir mumps.gld
> rm -f mumps.gld
> $ydb_dist/mumps -run GDE @tmp.com
> rm *.dat
> $ydb_dist/mupip create
> setenv gtm_tprestart_log_delta 1
> $ydb_dist/mumps -run x
If you watch the syslog while the above command runs, you will see both good and bad messages like the ones below.
Good message --> ^a is the global name and the database is a.dat, which match:
Apr 24 11:01:57 xxxxx YDB-MUMPS[2634]: %YDB-I-TPRESTART, Database tmp/a.dat; code: L; blk: 0x00000003 in glbl: ^a(1979); pvtmods: 0, blkmods: 1, blklvl: 0, type: 4, readset: 1, writeset: 0, local_tn: 0x0000000000000294, zpos: child+4^x
Bad message --> ^b is the global name and the database is a.dat, which do not match:
Apr 24 11:01:57 xxxxx YDB-MUMPS[2638]: %YDB-I-TPRESTART, Database tmp/a.dat; code: LL; blk: 0x00000003 in glbl: ^b(1981); pvtmods: 1, blkmods: 1, blklvl: 0, type: 3, readset: 2, writeset: 1, local_tn: 0x000000000000029A, zpos: child+5^x
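To make the mismatch easier to spot, the database file, global name, and type can be pulled out of each TPRESTART line. The following is a minimal POSIX sh sketch, not part of the original test case. The function name `tprestart_fields` is hypothetical, and the pattern assumes the message layout shown in the two samples above.

```shell
# Hypothetical helper: read syslog lines on stdin and print
# "database global type" for every TPRESTART message found.
# Assumes the field layout of the sample messages in this issue.
tprestart_fields() {
    sed -n 's/.*%YDB-I-TPRESTART, Database \([^;]*\);.* in glbl: \(\^[A-Za-z%][A-Za-z0-9]*\).*type: \([0-9]*\),.*/\1 \2 \3/p'
}

# Sample input: the "bad" message quoted above.
msg='Apr 24 11:01:57 xxxxx YDB-MUMPS[2638]: %YDB-I-TPRESTART, Database tmp/a.dat; code: LL; blk: 0x00000003 in glbl: ^b(1981); pvtmods: 1, blkmods: 1, blklvl: 0, type: 3, readset: 2, writeset: 1, local_tn: 0x000000000000029A, zpos: child+5^x'
printf '%s\n' "$msg" | tprestart_fields
# prints: tmp/a.dat ^b 3
```

With the global directory built from tmp.com, ^b maps to the b database file, so a line that pairs tmp/a.dat with ^b is an instance of the misreported global name.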
Below are the files needed to run the above test.
> cat tmp.com
change -segment DEFAULT -file=mumps.dat
add -name a -region=areg
add -region areg -dyn=aseg
add -segment aseg -file=a
add -name b -region=breg
add -region breg -dyn=bseg
add -segment bseg -file=b
add -name c -region=creg
add -region creg -dyn=cseg
add -segment cseg -file=c
> cat x.m
x
    set ^stop=0,njobs=5
    for i=1:1:njobs do
    . set jobstr="job child^x:(output=""child_x.mjo"_i_""":error=""child_x.mje"_i_""")"
    . xecute jobstr
    . set job(i)=$zjob
    ;
    ; let child run for 10 seconds
    hang 10
    ;
    ; signal child processes to stop
    set ^stop=1
    ;
    ; wait for child processes to die
    for i=1:1:njobs set pid=job(i) for  quit:'$zgetjpi(pid,"ISPROCALIVE")  hang 0.1
    quit
child ;
    for i=1:1:1000 do
    . tstart ():serial
    . set x=$incr(^c)
    . set ^a(x)=1,^b(x)=2
    . tcommit
    quit
Draft Release Note
The TPRESTART message (which is recorded in the syslog when restarts happen in a TP transaction and the environment variable ydb_tprestart_log_delta/gtm_tprestart_log_delta is set to a non-zero value) indicates the correct global name that was involved in the restart. In prior versions, the reported global name was sometimes incorrect; this was only possible when the TPRESTART syslog message had a "type:" of 0 or 3.
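When reviewing syslog output captured on a prior version, it may help to isolate the messages whose global name could be wrong. The following is a minimal POSIX sh sketch, assuming the "type: N," field format seen in the sample messages. The function name `affected_tprestart` is hypothetical, and the two input lines are abbreviated stand-ins rather than real log output.

```shell
# Hypothetical filter: keep only TPRESTART lines reporting type 0 or 3,
# the two types whose global name could be wrong in prior versions.
affected_tprestart() {
    grep -E '%YDB-I-TPRESTART.*type: (0|3),'
}

# Abbreviated stand-in lines; only the second has an affected type.
printf '%s\n' \
    '%YDB-I-TPRESTART, Database tmp/a.dat; glbl: ^a(1979); type: 4, readset: 1' \
    '%YDB-I-TPRESTART, Database tmp/a.dat; glbl: ^b(1981); type: 3, readset: 2' \
    | affected_tprestart
```

The filter reads from standard input so that it does not depend on where a given system stores its syslog.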