Easier use of journal extract data for troubleshooting and forensics
Final Release Note
%YDBJNLF is new functionality to simplify analytics and forensics using journal files.
%YDBJNLF is available in r1.36 as field-test software. As we normally strive to ensure upward compatibility of functionality from release to release, the field-test designation indicates that, based on user input, we may improve the API in a way that is not upward compatible. Please use %YDBJNLF and give us feedback.
Overview
Since journal files capture database state changes, they are invaluable for troubleshooting and forensics.
While MUPIP JOURNAL EXTRACT provides that information in a human readable form, given the large volume of data in even a single journal file from a production system, automation facilitates analysis. The %YDBJNLF utility routine loads journal extracts into global variables, allowing software to answer questions such as which process(es) updated a certain global, in what sequence and when; that global variable updates a process made; etc.
Using the Octo SQL Engine makes it even easier to query journal records ingested by %YDBJNLF. This further enables those who are not software developers or system administrators, such as business analysts and auditors, to query and analyze the data. For example:
OCTO> select op,horologdate,horologtime,pid from YDBJNLFTYPE1 where gvref = '^x(5)' order by horologdate,horologtime;
OP|HOROLOGDATE|HOROLOGTIME|PID
SET|66385|59173|8937
KILL|66386|54903|20312
SET|66386|55166|20312
(3 rows)
OCTO>
Ingestion
INGEST^%YDBJNLF(jnlfile[,label]) uses MUPIP JOURNAL EXTRACT FORWARD SHOW=ALL FENCES=NONE DETAIL FULL NOVERIFY to extract journal file jnlfile into global variables as described below. Since troubleshooting and forensics may need damaged journal files to be ingested, %YDBJNLF uses the NOVERIFY option.
- If
label
is specified, it is used to identify the extract; otherwise the journal file namejnlfile
is the identifying label. - INGEST deletes any existing
^%ydbJNLF*(label)
global variables. Use a unique label for each call to INGEST if the journal file name is not unique, e.g., current generation journal files likeyottadb.mjl
, because they will be renamed when the journal file is switched, and ingesting a subsequent file with same name will replace previously ingested data. For prior generation journal files, e.g.,yottadb.mjl_2022341165323
using a label is less important as the file name is permanent. If you are ingesting a journal file from another machine, e.g., on a machine designated for forensics, then a label is required to distinguish journal files from different machines.
Global Variables
Global variables store both journal extract records as well as metadata. The general format of global variables used to store journal extract records is:
- a prefix of
^%ydbJNLF
and a suffix of the record type, as discussed below, e.g.,^%ydbJNLFTYPE1
; - a first subscript, which is the label;
- a second subscript, which is the offset in the journal file at which the record begins; and
- a third subscript, which is the size of the record, in bytes.
Since journal files are written sequentially, the offset states definitively whether, for example, a SET was executed before or after a KILL.
The data at each node is a "\"
separated record, as documented in MUPIP JOURNAL EXTRACT formats with the following changes to expedite analysis:
- The first piece, preceding the record, is the record type, KILL, PBLK, PINI, SET, etc.
- The $HOROLOG timestamp of the record is stored in two separate pieces, date and time.
-
node
fields in extract records are stored as two pieces, the global variable (e.g.,^x
) and the complete global reference (e.g.,^x(2)
). -
node=sarg
fields in extract records are stored as three pieces, the global variable, the global reference, and the value. So^x(2)=ABC
woudl be stored as…\^x\^x(2)\"ABC"
. - Extracts for SET and KILL type records are all stored in the same
^%ydbJNLFTYPE1
global variable, except that the KILL type records do not have the last piece, which stores the value for SET type records.
For example, the MUPIP JOURNAL EXTRACT format for SET records is:
time\tnum\chksum\pid\clntpid\token_seq\strm_num\strm_seq\updnum\nodeflags\node=sarg
^%ydbJNLFTYPE1 nodes store SET records as
op\horologdate\horologtime\tnum\chksum\pid\clntpid\token_seq\strm_num\strm_seq\updnum\nodeflags\gvname\gvref\nodeval
where op
is SET
.
Metadata for the journal extract data is stored in global variables as follows:
-
^%ydbJNLF
, with records:-
^%ydbJNLF(label,"extrfmt")
contains the format of the journal extract, e.g.,"YDBJDX09 UTF-8"
. -
^%ydbJNLF(label)
contains metadata about the extract, in semicolon (;
) separated pieces, with strings quoted. Fields of interest are:- The journal file name, e.g.,
"/tmp/test/r1.36_x86_64/g/yottadb.mjl"
. - The journal file format, e.g.,
"YDBJNL44"
. - The database file name, e.g.,
"/tmp/test/r1.36_x86_64/g/yottadb.dat"
. - The prior generation journal file name, if any, e.g.,
"/tmp/test/r1.36_x86_64/g/yottadb.mjl_2022277114311"
. - A subsequent journal file name if any. This field is typically blank, except for journal files involved in a MUPIP JOURNAL BACKWARD RECOVER/ROLLBACK.
- If before-image journaling is enabled, the string
"ENABLED"
. - The remaining fields are primarily of interest to system administrators. Consult your YottaDB support channel or the source code for more information.
- The journal file name, e.g.,
-
-
^%ydbJNLFACTIVE(label,counter)
has records of processes that had the journal file open at the time of the EXTRACT, where counter is simply an incrementing integer. Semicolon separated pieces are as follows:- The process pid.
- The node name of the computer, e.g.,
mylaptop
. - The user name of the process, e.g.,
ydbuser
. - The terminal or pseudo-terminal of the session, if any, e.g.,
3
. - The $HOROLOG date when the process opened the journal file, e.g.,
66386
. - The $HOROLOG time of day when the process opened the journal file, e.g.,
42207
.
-
^%ydbJNLFCOMPLETE(label,counter)
has records of processes that previously had the journal file open, but no longer do. The fields are the same as for^%ydbJNLFACTIVE
, except that the pid may not be unique, in the event the operating system recycled pids while the journal file was active. -
^%ydbJNLFOPCOUNT(label,op)
has count of each opcode, e.g., if label is"TEST1"
, the node^%ydbJNLFOPCOUNT("TEST1","SET")
reports the number of SET records ingested. -
Since the number of record formats is smaller than the number of opcodes, e.g., all SET and KILL opcodes use the same record type,
^%ydbJNLFRECTBL(label,op)
specifies the record type for each opode. For example,^%ydbJNLFRECTBL("TEST1","SET")="TYPE1"
means that global variables nodes that store ingested SET records are^%ydbJNLFTYPE1("TEST1",offset,recsize)
.
Octo DDL
OCTODDL^%YDBJNLF([rectype])
outputs CREATE TABLE statements in a format suitable for Octo, which then allows the journal file data identified by label to be queried through Octo using SQL.
If rectype is omitted, OCTODDL, outputs CREATE TABLE statements for all record types and all metadata, e.g.,
YDB>do OCTODDL^%YDBJNLF
DROP TABLE IF EXISTS YDBJNLF KEEPDATA;
CREATE TABLE YDBJNLF -- Metadata for ingested journal files
(label VARCHAR,
jnlfilename VARCHAR, -- Journal file name
…
openprocjpvtime INTEGER,
PRIMARY KEY (label))
Delim ";"
Global "^%ydbJNLF";
…
DROP TABLE IF EXISTS YDBJNLFRECTYPE KEEPDATA;
CREATE TABLE YDBJNLFRECTYPE -- Table for record types
(label VARCHAR,
rectype VARCHAR,
tbl VARCHAR,
PRIMARY KEY (label, rectype))
GLOBAL "^%ydbJNLFRECTBL";
YDB>
If rectype
is one or more opcodes, OCTODDL outputs the CREATE TABLE statements for those record types, e.g.,
YDB>do OCTODDL^%YDBJNLF("SET,AIMG,KILL")
DROP TABLE IF EXISTS YDBJNLFTYPE1 KEEPDATA;
CREATE TABLE YDBJNLFTYPE1 -- FKILL,FSET,FZKILL,GKILL,GSET,GZKILL,KILL,SET,TKILL,TSET,TZKILL,TZTRIG,UKILL,USET,UZKILL,UZTRIG,ZKILL,ZTRIG
(label VARCHAR,
offset INTEGER,
recsize INTEGER,
op VARCHAR,
horologdate INTEGER,
horologtime INTEGER,
tnum INTEGER,
chksum INTEGER,
pid INTEGER,
clntpid INTEGER,
token_seq INTEGER,
strm_num INTEGER,
strm_seq INTEGER,
updnum INTEGER,
nodeflags INTEGER,
gvname VARCHAR,
gvref VARCHAR,
nodeval VARCHAR,
PRIMARY KEY (label, offset, recsize))
Delim "\"
Global "^%ydbJNLFTYPE1";
DROP TABLE IF EXISTS YDBJNLFTYPE5 KEEPDATA;
CREATE TABLE YDBJNLFTYPE5 -- AIMG,PBLK
(label VARCHAR,
offset INTEGER,
recsize INTEGER,
op VARCHAR,
horologdate INTEGER,
horologtime INTEGER,
tnum INTEGER,
chksum INTEGER,
pid INTEGER,
clntpid INTEGER,
blknum INTEGER,
bsiz INTEGER,
blkhdrtn INTEGER,
ondskbver INTEGER,
dsecmdline VARCHAR,
PRIMARY KEY (label, offset, recsize))
Delim "\"
Global "^%ydbJNLFTYPE5";
YDB>
Purging Data
PURGE^%YDBJNLF(label)
purges all ingested %YDBJNLF data with the specified label
. If label
is omitted, it purges all ingested %YDBJNLF data.
Operational
If a YDBJNLF region does not exist in the global directory, INGEST creates an AutoDB region that uses the MM access method, mapped to database file %ydbjnlf.dat
in the same directory as the DEFAULT region ($ydb_dir/$ydb_rel/g
if the global directory was created by sourcing ydb_env_set
). Global variables of the form ^%ydbJNLF
with all case combinations of JNLF
are mapped to the YDBJNLF region.
To avoid impacting response times in production environments, we recommend analyzing elsewhere the large journal files generated by production environments.
Although we have attempted to make INGEST as efficient as we can, ingesting large journal files is inherently not a fast operation and the time taken will be comparable to the time taken for the underlying MUPIP JOURNAL EXTRACT operation.
When ingesting journal files on systems where the corresponding database file is not present:
- If the database file does not use custom collation, set the ydb_extract_nocol environment variable.
- If the database file does use custom collation, create an empty database file with the same filename as the original database, and with the same custom collation.
Description
Draft Release Note
%YDBJNLF is new functionality to simplify analytics and forensics based on journal files.
%YDBJNLF is made available in r1.36 as field-test software. As we normally strive to ensure upward compatibility of our releases, the field-test designation indicates that we may improve the API in a way that is not upward compatible, based on user input. Please use %YDBJNLF and give us feedback.
Overview
Since journal files capture database state changes, they are invaluable for troubleshooting and forensics.
While MUPIP JOURNAL EXTRACT provides that information in a human-readable form, the large volume of data in even a single journal file from a production system speaks to the utility of a database to analyze the data. The %YDBJNLF utility routine loads journal extracts into global variables from where state changes can be analyzed by software to answer questions such as which process(es) updated a certain global variable, in what sequence, and when; what global variable updates a process made; etc.
While the Octo SQL Engine is not required to analyze the data, it can be used to query the journal records ingested by %YDBJNLF. This enables those who are not software developers or system administrators, such as business analysts and auditors, to analyze the data. For example:
OCTO> select op,horologdate,horologtime,pid from YDBJNLFTYPE1 where gvref = '^x(5)' order by horologdate,horologtime;
OP|HOROLOGDATE|HOROLOGTIME|PID
SET|66385|59173|8937
KILL|66386|54903|20312
SET|66386|55166|20312
(3 rows)
OCTO>
Ingestion
INGEST^%YDBJNLF(jnlfile,[label])
uses MUPIP JOURNAL EXTRACT FORWARD SHOW=ALL FENCES=NONE DETAIL FULL to extract the journal file jnlfile
into global variables as described below. INGEST uses the NOVERIFY option of MUPIP JOURNAL EXTRACT, since troubleshooting and forensics may require damaged journal files to be ingested.
- If
label
is specified, it is used to identify the extract; otherwise the journal file namejnlfile
is the identifying label. - INGEST deletes all existing
^%ydbJNLF*(label)
global variables. Use a uniquelabel
for each call to INGEST if the journal file name is not unique. As prior generation journal files have unique names with timestamps, they can be safely loaded without specifying alabel
. However, as current journal files for a region have the same name, ingesting them should havelabel
specified to avoid replacing existing data unless the exact same file is being ingested at a later time. Since journal files from different instances can have the same name, on systems where data from different systems may be analyzed, you should always specifylabel
.
Global Variables
Global variables store both journal extract records as well as metadata. The general format of global variables used to store journal extract records is:
- a prefix of
^%ydbJNLF
and a suffix of the record type as specified below, e.g.,^%ydbJNLFTYPE1
; - a first subscript, which is the
label
; - a second subscript, which is the offset in the journal file in which that record begins; and
- a third subscript which is the record size of the record.
Since journal files are written sequentially, the offset states definitively whether, for example, a SET happened before or after a KILL.
The data at each node is a "\"
separated record as documented in MUPIP JOURNAL EXTRACT formats for DETAIL journal extracts, with the following changes to expedite analysis:
- The first piece, preceding the record, is the record type, KILL, PBLK, PINI, SET, etc.
- The timestamp in $HOROLOG format is stored as separate pieces, the date and time.
-
node
fields in extract records are stored as two pieces, just the global variable (e.g.,^x
), and the global reference, e.g.,^x(2)
. -
node=sarg
fields in extract records are stored as three pieces, the global variable, the global reference, and the value. So^x(2)="ABC"
would be stored as…\^x\^x(2)\""ABC""
. - Extracts for SET and KILL type records are all stored in the same
^%ydbJNLFTYPE1
global variable, except that the KILL type records do not have that last piece, which stores the value for SET type records.
For example, the MUPIP JOURNAL EXTRACT format for SET records is:
time\tnum\chksum\pid\clntpid\token_seq\strm_num\strm_seq\updnum\nodeflags\node=sarg
^%ydbJNLFTYPE1
nodes store SET records as
op\horologdate\horologtime\tnum\chksum\pid\clntpid\token_seq\strm_num\strm_seq\updnum\nodeflags\gvname\gvref\nodeval
where op
is "SET"
.
Metadata for the journal extract data is stored in global variables as follows:
-
^%ydbJNLF
, with records:-
^%ydbJNLF(label,"extrfmt")
contains the format of the journal extract, e.g.,"YDBJDX09 UTF-8"
-
^%ydbJNLF(label)
contains metadata about the extract, in semicolon separated pieces, with strings quoted. Fields of interest are:- The journal file name, e.g.
"/tmp/test/r1.35_x86_64/g/yottadb.mjl"
. - The journal file format, e.g.,
"YDBJNL44"
. - The database file name, e.g.,
"/tmp/test/r1.35_x86_64/g/yottadb.dat"
. - The prior generation journal file name, if any, e.g.,
"/tmp/test/r1.35_x86_64/g/yottadb.mjl_2022277114311"
. - The subsequent journal file name if any. In this example, where the journal file is the current journal file, there would be no subsequent journal file.
- If before-image journaling is enabled, the string
"ENABLED"
. - The remaining fields are primarily of interest to system administrators. Contact your YottaDB support channel or the source code for more information.
- The journal file name, e.g.
-
-
^%ydbJNLFACTIVE(label,seqno)
has records of processes that had the journal file open at the time of the EXTRACT, whereseqno
is simply an incrementing integer. Semicolon separated pieces are as follows:- The process pid.
- The node name of the computer, e.g.,
mylaptop
. - The user name of the process, e.g.,
ydbuser
. - The terminal or pseudo-terminal of the session, if any, e.g.,
3
. - The $HOROLOG date when the process opened the journal file, e.g.,
66386
. - The $HOROLOG time of day when the process opened the journal file, e.g.,
42207
.
-
^%ydbJNLFCOMPLETE(label,seqno)
has records of processes that previously had the journal file open, but no longer do. The fields are the same as for^%ydbJNLFACTIVE
, except that the pid may not be unique, in the event the operating system recycled pids while the journal file was active. -
^%ydbJNLFOPCOUNT(label,op)
has count of each opcode, e.g., iflabel
is"TEST1"
, the node^%ydbJNLFOPCOUNT("TEST1","SET")
reports the number of SET records ingested. -
Since the number of record formats is smaller than the number of opcodes, e.g., all SET and KILL opcodes use the same record type,
^%ydbJNLFRECTBL(label,op)
specifies the record type for each opode. For example,^%ydbJNLFRECTBL("TEST1","SET")="TYPE1"
means that global variables nodes that store ingested SET records are^%ydbJNLFTYPE("TEST1",offset,recsize)
.
Octo DDL
OCTODDL^%YDBJNLF
outputs CREATE TABLE statements in a format suitable for Octo, which then allows the journal file data identified by label
to be queried through Octo using SQL, e.g.,
YDB>do OCTODDL^%YDBJNLF
DROP TABLE IF EXISTS YDBJNLF KEEPDATA;
CREATE TABLE YDBJNLF -- Metadata for ingested journal files
(label VARCHAR,
jnlfilename VARCHAR, -- Journal file name
…
openprocjpvtime INTEGER,
PRIMARY KEY (label))
Delim ";"
Global "^%ydbJNLF";
…
DROP TABLE IF EXISTS YDBJNLFRECTYPE KEEPDATA;
CREATE TABLE YDBJNLFRECTYPE -- Table for record types
(label VARCHAR,
rectype VARCHAR,
tbl VARCHAR,
PRIMARY KEY (label, rectype))
GLOBAL "^%ydbJNLFRECTBL";
YDB>
Purging data
PURGE^%YDBJNLF(label)
purges all ingested %YDBJNLF data with the specified label
. If label is omitted, it purges all ingested %YDBJNLF data.
Operational
If a YDBJNLF region does not exist, INGEST creates a YDBJNLF AutoDB region similar to the YDBAIM region, mapped to a %ydbjnlf.dat
database file in the same directory as the DEFAULT region, with ^%ydbjnlf*
global variables with all case combinations of jnlf
mapped to the region. Since data in the YDBJNLF region is entirely recoverable from the original data, the default access method for the unjournaled YDBJNLF region is MM, matching the default for the YDBAIM region. INGEST loads the updated global directory, and if called as a function, returns 1 if it did so, and 0 otherwise.
If INGEST creates a new YDBJNLF region, although the process that executes INGEST will update its copy of the global directory, other processes, for example a Rocto process, accessing the database using that global directory will need to be restarted.
To avoid impacting response times in production environments, we recommend analyzing the large journal files generated by production environments elsewhere.
Although we have attempted to make INGEST as efficient as we can, ingesting large journal files is inherently not a fast operation and the time taken will be comparable to the time taken foe the underlying MUPIP JOURNAL EXTRACT operation.