YDBSyslog – Capture Syslog data in a Database for Analytics, Troubleshooting and Forensics
YDBSyslog is a tool to capture syslog data in a YottaDB database using the journalctl --output=export
format. It operates in two modes:
- Running journalctl in a PIPE device. With the optional --follow option, YDBSyslog continuously monitors journalctl output and captures it in real time.
- Reading a journalctl export from stdin. Reading from journalctl --output=export --follow in a pipe is effectively the same as reading from a PIPE device with the --follow option.
YDBSyslog can output a DDL which can be fed to Octo, allowing the syslog to be queried using SQL.
Usage
yottadb -run %YDBSYSLOG op [options]
op [options] are:
- help - Output options to use this program.
- ingestjnlctlcmd [options] - Run the journalctl --output=export command in a PIPE. Options are as follows; all options may be omitted.
  - --boot [value] - --boot is mutually exclusive with --follow. There are several cases:
    - If value is omitted, the --boot parameter is omitted when invoking journalctl.
    - If value is a hex string prefixed with 0x, the string sans prefix is passed to journalctl --boot.
    - If a decimal number, it is passed unaltered to journalctl --boot.
    - If a case-independent all, that option is passed to journalctl --boot.
  - --follow is mutually exclusive with --boot. The --follow option is used to invoke journalctl --follow, and results in a continuous capture in the database of the syslog exported by journalctl.
  - --moreopt indicates that the rest of the command line should be passed verbatim to the journalctl command as additional options. See the Linux command man journalctl for details. YDBSyslog does no error checking of these additional options.
- ingestjnlctlfile - Read journalctl --output=export formatted data from stdin.
- octoddl - Output an Octo DDL to allow analysis of syslog data using SQL. Note that if the database combines syslog data from multiple systems, Octo SQL queries can span systems.
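
The --boot and --follow rules described above can be sketched in a few lines. The Python below is only an illustration of the documented cases, not the M code in _YDBSYSLOG.m. The names boot_option and journalctl_command are hypothetical, and the --boot=value spelling, the lowercasing of all, the acceptance of a signed decimal, and the use of shlex to split the extra options are all assumptions made for the sketch.

```python
import re
import shlex


def boot_option(value=None):
    """Translate a boot value into journalctl arguments (hypothetical helper)."""
    if value is None or value == "":
        # Unspecified or empty: leave --boot off the journalctl command.
        return []
    text = str(value)
    if text.lower() == "all":
        # Case-independent "all"; lowercasing it here is an assumption.
        return ["--boot=all"]
    if re.fullmatch(r"0[xX][0-9a-fA-F]+", text):
        # Hex string: drop the 0x prefix and pass the rest.
        return ["--boot=" + text[2:]]
    if re.fullmatch(r"[+-]?[0-9]+", text):
        # Decimal number: passed unaltered (accepting a sign is an assumption).
        return ["--boot=" + text]
    raise ValueError(f"unsupported boot value: {value!r}")


def journalctl_command(boot=None, follow=False, moreopt=""):
    """Build an argument list for journalctl --output=export (hypothetical helper)."""
    if follow and boot not in (None, ""):
        # The two options are documented as mutually exclusive.
        raise ValueError("boot and follow are mutually exclusive")
    cmd = ["journalctl", "--output=export"]
    cmd += boot_option(boot)
    if follow:
        cmd.append("--follow")
    # Extra options are appended without any validation.
    cmd += shlex.split(moreopt)
    return cmd
```

For instance, journalctl_command(boot="0x" followed by a boot ID) yields an argument list whose --boot value is the ID without the prefix, and passing both boot and follow raises an error instead of running journalctl.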
The following M entryrefs can be called directly from applications written in M and other programming languages that support calls to M.
- INGESTJNLCTLCMD(boot,follow,moreopt) runs journalctl --output=export in a PIPE device. Parameters are:
  - boot is the parameter for the --boot command line option of journalctl. There are several cases:
    - If unspecified or the empty string, the --boot option is omitted.
    - If a hex string prefixed with "0x", the string sans prefix is passed to journalctl as the value.
    - If a decimal number, it is passed unaltered to journalctl.
    - If a case-independent "all", that option is passed to journalctl.
  - If follow is non-zero, INGESTJNLCTLCMD follows journalctl, continuously logging syslog output in the database. boot and follow are mutually exclusive.
  - moreopt is a string intended to be passed verbatim to the journalctl command. See the Linux command man journalctl for details. INGESTJNLCTLCMD does no error checking of these additional options.
- INGESTJNLCTLFILE reads journalctl --output=export formatted data from stdin.
- OCTODDL generates the DDL that can be fed to Octo to query the ingested syslog data using SQL.
Data are stored in nodes of ^%ydbSYSLOG with the following subscripts, which are reverse engineered from the __CURSOR field of the journalctl export format. While __CURSOR is designated as opaque, reverse engineering provides a more compact database and faster access:
- Cs - a UUID for a large number of syslog records.
- Cb - evidently a boot UUID.
- Ci - evidently the record number in a syslog.
- Ct - evidently the number of microseconds since the UNIX epoch.
- Cm - evidently a monotonic timestamp since boot.
- Cx - a UUID that is unique to each syslog entry.
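
As a rough illustration of how such subscripts could be pulled out of a cursor, the Python sketch below splits a __CURSOR value on its semicolons. It is not the M code that YDBSyslog uses. It assumes that the subscripts Cs, Cb, Ci, Ct, Cm and Cx correspond to the s, b, i, t, m and x keys of the cursor, and that the i, t and m values are hexadecimal. The function names and the sample cursor are illustrative only.

```python
from datetime import datetime, timezone

# Illustrative cursor value; real ones come from the __CURSOR export field.
SAMPLE_CURSOR = (
    "s=739ad463348b4ceca5a9e69c95a3c93f;i=4ece7;"
    "b=6c7c6013a26343b29e964691ff25d04c;m=4fc72436e;"
    "t=4c508a72423d9;x=d3e5610681098c10"
)

CURSOR_KEYS = ("s", "b", "i", "t", "m", "x")


def parse_cursor(cursor):
    """Split a cursor string into a dict keyed by its one-letter tags."""
    parts = {}
    for item in cursor.split(";"):
        key, sep, value = item.partition("=")
        if sep:
            parts[key] = value
    return parts


def cursor_subscripts(cursor):
    """Return (Cs, Cb, Ci, Ct, Cm, Cx), assuming the key mapping described above."""
    parts = parse_cursor(cursor)
    missing = [key for key in CURSOR_KEYS if key not in parts]
    if missing:
        raise ValueError("cursor lacks keys: " + ",".join(missing))
    return (
        parts["s"],
        parts["b"],
        int(parts["i"], 16),  # record number, assumed hexadecimal
        int(parts["t"], 16),  # microseconds since the UNIX epoch
        int(parts["m"], 16),  # microseconds since boot
        parts["x"],
    )


def realtime_utc(cursor):
    """Convert the assumed Ct value to a timezone-aware UTC datetime."""
    microseconds = cursor_subscripts(cursor)[3]
    return datetime.fromtimestamp(microseconds / 1_000_000, tz=timezone.utc)
```

Because the cursor is officially opaque, a parser like this should fail loudly on an unexpected layout, as cursor_subscripts does when a key is missing.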
Fields that journalctl has been found to flag as binary, e.g., "MESSAGE" and "SYSLOG_RAW", have an additional, sixth, subscript, the tag for the field.
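
To show what a binary-flagged field looks like on the wire, here is a minimal Python sketch of a reader for the journalctl --output=export stream. It follows the layout given in the systemd Journal Export Format documentation: a text field is NAME=value on one line, a binary field is the name on its own line followed by a 64-bit little-endian size, the data, and a newline, and a blank line ends each entry. This is not how _YDBSYSLOG.m is written. The function name and sample data are hypothetical, and the sketch keeps only the last value when a field repeats within an entry.

```python
import io
import struct

# Two hand-built entries; the first carries MESSAGE in the binary form.
SAMPLE_EXPORT = (
    b"__CURSOR=s=abc;i=1\n"
    b"_HOSTNAME=mylaptop\n"
    b"MESSAGE\n" + struct.pack("<Q", 7) + b"foo\nbar" + b"\n"
    b"\n"
    b"_HOSTNAME=myserver\n"
    b"MESSAGE=hello\n"
    b"\n"
)


def read_export_entries(stream):
    """Yield one dict of {field name: value}, both as bytes, per journal entry."""
    entry = {}
    while True:
        line = stream.readline()
        if not line:
            break  # end of input
        if line == b"\n":
            # A blank line closes the current entry.
            if entry:
                yield entry
                entry = {}
            continue
        if line.endswith(b"\n"):
            line = line[:-1]
        name, sep, value = line.partition(b"=")
        if sep:
            # Text form: NAME=value on a single line.
            entry[name] = value
        else:
            # Binary form: size prefix, raw data, then a trailing newline.
            (size,) = struct.unpack("<Q", stream.read(8))
            entry[name] = stream.read(size)
            stream.read(1)
    if entry:
        yield entry


if __name__ == "__main__":
    for fields in read_export_entries(io.BytesIO(SAMPLE_EXPORT)):
        print(sorted(fields))
```

To try it on live data, the same function can be given sys.stdin.buffer with journalctl --output=export piped into the script. Truncated input is not handled in this sketch.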
Note that since querying syslog entries is content based (e.g., the USER_ID field) and not by the subscripts, if the reverse engineering of __CURSOR is imperfect, or if a future systemd-journald changes the fields, it will not affect the correctness of queries; it will only affect database size and consequently access speed (smaller databases are faster).
The numerous fields exported by journalctl are not well documented. Systemd Journal Export Formats is helpful, as is man systemd.journal-fields. However, outside the source code, there does not appear to be a comprehensive list of all fields. The fields listed in the _YDBSYSLOG.m source code were captured from a couple dozen Linux systems running releases and derivatives of Arch Linux, Debian GNU/Linux, Red Hat Enterprise Linux, SUSE Linux Enterprise, and Ubuntu. Even if journalctl exports additional fields not identified, %YDBSYSLOG captures them, and generates reasonable DDL entries for them.
Should you find additional entries not identified by the _YDBSYSLOG.m source code, please create an Issue or a Merge Request.
Installation
Since this is a plug-in for YottaDB, YottaDB must be installed first.
YDBSyslog requires YottaDB r1.36 or higher.
To install, you need the cmake, make, cc, and ld commands. After downloading this repository, you can install as follows:
cd <project directory>
mkdir build && cd build
cmake .. && make && sudo make install
Here is a sample installation:
$ cmake ..
-- YDBCMake Source Directory: /home/ydbuser/work/gitlab/YDBSyslog/build/_deps/ydbcmake-src
-- Setting locale to C.UTF-8
-- Found YOTTADB: /usr/local/lib/yottadb/r138/libyottadb.so
-- Install Location: /usr/local/lib/yottadb/r138/plugin
-- Configuring done (1.0s)
-- Generating done (0.0s)
-- Build files have been written to: /home/ydbuser/work/gitlab/YDBSyslog/build
$ make
[ 25%] Building M object CMakeFiles/_ydbsyslogM.dir/_YDBSYSLOG.m.o
[ 50%] Linking M shared library _ydbsyslog.so
[ 50%] Built target _ydbsyslogM
[ 75%] Building M object CMakeFiles/_ydbsyslogutf8.dir/_YDBSYSLOG.m.o
[100%] Linking M shared library utf8/_ydbsyslog.so
[100%] Built target _ydbsyslogutf8
$ sudo make install
[ 50%] Built target _ydbsyslogM
[100%] Built target _ydbsyslogutf8
Install the project...
-- Install configuration: ""
-- Installing: /usr/local/lib/yottadb/r138/plugin/o/_ydbsyslog.so
-- Installing: /usr/local/lib/yottadb/r138/plugin/o/utf8/_ydbsyslog.so
-- Up-to-date: /usr/local/lib/yottadb/r138/plugin/r/_YDBSYSLOG.m
$
Example
$ yottadb -run %YDBSYSLOG ingestjnlctlcmd --boot all # Get syslogs from mylaptop
$ yottadb -run %YDBSYSLOG ingestjnlctlfile </extra/tmp/journalctl.export # Get exported journalctl from my server
$ yottadb -run %YDBSYSLOG octoddl | octo # Define TABLE in Octo
DROP TABLE
CREATE TABLE
$ echo "select _HOSTNAME, _COMM, MESSAGE from SYSLOG_DATA where _UID = 2261 and ( _COMM = 'mupip' or _COMM = 'yottadb' or _COMM = 'mumps' ) limit 2;" | octo
_HOSTNAME|_COMM|MESSAGE
mylaptop|mupip|%YDB-I-FILERENAME, File /tmp/test/r1.38_x86_64/g/yottadb.mjl is renamed to /tmp/test/r1.38_x86_64/g/yottadb.mjl_2023180120821 -- generated from 0x00007F5CDDC4B36E.
mylaptop|mupip|%YDB-I-FILERENAME, File /tmp/test/r1.38_x86_64/g/%ydbocto.mjl is renamed to /tmp/test/r1.38_x86_64/g/%ydbocto.mjl_2023180120821 -- generated from 0x00007F5CDDC4B36E.
(2 rows)
$ echo "select _HOSTNAME, _COMM, MESSAGE from SYSLOG_DATA where _UID = 4528 and ( _COMM = 'mupip' or _COMM = 'yottadb' or _COMM = 'mumps' ) limit 2;" | octo
_HOSTNAME|_COMM|MESSAGE
myserver|mumps|%YDB-I-TEXT, %YDB-E-ZGBLDIRACC, Cannot access global directory /extra1/testarea1/testsys/tst_V989_R139_dbg_17_230707_173250/sudo_0/gtm7759/mumps.gld. Cannot continue. -- generated from 0x00007FCCF1244143.
myserver|mumps|%YDB-I-TEXT, %YDB-E-DBPRIVERR, No privilege for attempted update operation for file: /extra1/testarea1/testsys/tst_V989_R139_dbg_17_230707_173250/sudo_0/gtm7759/mumps.dat -- generated from 0x00007FEE9C112143.
(2 rows)
$
License
See both the COPYING and LICENSE files.