Postgres mtail monitoring wrong files.
I got an alert for an extremely large number of Postgres "statement timeout" errors.
While investigating, I found that mtail had counted a huge number of log lines, but this count did not correlate with the postgresql.csv log file that these errors were supposedly coming from.
When looking at the mtail per-logfile metrics, it seems like mtail found and re-read the contents of /var/log/gitlab/postgresql/postgresql.csv and attributed the contents to /var/log/gitlab/postgresql/current, which doesn't seem to exist on our system.
To mitigate this, there are a couple of options:
- Drop the current log file from the set of files mtail watches. This file doesn't seem to exist.
- Use a getfilename() check with stop in the mtail program to tie the postgres programs to specific files.
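
The second option could look roughly like the sketch below. This is only an illustration, not our deployed program: the counter name postgres_statement_timeouts_total and both regular expressions are assumptions, and the guard assumes the real log file name ends in postgresql.csv. The idea is that getfilename() returns the name of the file the current line came from, and stop ends processing of that line, so lines attributed to any other file never reach the counter.

```mtail
# Sketch only: counter name and patterns are illustrative assumptions.

# Skip any line that did not come from the postgresql.csv log file.
getfilename() !~ /postgresql\.csv$/ {
  stop
}

# Hypothetical counter for statement timeout errors.
counter postgres_statement_timeouts_total

# Matches the standard Postgres error text for a statement timeout.
/canceling statement due to statement timeout/ {
  postgres_statement_timeouts_total++
}
```

With a guard like this at the top of the program, lines attributed to /var/log/gitlab/postgresql/current would be dropped before any counter is incremented, even if that path stays in mtail's watch list.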
Edited by Ben Kochie