Grok-Maker is an application that semi-automates the development and deployment of Elasticsearch groks and regular expressions that identify, parse, and extract content from classes of log messages for ingestion into an Elasticsearch (ES) instance. From an initial collection of log messages, it automatically generates groks that identify and extract the relevant data from every class of message in the collection. The generated groks can then be refined by labeling the extracted information. The application also manages the deployment of the groks to fluentd instances, which push the extracted content into an ES instance. An optional fluentd plugin analyzes the classes of incoming log messages in real time and statistically identifies messages occurring with unusually high frequency.
Grok-Maker: A Grok Development and Deployment Application
An accurate and complete analysis of all the server and application log messages a cloud system can generate is crucial for monitoring its health and diagnosing its problems. But creating, testing, and deploying the configurations that identify and parse the relevant information in those messages is labor-intensive and error-prone. Even a small cloud can generate billions of log messages a day, comprising tens of thousands of unique classes of messages. System administrators rarely know the structure of every log message a process can generate, so they write parsers only for the classes the processes have generated up to some point in time. When a process generates a new class of messages, they must identify it and write a new parser for it. Grok-Maker is an application that streamlines this entire process.
Grok-Maker semi-automates the development and deployment of Elasticsearch groks and regular expressions that identify, parse, and extract content from classes of log messages for ingestion into an Elasticsearch (ES) instance. (Groks are an enhanced form of regular expressions.) From an initial collection of log messages, the application automatically generates groks that identify and extract the relevant data from every class of message in the collection. The generated groks can then be refined by labeling the extracted information. The application also manages the deployment of the grok configurations to fluentd agents. An optional fluentd plugin analyzes the classes of incoming log messages in real time and statistically identifies messages occurring with an unusual frequency.
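To make the idea of a grok as an enhanced regular expression concrete, the following is a minimal Python sketch of how a grok expression can be expanded into an ordinary regular expression with named capture groups. It is an illustration only, not Grok-Maker's implementation: the `%{NAME:field}` reference syntax is the standard grok notation, but the tiny pattern table, the function names `grok_to_regex` and `parse`, and the sample log line are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately tiny pattern library. Real grok pattern sets
# define many more names and use more careful expressions than these.
GROK_PATTERNS = {
    "INT": r"[+-]?\d+",
    "WORD": r"\b\w+\b",
    "IPV4": r"(?:\d{1,3}\.){3}\d{1,3}",
    "GREEDYDATA": r".*",
}

# Matches a grok reference such as %{INT} or %{INT:status}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def grok_to_regex(grok: str) -> str:
    """Expand every %{NAME:field} reference into a plain regex fragment."""

    def expand(match: re.Match) -> str:
        name, field = match.group(1), match.group(2)
        body = GROK_PATTERNS[name]  # raises KeyError for unknown pattern names
        # A labeled reference becomes a named group; an unlabeled one
        # becomes a non-capturing group.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"

    return GROK_REF.sub(expand, grok)


def parse(grok: str, line: str):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = re.fullmatch(grok_to_regex(grok), line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    grok = "%{IPV4:client} %{WORD:method} %{INT:status}"
    print(grok_to_regex(grok))
    print(parse(grok, "10.0.0.7 GET 404"))
```

In this sketch the field labels (`client`, `method`, `status`) play the role of the labels a user would attach when refining a generated grok; a line that does not belong to the class simply fails to match and yields `None`.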
This talk will outline the motivation for the development of the application, the algorithms used to automatically generate the groks to parse a set of log messages, and the overall architecture of the application.