Elasticsearch query visualizer
In the context of Advanced Search, we use Elasticsearch as our storage engine. We therefore need to build Elasticsearch queries that fetch the result set matching the terms and filters our users submit.
One takeaway is that Elasticsearch queries are represented as JSON, which tends to be verbose and hard to read. Furthermore, Elasticsearch defines its own set of operators for querying data.
The Problem
Queries are built incrementally and can grow to the point where they are very hard to understand, because multiple layers of access control need to be taken into account.
Sample query `test123` on gitlab-org/gitlab
{
  "query": {
    "bool": {
      "must": [
        {
          "simple_query_string": {
            "fields": [
              "title^2",
              "description"
            ],
            "query": "test 123",
            "default_operator": "and"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "type": "merge_request"
          }
        },
        {
          "has_parent": {
            "parent_type": "project",
            "query": {
              "bool": {
                "should": [
                  {
                    "bool": {
                      "filter": [
                        {
                          "terms": {
                            "id": [
                              278964
                            ]
                          }
                        },
                        {
                          "terms": {
                            "merge_requests_access_level": [
                              20,
                              10
                            ]
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {},
      "description": {}
    }
  }
}
Solutions
- Use Elasticsearch Named Queries to explain each part of the query
- Find or build a tool to visualize Elasticsearch queries
For instance, we could use a dendrogram to visualize the query, using the standard logic gate symbols or annotations.
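To make the first option more concrete, here is a minimal sketch of how clauses could be tagged with Elasticsearch's `_name` parameter, so that each hit's `matched_queries` lists the labels of the clauses it matched. The helper `name_clause` and the labels `search_terms` and `project_ids` are hypothetical and only serve as an illustration; where our query builder would actually attach names is left open.

```python
import copy
import json


def name_clause(clause: dict, label: str) -> dict:
    """Return a copy of a single query clause with `_name` set in its body.

    A clause is a one-key dict such as {"terms": {...}} or {"bool": {...}}.
    `_name` is placed inside the clause body, next to its other parameters.
    """
    (query_type, body), = clause.items()
    named = copy.deepcopy(body)
    named["_name"] = label
    return {query_type: named}


# Two clauses taken from the sample query; the labels are made up.
text_match = {
    "simple_query_string": {
        "fields": ["title^2", "description"],
        "query": "test 123",
        "default_operator": "and",
    }
}
project_filter = {"terms": {"id": [278964]}}

print(json.dumps(name_clause(text_match, "search_terms"), indent=2))
print(json.dumps(name_clause(project_filter, "project_ids"), indent=2))
```

Named queries explain which clauses matched a given hit, but they do not make the payload itself easier to read, which is why the visualization option is still worth exploring.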
Iteration path
Query visualizer
First, this tool should be usable with an unmodified query payload sent to Elasticsearch, so that a user can copy/paste the query into the tool.
The tool should then parse the query and represent it in a visual format.
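As a rough sketch of what "parse and represent" could mean, the snippet below takes a pasted payload and prints it as an indented text tree, annotating each `bool` occurrence type with a logic gate. The function `render`, the `GATES` mapping, and the shortened `payload` are assumptions made for this example; a real tool would more likely draw a graph than print text.

```python
import json
from typing import List

# Hypothetical mapping from bool occurrence types to logic gate annotations.
GATES = {
    "must": "AND",
    "filter": "AND (no scoring)",
    "should": "OR",
    "must_not": "NOT",
}


def render(clause: dict, depth: int = 0) -> List[str]:
    """Render one query clause, and everything nested in it, as indented lines."""
    pad = "  " * depth
    (query_type, body), = clause.items()
    lines = []
    if query_type == "bool":
        lines.append(f"{pad}bool")
        for occur, gate in GATES.items():
            children = body.get(occur, [])
            if isinstance(children, dict):  # a single clause is also accepted
                children = [children]
            if not children:
                continue
            lines.append(f"{pad}  {occur} [{gate}]")
            for child in children:
                lines.extend(render(child, depth + 2))
    elif isinstance(body, dict) and isinstance(body.get("query"), dict):
        # Wrapper clauses such as has_parent carry a nested query.
        params = {k: v for k, v in body.items() if k != "query"}
        lines.append(f"{pad}{query_type} {json.dumps(params)}")
        lines.extend(render(body["query"], depth + 1))
    else:
        # Leaf clause: show its parameters inline.
        lines.append(f"{pad}{query_type}: {json.dumps(body)}")
    return lines


# Shortened version of the sample query, as a user might paste it.
payload = """
{"query": {"bool": {
  "must": [{"simple_query_string": {"query": "test 123"}}],
  "filter": [
    {"term": {"type": "merge_request"}},
    {"has_parent": {"parent_type": "project",
                    "query": {"bool": {"should": [{"terms": {"id": [278964]}}]}}}}
  ]}}}
"""

print("\n".join(render(json.loads(payload)["query"])))
```

Only the `query` key is rendered here; other top-level keys such as `highlight` are ignored in this sketch.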
Query introspection
Some of the features that would be useful for manipulating the query:
- Collapsing parts of the query
- Having an interactive breadcrumb when hovering any node
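Both features can be prototyped on the parsed query before any UI exists. The sketch below is one possible approach, not a settled design: `walk` yields a breadcrumb string for every node, and `collapse` replaces anything nested deeper than a given depth with a placeholder. Both function names, the breadcrumb format, and the `"..."` placeholder are assumptions.

```python
from typing import Iterator, Tuple

OCCURS = ("must", "filter", "should", "must_not")


def _as_list(children):
    """Elasticsearch accepts either one clause or a list of clauses."""
    return [children] if isinstance(children, dict) else children


def walk(clause: dict, trail: Tuple[str, ...] = ()) -> Iterator[Tuple[str, dict]]:
    """Yield (breadcrumb, clause) for a clause and all of its descendants."""
    (query_type, body), = clause.items()
    trail = trail + (query_type,)
    yield " > ".join(trail), clause
    if query_type == "bool":
        for occur in OCCURS:
            for index, child in enumerate(_as_list(body.get(occur, []))):
                yield from walk(child, trail + (f"{occur}[{index}]",))
    elif isinstance(body, dict) and isinstance(body.get("query"), dict):
        yield from walk(body["query"], trail + ("query",))


def collapse(clause: dict, keep_depth: int) -> dict:
    """Return a query where clauses nested deeper than keep_depth are folded.

    Leaf clauses are shared with the input rather than copied.
    """
    (query_type, body), = clause.items()
    if keep_depth <= 0:
        return {query_type: "..."}
    if query_type == "bool":
        folded = dict(body)
        for occur in OCCURS:
            if occur in body:
                folded[occur] = [
                    collapse(child, keep_depth - 1) for child in _as_list(body[occur])
                ]
        return {"bool": folded}
    if isinstance(body, dict) and isinstance(body.get("query"), dict):
        return {query_type: {**body, "query": collapse(body["query"], keep_depth - 1)}}
    return clause


query = {"bool": {"filter": [
    {"term": {"type": "merge_request"}},
    {"has_parent": {"parent_type": "project",
                    "query": {"bool": {"should": [{"terms": {"id": [278964]}}]}}}},
]}}

for breadcrumb, _ in walk(query):
    print(breadcrumb)
print(collapse(query, 2))
```

In an interactive tool, the breadcrumb would be shown on hover and `keep_depth` would be driven by the user expanding or collapsing nodes.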
Query debugger
The tool could also parse the output of the Elasticsearch Profile API, which is returned when the search request includes `"profile": true`, and expose the profiling data in the visualization.
This would help weigh the cost of each part of a query.
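A first step could be to flatten the profile tree into rows that a visualization can consume. The sketch below assumes the usual shape of a profiled search response (`profile.shards[].searches[].query[]`, where each node has `type`, `description`, `time_in_nanos`, and optional `children`). The function `profile_rows` is hypothetical, and the response and its timings are invented for illustration, not measured.

```python
from typing import List


def _collect(node: dict, depth: int, root_time: int, shard_id: str, rows: List[dict]) -> None:
    """Append one row for this profile node, then recurse into its children."""
    time = node["time_in_nanos"]
    rows.append({
        "shard": shard_id,
        "depth": depth,
        "type": node["type"],
        "description": node["description"],
        "time_in_nanos": time,
        # Share of the root query's time spent in this node (children included).
        "share": time / root_time if root_time else 0.0,
    })
    for child in node.get("children", []):
        _collect(child, depth + 1, root_time, shard_id, rows)


def profile_rows(response: dict) -> List[dict]:
    """Flatten every query tree of a profiled search response into rows."""
    rows: List[dict] = []
    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            for root in search["query"]:
                _collect(root, 0, root["time_in_nanos"], shard["id"], rows)
    return rows


# Invented response fragment; the shard id, descriptions, and timings are not real.
response = {"profile": {"shards": [{
    "id": "[example-node][example-index][0]",
    "searches": [{"query": [{
        "type": "BooleanQuery",
        "description": "+title:test #type:merge_request",
        "time_in_nanos": 1000,
        "children": [
            {"type": "TermQuery", "description": "title:test", "time_in_nanos": 600},
            {"type": "TermQuery", "description": "type:merge_request", "time_in_nanos": 300},
        ],
    }]}],
}]}}

for row in profile_rows(response):
    print("  " * row["depth"] + f"{row['type']} {row['share']:.0%} {row['description']}")
```

Profile nodes are reported as the rewritten, lower-level query types rather than the original JSON clauses, so matching them back to nodes of the visualized query is an open question that this sketch does not address.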
/cc @JohnMcGuire