PoC: Connect GLQL to ES
Connect GLQL Queries to Elasticsearch for includeSubgroup
Overview
GLQL is a powerful tool for planning work across the group hierarchy. However, some GLQL queries, especially those with includeSubgroup, can be slow (see gitlab-org/gitlab-query-language/glql-rust!106). For example, in the gitlab-org group, including subgroups means searching through over 5000 projects, which can significantly impact performance.
Proposal
Let's route a set of heavy GLQL queries to ES.
For this PoC, we'll narrow the scope to the fields that are already indexed in ES. ES is well-suited for handling these heavy queries and helping us scale, and the search team has already implemented several optimizations (slack thread).
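As a rough illustration of the routing idea, here is a minimal sketch in Python. It assumes a query goes to ES only when ES is available, the query spans subgroups, and every field it filters on is already indexed. The field names in ES_INDEXED_FIELDS, the GlqlQuery type, and choose_backend are hypothetical and not taken from the GLQL codebase; the real set of indexed fields would need to come from the search team.

```python
from dataclasses import dataclass

# Hypothetical set of GLQL fields assumed to be indexed in ES already.
# The real list is not specified in this proposal.
ES_INDEXED_FIELDS = frozenset({"state", "label", "assignee", "author", "milestone"})


@dataclass(frozen=True)
class GlqlQuery:
    """Simplified, hypothetical view of a parsed GLQL query."""
    fields: frozenset        # names of the fields used in the query's filters
    include_subgroup: bool   # True when the query spans subgroups


def choose_backend(query: GlqlQuery, es_available: bool) -> str:
    """Return "es" for heavy queries ES can answer, otherwise "graphql"."""
    if not es_available:
        # Instances without ES keep using the existing GraphQL path.
        return "graphql"
    if not query.include_subgroup:
        # Only subgroup-spanning queries are treated as heavy in this sketch.
        return "graphql"
    if not query.fields <= ES_INDEXED_FIELDS:
        # At least one field is not indexed, so ES cannot answer the query.
        return "graphql"
    return "es"


heavy = GlqlQuery(fields=frozenset({"state", "label"}), include_subgroup=True)
print(choose_backend(heavy, es_available=True))   # es
print(choose_backend(heavy, es_available=False))  # graphql
```

Falling back to GraphQL in every other case keeps behavior unchanged for instances without ES and for queries that use fields outside the indexed set.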
Key Points
Performance: Offloading complex queries to ES can reduce the load on our primary database.
Scalability:
- For large groups like gitlab-org, where including subgroups means scanning over 5000 projects, this approach can significantly improve query response times.
- It also gives large self-hosted customers a way to scale.
Future Flexibility: Integrating with ES now lets us monitor and optimize heavy queries. As ES adoption grows, we can eventually standardize on ES for GLQL processing.
Things to consider
Currently, only about 20% of self-managed instances use ES by default. Until ES adoption increases, new GLQL features might need to support both GraphQL and ES implementations.
Next Steps (WIP)
- List heavy GLQL queries we want to address first
- ?