Experiment: natural language querying in visualization designer
⚠ Promoted to the following epic Customer MVC: natural language querying in visu... (&12245)
Experiment
This section should be completed prior to work on the Experiment beginning.
Experiment
Problem to be solved
User problem
Writing the SQL needed for a visualization is difficult, so I don't create custom visualizations or dashboards for Product Analytics.
Solution hypothesis
If GitLab provides an interface for users (Product/Engineering Managers) to ask questions about Product Analytics data, then they will create more custom visualizations and save them to custom dashboards.
Assumption
What assumptions are you making about this problem and the solution?
- We assume customers have instrumented an app but are not getting value from the default dashboards.
- We assume customers have specific questions they can ask, like "How many visitors did we have this week compared to last week?"
Personas
What personas have this problem, who is the intended user?
- Engineering Manager
- Product Manager
Proposal
A Product Analytics-specific implementation of natural language querying. It may be less useful in the visualization designer, but the interface might be viable in the context of data exploration.
Success
How will you measure whether this experiment is a success?
We can measure how many custom visualizations are created per onboarded project after the launch of the experiment, compared to before.
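One way to make this comparison concrete is sketched below. It assumes that "/ onboarded project" means normalizing the count by the number of onboarded projects. The helper names and the sample counts are hypothetical; real numbers would come from product usage data.

```ruby
# Sketch only: helper names and sample counts are hypothetical.

# Average number of custom visualizations per onboarded project.
def visualizations_per_project(visualization_count, onboarded_projects)
  return 0.0 if onboarded_projects.zero?

  visualization_count.to_f / onboarded_projects
end

# Relative change between two rates; nil when there is no baseline to compare to.
def relative_change(before, after)
  return nil if before.zero?

  (after - before) / before
end

# Hypothetical example: 10 visualizations across 40 projects before launch,
# 30 visualizations across 40 projects after launch.
before_rate = visualizations_per_project(10, 40)
after_rate  = visualizations_per_project(30, 40)
change      = relative_change(before_rate, after_rate)
```

Normalizing per onboarded project keeps the comparison meaningful if the number of onboarded projects changes between the two periods.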
Possible Solutions
- Using LangChain (JSON agent) for an implementation like https://www.vizgpt.ai/
- Utilizing a chat interface like how Cube does with Delphi and Slack
Relates to https://gitlab.com/gitlab-org/gitlab/-/issues/393881+
Feature release
Main Job story
What job to be done will this solve?
Proposal updates/additions
Problem validation
What validation exists that customers have this problem?
Business objective
What business objective will be achieved with this proposal?
Confidence
Has this proposal been derived from research?
Confidence | Research |
---|---|
[High/Medium/Low] | research/insight issue |
Requirements
What tasks or actions should the user be capable of performing with this feature?
⚠️ Related feature and research issues should be linked in the related issues section (Delete this line when this is done)
The user needs to be able to:
- ...
- ...
Checklist
Experiment
Issue information
- Add information to the issue body about:
  - The user problem being solved
  - Your assumptions
  - Who it's for, list of personas impacted
  - Your proposal
- Add relevant designs to the Design Management area of the issue if available
- Confirm that an unexpected outage of this feature will not negatively impact the application or other features
- Add a feature flag so that this feature can be quickly disabled if/when needed
- If this experiment introduces a new service or data store, ensure it is not processing or storing red data without a security and if needed legal review
  - NOTE: We recommend using one of the already adopted models or data stores. If you need to use something else, be aware that using other models or data stores will require additional review during the feature stage for operational fitness and compliance.
- Ensure this issue has the wg-ai-integration label to ensure visibility to various teams working on this
Feature release
Issue information
- Add information to the issue body about:
  - Your proposal
  - The Job Statement it's expected to satisfy
  - Details about the user problem and provide any research or problem validation
  - List the personas impacted by the proposal.
- Add all relevant solution validation issues to the Linked items section that shows this proposal will solve the customer problem, or details explaining why it's not possible to provide that validation.
- Add relevant designs to the Design Management area of the issue.
- You have adhered to our Definition of Done standards
- Ensure this issue has the wg-ai-integration label to ensure visibility to various teams working on this
Technical needs
- Please consider the operational aspects of the feature you are creating. A list of things to think about is in: https://gitlab.com/gitlab-org/gitlab/-/issues/403859. We will be improving this process in the future: !117637 (comment 1353253349).
- @ mention your AppSec Stable Counterpart and read the AI secure coding guidelines
- Work estimate and skills needed to build a viable ML feature: Depending on the work, many personas contribute to building an ML feature, including Data Scientist, NLP Engineer, ML Engineer, MLOps Engineer, ML Infra Engineer, and Fullstack Engineer to integrate the ML services with GitLab. Post-prototype, we would assess the skills needed to build a production-grade ML feature for the prototype.
- Data Limitation: We would like to validate up front whether we have viable data for the feature, including whether we can use the DataOps pipeline of ModelOps or need to create a custom one. We would want to understand the training data, test data, and feedback data to dial up the accuracy, as well as the limitations of the data.
- Model Limitation: We would want to understand whether we can use an open-source pre-trained model, tune and customize it, or start a model from scratch. Further, we would use the ModelOps model evaluation framework to assess which model is the right one for the use case.
- Cost, Scalability, Reliability: We would want to estimate the cost of hosting, serving, inference of the model, and the full end-to-end infrastructure including monitoring and observability.
- Legal and Ethical Framework: We would want to align with the legal and ethical framework, like any other ModelOps feature, to cover the nine principles of responsible ML and any legal support needed.
Dependency needs
- Please consider the operational aspects of the service you are creating. A list of things to think about is in: https://gitlab.com/gitlab-org/gitlab/-/issues/403859. We will be improving this process in the future: !117637 (comment 1353253349).
Legal needs
- TBD
Implementation Notes
- Prompt Engineering:
  - Limited success asking the LLM to output cube.js queries; it tends to get confused and include SQL inside measures unnecessarily
  - More success with asking the LLM to give a list of measures, dimensions and filters that satisfy the user's question from a given list of available data in our schema, then fitting that into our cube.js query
  - Anthropic does well returning XML
  - try YAML?
  - Use the gitlab:llm:zero_shot:test:questions rake task for evaluating prompts
  - See ChainOfThoughtParser for an example of parsing LLM output
- Implementation Plan
- Implement GenerateAnalyticsQueryService (see initial MR for example)
  - Generate discrete choices for the LLM to query from our available analytics data schema and feed them into the LLM prompt (i.e. 'here is a list of the available data, give me which conditions satisfy the user's question')
    - Without a list of data to choose from, the LLM will be much more likely to hallucinate
    - Output from the LLM can be easily validated against this list to ensure the cube.js query is valid
  - Extract the list of attributes and filters that satisfy the user's query from the LLM and use that to build a valid cube.js query
    - These conditions might need to be shown to the user so they can validate them
  - Use this query to create a custom visualization for the user
- Implement
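
The plan above (offer the LLM discrete choices, extract its selection, validate it against the offered list, then build the cube.js query) could be sketched in Ruby roughly as follows. This is a minimal illustration under stated assumptions, not the actual GenerateAnalyticsQueryService: the schema members (TrackedEvents.*), the XML-style tag names, and the helpers build_prompt, parse_selection and build_cube_query are all hypothetical. The regex extraction stands in for a proper parser, and the operator list is only a subset of the filter operators Cube accepts.

```ruby
# Sketch only: schema members, tag names, and helper names are hypothetical.
AVAILABLE = {
  'measures'   => %w[TrackedEvents.count TrackedEvents.uniqueUsersCount],
  'dimensions' => %w[TrackedEvents.pageUrl TrackedEvents.eventName]
}.freeze

# Subset of Cube filter operators that this sketch accepts.
ALLOWED_OPERATORS = %w[equals notEquals contains gt lt].freeze

# Build a prompt that offers the LLM only known, discrete choices.
def build_prompt(question)
  <<~PROMPT
    Here is a list of the available data.
    Measures: #{AVAILABLE['measures'].join(', ')}
    Dimensions: #{AVAILABLE['dimensions'].join(', ')}
    Reply only with <measure>, <dimension> and
    <filter member="" operator="" value=""/> tags that satisfy this question:
    #{question}
  PROMPT
end

# Pull the selected members out of the XML-style reply.
def parse_selection(llm_output)
  {
    'measures'   => llm_output.scan(%r{<measure>(.*?)</measure>}m).flatten.map(&:strip),
    'dimensions' => llm_output.scan(%r{<dimension>(.*?)</dimension>}m).flatten.map(&:strip),
    'filters'    => llm_output.scan(%r{<filter member="(.*?)" operator="(.*?)" value="(.*?)"\s*/>})
  }
end

# Reject anything that was not offered, then assemble a Cube-style query hash.
def build_cube_query(selection)
  known_members = AVAILABLE.values.flatten
  unknown = (selection['measures'] - AVAILABLE['measures']) +
            (selection['dimensions'] - AVAILABLE['dimensions'])

  selection['filters'].each do |member, operator, _value|
    unknown << member unless known_members.include?(member)
    unknown << operator unless ALLOWED_OPERATORS.include?(operator)
  end
  raise ArgumentError, "not in the offered list: #{unknown.join(', ')}" unless unknown.empty?

  filters = selection['filters'].map do |member, operator, value|
    { 'member' => member, 'operator' => operator, 'values' => [value] }
  end

  {
    'measures'   => selection['measures'],
    'dimensions' => selection['dimensions'],
    'filters'    => filters
  }
end

# Example with a hand-written reply standing in for real LLM output.
reply = '<measure>TrackedEvents.count</measure>' \
        '<filter member="TrackedEvents.eventName" operator="equals" value="page_view"/>'
query = build_cube_query(parse_selection(reply))
```

Because the reply is checked against the same list that was put in the prompt, a hallucinated measure or dimension raises an ArgumentError instead of reaching Cube. The parsed selection is also a natural thing to show the user for confirmation before the visualization is created.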
Additional resources
- If you'd like help with technical validation, or would like to discuss UX considerations for AI, mention the AI Assisted group using @gitlab-org/modelops/applied-ml.
- Read about our AI Integration strategy
- Slack channels
  - #wg_ai_integration - Slack channel for the working group and the high-level alignment on getting AI ready for Production (Development, Product, UX, Legal, etc.). Feel free to reach out from the other channels and post progress here.
  - #ai_integration_dev_lobby - Channel for all implementation-related topics and discussions of actual AI features (e.g. explain the code)
  - #ai_enablement_team - Channel for the AI Enablement Team, which is building the base for all features (experimentation API, Abstraction Layer, Embeddings, etc.)