Iteration 3: Experiment with different prompts and compare how they perform on user satisfaction with the response
# Goal of this issue
1. Before releasing Explain Code as GA, user feedback should be `helpful` in at least x% of cases and `wrong` in less than y% of cases. The values for x and y remain to be determined.
1. We are likely going to switch to a different AI vendor. We should compare how the initial vendor performs against the new one to inform the business decision about the consequences for user satisfaction of switching vendors.
# Proposal
## Enhance the metrics for collecting user satisfaction with:
* `helpful`, `unhelpful`, `wrong` (already available as a result of https://gitlab.com/gitlab-org/gitlab/-/issues/404272+)
* number of lines (or characters or tokens) selected for explanation
- this will help us understand if the satisfaction is a function of the length of code selected
* number of characters (or tokens) of the answer
- this will help us understand if the satisfaction is a function of the length of the answer
* language of the code selected
- this will help us understand if the satisfaction is a function of the code language
* the prompt used and whether the selected code was placed before or after the prompt
- this will help us understand how different prompt designs perform
- **do not collect the code itself or the answer from the AI** to prevent collecting customer or user data.
* allow users to add a text message explaining their sentiment about the response or the feature itself.
* count the total number of times that
  - users received an AI answer
  - users also chose to provide feedback
  - users asked a follow-up question and did not give feedback
  - users asked a follow-up question and did give feedback
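To make the event schema concrete, here is a minimal sketch of what such a feedback event could look like. All class and field names are hypothetical illustrations, not an actual GitLab event schema:

```python
# Hypothetical sketch of an Explain Code feedback event; none of these
# names come from the actual GitLab telemetry schema.
from dataclasses import dataclass, asdict
from typing import Optional
import json


@dataclass
class ExplainCodeFeedbackEvent:
    sentiment: str                      # "helpful" | "unhelpful" | "wrong"
    selection_line_count: int           # size of the code selection
    answer_char_count: int              # size of the AI answer
    language: str                       # language of the selected code
    prompt_id: str                      # which prompt variant produced the answer
    selection_before_prompt: bool       # selected code placed before or after the prompt
    user_comment: Optional[str] = None  # optional free-text explanation
    # Deliberately absent: the selected code and the AI answer, so no
    # customer or user data is collected.


event = ExplainCodeFeedbackEvent(
    sentiment="helpful",
    selection_line_count=42,
    answer_char_count=1180,
    language="ruby",
    prompt_id="prompt_v2",
    selection_before_prompt=True,
)
print(json.dumps(asdict(event)))
```

Recording `prompt_id` and the selection placement alongside the sentiment is what later allows satisfaction rates to be grouped by prompt design.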
## Play with different prompts
* Use guidance like https://www.promptingguide.ai/ to engineer a handful of prompts.
* Randomly use the different prompts and the different providers (see the sketch after this list).
* Present the results in Sisense.
  - we intend to keep measuring user satisfaction beyond GA so that we can adjust prompts when needed
* Use the best-performing prompt going forward.
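As a sketch of the random assignment described above, assuming illustrative prompt texts and provider names (none are real GitLab identifiers):

```python
# Minimal sketch of per-request random assignment of a prompt variant
# and a provider; all identifiers below are made up for illustration.
import random

PROMPTS = {
    # Selected code placed after the prompt text:
    "prompt_v1": "Explain the following code:\n\n{code}",
    # Selected code placed before the prompt text:
    "prompt_v2": "{code}\n\nExplain, step by step, what the code above does.",
    "prompt_v3": (
        "You are a senior engineer. Explain this code to a new team member:"
        "\n\n{code}"
    ),
}
PROVIDERS = ["vendor_a", "vendor_b"]  # initial vs. candidate AI vendor


def pick_experiment_arm() -> tuple[str, str]:
    """Pick a (prompt_id, provider) pair uniformly at random.

    Both IDs are recorded with the feedback event so that satisfaction
    rates can later be grouped per experiment arm in Sisense.
    """
    return random.choice(list(PROMPTS)), random.choice(PROVIDERS)


prompt_id, provider = pick_experiment_arm()
prompt = PROMPTS[prompt_id].format(code="def add(a, b): return a + b")
```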