Add MergeRequestReader to chat (!153616)

Merged Gosia Ksionek requested to merge mk-mr-for-chat into master 10 months ago

What does this MR do and why?

Adds support for one popular context that the chat currently does not support: https://gitlab.com/gitlab-org/ai-powered/duo-chat/discussions/-/issues/3

Specifically, this MR addresses "Support Merge Requests as context for Duo Chat ..." (#464587 - closed).

Evaluation results

Here are the evaluation results from a collective LLM judge on the master branch

Here are the results on this branch

Here are the stats (averages)

Master branch

Correctness: 3.65

Readability: 3.75

Comprehensiveness: 3.43

This branch

Correctness: 3.71

Readability: 3.75

Comprehensiveness: 3.53

The improvements on the existing evaluation are likely not statistically significant, but they at least suggest that these changes do not degrade answers to existing questions.
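For a quick view of how small the differences are, the averages above can be compared per metric. This is only an illustrative sketch: the hash names are made up for this example, and the numbers are copied from the stats listed above.

```ruby
# Average scores copied from the "Evaluation results" stats above.
master = { correctness: 3.65, readability: 3.75, comprehensiveness: 3.43 }
branch = { correctness: 3.71, readability: 3.75, comprehensiveness: 3.53 }

# Per-metric difference (this branch minus master), rounded to two decimals
# to hide floating-point noise.
deltas = master.to_h do |metric, score|
  [metric, (branch.fetch(metric) - score).round(2)]
end

deltas.each do |metric, delta|
  puts format("%-18s %+.2f", metric, delta)
end
```

None of the deltas is negative, which is the point of the comparison; whether the positive ones are meaningful would need per-question scores, which are not included here.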

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Before	After

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

1. In the Rails console, fully enable the feature flag:

   Feature.enable(:ai_merge_request_reader_for_chat)

2. Visit a merge request and ask a question, for example: "summarize this merge request".
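Before asking the question, it can help to confirm that the flag is actually on. The sketch below assumes a GitLab Rails console, where `Feature.enable`, `Feature.enabled?`, and `Feature.disable` are available. The small stand-in module at the top is hypothetical and exists only so the snippet also runs outside GitLab; it is not GitLab's implementation.

```ruby
# Hypothetical stand-in so this sketch runs outside GitLab. In a GitLab
# Rails console the real Feature module is already defined and this
# definition is skipped.
unless defined?(Feature)
  module Feature
    @enabled = {}

    def self.enable(name)
      @enabled[name.to_sym] = true
    end

    def self.disable(name)
      @enabled[name.to_sym] = false
    end

    def self.enabled?(name)
      @enabled.fetch(name.to_sym, false)
    end
  end
end

flag = :ai_merge_request_reader_for_chat

# Enable the flag for everyone, then confirm it reads back as enabled.
Feature.enable(flag)
puts "#{flag} enabled? #{Feature.enabled?(flag)}"

# To turn the flag off again after validating:
# Feature.disable(flag)
```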

What needs to be done

Task	Status	Notes
Run CEF locally on this branch vs master to ensure no degradation		See above in 'Evaluation results'
Create basic dataset for MR eval		Here
Change Chat REST API to accept MR requests
Add seed data for merge requests to be able to test locally in CEF		I believe this can be a follow-up
Make this work with 'v2_chat_agent_integration' feature flag on		This MR just needs to be merged after this one has been rolled out
Ask model validation team to add this dataset to daily runs, merge this MR and monitor eval results		Asked them to add it (see discussion)