Classic Duo Chat - Output Truncated
### Summary

When classic Duo Chat is prompted to generate a large message, the output can be cut off halfway through and appear truncated because the response exceeds the `max_tokens` limit.

Similar issues for the slash commands are:

- [/fix](https://gitlab.com/gitlab-org/gitlab/-/work_items/582842) - fixed by having the LLM identify and fix the most important issues first (priority order provided to the LLM)
- [/refactor](https://gitlab.com/gitlab-org/gitlab/-/work_items/579683) - fixed by having the LLM choose a snippet and refactor only that snippet
- [/tests](https://gitlab.com/gitlab-org/gitlab/-/work_items/575987) - fixed by having the LLM output the tests in chunks over multiple messages

### Steps to reproduce

1. Type into classic Duo Chat (Web or IDE): "Generate a code snippet over 800 lines"
2. The output will likely cut off midway and look like:

![image.png](/uploads/493130f9c1ed9a6ef2abdac938f6bf9a/image.png){width="490" height="375"}

A real-world example:

1. In classic chat in an IDE, highlight the entire user.rb file of the Rails monolith (3k lines)
2. Type "provide a summary of every method in this file with an example of when you would call them and a code snippet showing the method call"
3. The output will look like:

```
...
Abuse & Trust

trusted?
When to call: Bypassing spam checks

unless user.trusted?
  SpamCheckService.new(issue).execute
end

abuse_metadata
When to call: Reporting abuse

AbuseReport.create(
  user: reported_user,
  reporter: current_user,
  metadata: reported_user.abuse_metadata
)

**CI
```

### What is the current _bug_ behavior?

The chat message is not always completed.

### What is the expected _correct_ behavior?

The chat output is completed, or at least a clear error is displayed asking the user to reduce the context.

### Possible fixes

1. Have the LLM identify when a prompt is too large and inform the user
2. Automatically detect when `max_tokens` are fully used and display an error to the user (see the sketch below)
3. There are likely more as well
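For possible fix 2, a minimal sketch of what the detection could look like, assuming the chat backend can see why the model stopped generating. The `LlmResponse` struct, the `stop_reason` value, and the error class below are hypothetical stand-ins, not the actual GitLab chat API:

```ruby
# Hypothetical sketch only: `LlmResponse` and its `stop_reason` field stand in
# for whatever object wraps the model completion in the real backend.
LlmResponse = Struct.new(:text, :stop_reason, keyword_init: true)

class ResponseTruncatedError < StandardError; end

# Raise a clear error (which the UI could render to the user) when the model
# stopped because it exhausted the output token budget rather than finishing.
def ensure_complete!(response)
  if response.stop_reason == "max_tokens"
    raise ResponseTruncatedError,
          "The answer was cut off because it reached the output limit. " \
          "Try reducing the selected context or asking for a smaller result."
  end

  response.text
end

# Usage:
#   ensure_complete!(LlmResponse.new(text: partial_answer, stop_reason: "max_tokens"))
#   # => raises ResponseTruncatedError with a user-facing message
```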