Trim long messages in Classic Duo Chat
Problem to solve
As a GitLab Duo Chat user, I want my conversations to work reliably even when they become long or include large context, so I can have productive multi-turn conversations without encountering "Prompt is too long" errors.
Current State: Classic Duo Chat (v2/chat/agent) does not implement any message trimming or context window management. When conversations accumulate too much content, the LLM API rejects requests with "Prompt is too long" errors, causing chat failures.
Evidence (production errors):

```
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: ****************** -
b'{"type":"error","error":{"type":"invalid_request_error","message":"Prompt is too long"},
"request_id":"req_vrtx_011CUdTRGyiESZ3GubByivDf"}'
```
Proposal
Implement message trimming in Classic Duo Chat to prevent "Prompt is too long" errors.
Note: Duo Workflow Service already has a proven trimming implementation in duo_workflow_service/entities/state.py that could potentially be reused or adapted, with additional related work in progress in Draft: feat(models): add support for personalis... (!3810).
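For illustration, here is a minimal sketch of the kind of trimming this proposal describes: keep the system prompt plus the most recent messages that fit within a token budget, dropping the oldest turns first. The `Message` class, character-based token heuristic, and budget constant below are hypothetical placeholders, not the Duo Workflow Service implementation in duo_workflow_service/entities/state.py; a real implementation would count tokens with the target model's tokenizer.

```python
from dataclasses import dataclass

APPROX_CHARS_PER_TOKEN = 4      # rough heuristic, assumption for this sketch
MAX_CONTEXT_TOKENS = 180_000    # placeholder budget, assumption for this sketch


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def estimate_tokens(message: Message) -> int:
    """Rough token estimate derived from content length."""
    return max(1, len(message.content) // APPROX_CHARS_PER_TOKEN)


def trim_messages(messages: list[Message], budget: int = MAX_CONTEXT_TOKENS) -> list[Message]:
    """Keep the system prompt and the most recent messages that fit the budget.

    Oldest non-system messages are dropped first, so the latest turns and
    their attached context (files, diffs) are preserved.
    """
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]

    remaining = budget - sum(estimate_tokens(m) for m in system)
    kept: list[Message] = []
    for message in reversed(rest):          # walk from newest to oldest
        cost = estimate_tokens(message)
        if cost > remaining:
            break
        kept.append(message)
        remaining -= cost

    return system + list(reversed(kept))
```

Trimming whole messages from the oldest end keeps the latest user turn and its large context intact, which matches the expected behavior described under "Further details" below.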
Further details
User Impact
Current State:
- Users encounter "Prompt is too long" errors in production
- Long conversations with large context (files, diffs) cause failures
Expected After Fix:
- Conversations work reliably regardless of length
- Users can have longer, more productive conversations
- Large context (files, diffs) handled gracefully
When This Occurs
This affects long conversations that span many turns and include large files or diffs as context.
Links / references
Related Code:
- Classic Chat agent: ai_gateway/chat/agents/react.py
- Classic Chat endpoint: ai_gateway/api/v2/chat/agent.py
- Duo Workflow Service trimming (reference implementation): duo_workflow_service/entities/state.py