Trim long messages in Classic Duo Chat

Problem to solve

As a GitLab Duo Chat user, I want my conversations to work reliably even when they become long or include large context, so I can have productive multi-turn conversations without encountering "Prompt is too long" errors.

Current State: Classic Duo Chat (v2/chat/agent) does not implement any message trimming or context window management. When a conversation's accumulated messages and context exceed the model's context window, the LLM API rejects the request with a "Prompt is too long" error and the chat fails.

Evidence from production error logs:

litellm.exceptions.APIConnectionError: litellm.APIConnectionError: ****************** - 
b'{"type":"error","error":{"type":"invalid_request_error","message":"Prompt is too long"},
"request_id":"req_vrtx_011CUdTRGyiESZ3GubByivDf"}'

Proposal

Implement message trimming in Classic Duo Chat to prevent "Prompt is too long" errors.

Note: Duo Workflow Service already has a proven trimming implementation in duo_workflow_service/entities/state.py that could be reused or adapted, with additional work in progress in Draft: feat(models): add support for personalis... (!3810)
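
For illustration only, a minimal sketch of oldest-first trimming. All names, the token heuristic, and the budget below are hypothetical; the actual approach should follow the existing Duo Workflow Service implementation in duo_workflow_service/entities/state.py.

```python
from dataclasses import dataclass

# Hypothetical budget; the real limit depends on the model's context window.
MAX_PROMPT_TOKENS = 100_000


@dataclass
class Message:
    role: str       # "system", "user", or "assistant"
    content: str


def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); a real implementation
    # would use the model's tokenizer instead.
    return len(text) // 4 + 1


def trim_messages(messages: list[Message], budget: int = MAX_PROMPT_TOKENS) -> list[Message]:
    """Drop the oldest non-system messages until the estimated prompt fits the budget."""
    system = [m for m in messages if m.role == "system"]
    history = [m for m in messages if m.role != "system"]

    def total(msgs: list[Message]) -> int:
        return sum(estimate_tokens(m.content) for m in msgs)

    # Remove the oldest turns first; always keep the most recent message
    # so the model still sees the user's latest question.
    while history and len(history) > 1 and total(system + history) > budget:
        history.pop(0)

    return system + history
```

Oldest-first trimming is only one option; summarizing dropped turns or truncating large file/diff context are alternatives worth evaluating against the Duo Workflow Service approach.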

Further details

User Impact

Current State:

  • Users encounter "Prompt is too long" errors in production
  • Long conversations with large context (files, diffs) cause failures

Expected After Fix:

  • Conversations work reliably regardless of length
  • Users can have longer, more productive conversations
  • Large context (files, diffs) handled gracefully

When This Occurs

This affects long conversations that have many turns and include large files or diffs.

Links / references

Related Code:

  • Classic Chat agent: ai_gateway/chat/agents/react.py
  • Classic Chat endpoint: ai_gateway/api/v2/chat/agent.py
  • Duo Workflow Service trimming (reference implementation): duo_workflow_service/entities/state.py