Implement conversation compaction using summarization
Problem to solve
As a Duo Workflow user, I want important context preserved when conversation history is trimmed, so that the LLM can make better decisions based on previous interactions.
Currently, when conversation history exceeds the context budget, old messages are dropped entirely. This loses important context like previous decisions, encountered errors, and task progress.
Proposal
Generate an LLM summary of old messages before dropping them.
- Before trimming, identify messages to be removed
- Generate a concise summary capturing key information (decisions, errors, progress)
- Replace the old messages with a single summary message
- Keep recent messages intact
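The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual trimmer implementation: the `Message` class, `count_tokens`, `compact_history`, the `keep_recent` parameter, and the `summarize` callable are all hypothetical names, and the 4-characters-per-token estimate is a placeholder for accurate token counting.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    # Hypothetical stand-in for the real conversation message type.
    role: str
    content: str


def count_tokens(message: Message) -> int:
    # Placeholder estimate (about 4 characters per token); an assumption,
    # not the service's real token counter.
    return max(1, len(message.content) // 4)


def compact_history(
    messages: List[Message],
    max_tokens: int,
    keep_recent: int,
    summarize: Callable[[List[Message]], str],
) -> List[Message]:
    """Replace old messages with one summary message when over budget."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")

    total = sum(count_tokens(m) for m in messages)
    # Nothing to do if we are within budget or there is nothing old to drop.
    if total <= max_tokens or len(messages) <= keep_recent:
        return list(messages)

    # Step 1: identify the messages that would otherwise be removed.
    split = len(messages) - keep_recent
    old, recent = messages[:split], messages[split:]

    # Step 2: summarize them (in practice, an LLM call passed in as `summarize`).
    summary_text = summarize(old)

    # Steps 3 and 4: one summary message replaces the old ones;
    # recent messages are kept intact and in order.
    summary = Message(
        role="system",
        content="Summary of earlier conversation: " + summary_text,
    )
    return [summary] + recent
```

Passing the summarizer in as a callable keeps the LLM call out of the trimming logic, so the compaction step can be tested with a stub. The sketch does not check that the summary itself fits the budget, which a real implementation would likely need to handle.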
Further details
Current behavior: Old messages dropped, context lost permanently
Expected behavior: Old messages summarized, key context preserved
Dependencies: Benefits from #1861 (closed) (lazy trimming) and #1862 (accurate token counting)
Links / references
duo_workflow_service/conversation/trimmer.py
duo_workflow_service/entities/state.py (_conversation_history_reducer)
Edited by Junming Huang