Draft: feat(models): add support for configurable context window tokens per model
## What does this merge request do and why?
This merge request (MR) makes the context window size configurable per model, enabling correct handling of models with different context limits (for example, Claude Sonnet 4.5's 1M tokens vs. GPT-5's 400K tokens).

Currently, `MAX_CONTEXT_TOKENS` is hardcoded at 400K, even though different LLMs support different context window sizes. In summary, this MR:
- Adds `max_context_tokens` configuration to model definitions and prompt configs
- Extracts trimming logic into a shared utility module that accepts configurable context limits
- Moves trimming from state management to invocation time, so that both the legacy system and the Flow Registry trim conversation history right before sending it to the LLM, based on each model's specific context limit (see the sketch below)
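A minimal sketch of what the shared trimming utility could look like, assuming a crude character-based token estimate; `ModelConfig`, `approximate_token_count`, and `trim_message_history` are illustrative names, not this MR's actual API:

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Per-model settings; max_context_tokens mirrors the new config field."""
    name: str
    max_context_tokens: int


def approximate_token_count(text: str) -> int:
    """Very rough estimate (~4 characters per token); a real implementation
    would use the model's own tokenizer."""
    return max(1, len(text) // 4)


def trim_message_history(messages: list[str], max_context_tokens: int) -> list[str]:
    """Drop the oldest messages until the history fits the given limit.

    Works on a copy, so the stored state keeps the full history.
    """
    trimmed = list(messages)
    total = sum(approximate_token_count(m) for m in trimmed)
    while trimmed and total > max_context_tokens:
        total -= approximate_token_count(trimmed.pop(0))
    return trimmed
```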
This change lets us maximize the context window for models that support larger contexts while respecting the limits of smaller models, and it ensures the state preserves the full conversation history, with trimming applied only when it is actually needed.
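Conceptually, invocation-time trimming means both call paths do something like the following right before calling the LLM (reusing the hypothetical helpers from the sketch above; `invoke_llm` is a stub standing in for the real client):

```python
def invoke_llm(model_name: str, messages: list[str]) -> str:
    """Stub standing in for the real LLM client call."""
    return f"{model_name} received {len(messages)} messages"


def run_completion(model: ModelConfig, full_history: list[str]) -> str:
    # The stored state keeps the full history; only the outgoing
    # request payload is trimmed to the model's specific limit.
    prompt_messages = trim_message_history(full_history, model.max_context_tokens)
    return invoke_llm(model.name, prompt_messages)


claude = ModelConfig(name="claude-sonnet-4-5", max_context_tokens=1_000_000)
gpt5 = ModelConfig(name="gpt-5", max_context_tokens=400_000)
```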
## References
- Parent issue: #1515
## How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
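As a starting point, one way to sanity-check the trimming behavior locally is a small unit test against the shared utility, again using the assumed names from the sketches above rather than the MR's real module:

```python
def test_trim_respects_model_limit():
    # 100 messages of ~25 tokens each (~2,500 tokens total).
    history = ["x" * 100] * 100
    trimmed = trim_message_history(history, max_context_tokens=1_000)
    assert sum(approximate_token_count(m) for m in trimmed) <= 1_000
    assert trimmed == history[-len(trimmed):]  # oldest messages dropped first
```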
## Merge request checklist
- [ ] Tests added for new functionality. If not, please raise an issue to follow up.
- [ ] Documentation added/updated, if needed.
- [ ] If this change requires executor implementation: verified that issues/MRs exist for both the Go executor and the Node executor, or confirmed that the changes are backward-compatible and don't break existing executor functionality.
Closes #1515