
Draft: feat(models): add support for personalised context window tokens per model

What does this merge request do and why?

This merge request (MR) makes the context window size configurable per model, which enables proper handling of different models with varying context limits (like Claude Sonnet 4.5's 1M tokens vs GPT-5's 400K tokens).

Currently, MAX_CONTEXT_TOKENS is hardcoded at 400K, but different LLMs support different context window sizes. In summary, this MR:

  • Adds max_context_tokens configuration to model definitions and prompt configs
  • Extracts trimming logic into a shared utility module that accepts configurable context limits
  • Moves trimming from state management to invocation time, so that both the legacy system and the Flow Registry trim conversation history immediately before sending it to the LLM, based on the model's specific context limit
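As a rough illustration of the shape this takes, the sketch below shows a per-model `max_context_tokens` setting with the old 400K value as the default, plus a shared trimming helper applied at invocation time. The names here (`ModelConfig`, `trim_conversation_history`, `count_tokens`) are illustrative, not the identifiers used in this MR:

```python
from dataclasses import dataclass

DEFAULT_MAX_CONTEXT_TOKENS = 400_000  # previously a hardcoded global limit


@dataclass
class ModelConfig:
    """Per-model settings; max_context_tokens overrides the old global cap."""
    name: str
    max_context_tokens: int = DEFAULT_MAX_CONTEXT_TOKENS


def trim_conversation_history(messages, max_context_tokens, count_tokens):
    """Drop the oldest messages until the history fits the model's limit.

    Called at invocation time, right before the LLM request, so the
    stored state keeps the full untrimmed history.
    """
    total = sum(count_tokens(m) for m in messages)
    trimmed = list(messages)
    while trimmed and total > max_context_tokens:
        total -= count_tokens(trimmed.pop(0))  # oldest message goes first
    return trimmed
```

With this shape, a large-context model simply declares a bigger limit (e.g. `ModelConfig("claude-sonnet-4-5", max_context_tokens=1_000_000)`) and the same helper serves both the legacy path and the Flow Registry.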

This MR lets us use the full context window of models that support larger contexts while respecting the limits of smaller ones, and it ensures the state preserves the full conversation history, with trimming applied only when needed at invocation time.

References

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
  • If this change requires executor implementation: verified that issues/MRs exist for both Go executor and Node executor or confirmed that changes are backward-compatible and don't break existing executor functionality.

Closes #1515

Edited by Fabrizio J. Piva
