feat: enable fine-grained-tool-streaming-2025-05-14 (!3459) · Merge requests · GitLab.org / ModelOps / AI Assisted (formerly Applied ML) / Code Suggestions / AI Gateway

What does this merge request do and why?

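This MR sends the fine-grained-tool-streaming-2025-05-14 beta header, which allows large tool-use messages to be generated and streamed. Without this header, the generated tool-input JSON is buffered so that it can be validated once it is complete.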

According to the docs (https://docs.claude.com/en/docs/agents-and-tools/tool-use/fine-grained-tool-streaming), the major concern is that invalid JSON may be generated as a result. We consider that acceptable because:

We don't display partial JSON at the moment, and even if we add tool-use streaming later, langchain/langgraph provides a mechanism to convert partial JSON into valid JSON: https://python.langchain.com/api_reference/_modules/langchain_core/utils/json.html#parse_partial_json
It's still much better than a silent timeout after only a small piece of the content has been generated.
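
To illustrate the partial-JSON recovery mentioned above, here is a minimal sketch that uses only the standard library. It is a simplified illustration of the general idea (close an open string, close open brackets, then trim trailing characters until the result parses), not LangChain's actual `parse_partial_json` implementation. The function name `complete_partial_json` and the sample tool input are hypothetical.

```python
import json


def complete_partial_json(fragment: str):
    """Best-effort repair of a truncated JSON fragment (simplified illustration)."""
    # Fast path: the fragment is already valid JSON.
    try:
        return json.loads(fragment)
    except json.JSONDecodeError:
        pass

    closers = []        # closing brackets still owed, in opening order
    in_string = False   # True while scanning inside a JSON string
    escaped = False     # True right after a backslash inside a string

    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]":
            # A closer that does not match the last opener means malformed input.
            if not closers or closers.pop() != ch:
                return None

    chars = list(fragment)
    if in_string:
        if escaped:
            chars.pop()  # drop a dangling backslash
        chars.append('"')

    suffix = "".join(reversed(closers))

    # Trim trailing characters until the completed text parses.
    while chars:
        try:
            return json.loads("".join(chars) + suffix)
        except json.JSONDecodeError:
            chars.pop()
    return None


# Hypothetical truncated tool input, cut off in the middle of a string value.
print(complete_partial_json('{"path": "a.py", "content": "def f('))
# prints {'path': 'a.py', 'content': 'def f('}
```

The sketch trades precision for simplicity: an incomplete trailing key or value is dropped instead of guessed, so the result is always valid JSON but may omit the last field.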

If we run the example from #1465 (comment 2753349242) as:

curl -X POST "https://api.anthropic.com/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: extended-cache-ttl-2025-04-11" \
  -H "anthropic-beta: fine-grained-tool-streaming-2025-05-14" -d @my.json

the message is streamed properly.
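
For reference, the same request can be sketched in Python with only the standard library. This is an illustrative sketch, not code from this MR. The helper name `build_request` is hypothetical, the payload stands in for the contents of `my.json`, and the two beta flags are joined into a single comma-separated `anthropic-beta` value, which is assumed to be equivalent to repeating the header as the curl call does.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"

# Beta flags taken from the curl example above.
BETAS = [
    "extended-cache-ttl-2025-04-11",
    "fine-grained-tool-streaming-2025-05-14",
]


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request with both beta flags set."""
    headers = {
        "Content-Type": "application/json",
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        # Assumption: a comma-separated list behaves like the repeated header.
        "anthropic-beta": ",".join(BETAS),
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Usage sketch (needs a real key and the payload from my.json):
#   import os
#   with open("my.json") as f:
#       payload = json.load(f)
#   request = build_request(payload, os.environ["ANTHROPIC_API_KEY"])
#   with urllib.request.urlopen(request) as response:
#       for line in response:
#           print(line.decode("utf-8"), end="")
```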

Edited Oct 01, 2025 by Dylan Griffith
