feat: enable fine-grained-tool-streaming-2025-05-14
What does this merge request do and why?
It allows generating large tool-use messages. Without this header, the generated tool-input JSON is buffered so that it can be validated once generation completes, so nothing is streamed while a large input is still being produced.
According to the docs (https://docs.claude.com/en/docs/agents-and-tools/tool-use/fine-grained-tool-streaming), the major concern is that invalid or partial JSON may be generated as a result. We consider that acceptable because:
- We don't display partial JSON at the moment, and even if we add tool-use streaming later, LangChain/LangGraph provides a mechanism to convert partial JSON into valid JSON: https://python.langchain.com/api_reference/_modules/langchain_core/utils/json.html#parse_partial_json
- It's still much better than a silent timeout after only a small piece of the content has been generated
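To illustrate the first point, here is a minimal stdlib-only sketch of how truncated tool-input JSON can be repaired. It is a simplified illustration of the general idea (close any open string and brackets, then back off until the text parses), not the actual `parse_partial_json` implementation from `langchain_core`; the function name `close_partial_json` is hypothetical.

```python
import json


def close_partial_json(text: str):
    """Best-effort parse of truncated JSON. Returns None if nothing is recoverable."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    closers = []        # closing brackets still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()

    prefix = text
    if in_string:
        if escaped:
            prefix = prefix[:-1]  # drop a dangling backslash
        prefix += '"'             # close the unterminated string
    suffix = "".join(reversed(closers))

    # Drop trailing characters until the closed text parses.
    # Quadratic in the worst case, which is fine for a sketch.
    while prefix:
        try:
            return json.loads(prefix + suffix)
        except json.JSONDecodeError:
            prefix = prefix[:-1]
    return None


if __name__ == "__main__":
    print(close_partial_json('{"path": "a.txt", "content": "hel'))
```

In production code, calling `parse_partial_json` from `langchain_core.utils.json` directly would be preferable to maintaining a helper like this.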
If we run the example from #1465 (comment 2753349242) as:
curl -X POST "https://api.anthropic.com/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: extended-cache-ttl-2025-04-11" \
  -H "anthropic-beta: fine-grained-tool-streaming-2025-05-14" \
  -d @my.json
The message is streamed properly.
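For reference, the same request can be sketched in Python with only the standard library. This is an illustration, not the code in this merge request: `build_request` is a hypothetical helper, the payload contents are placeholders, and it assumes the API accepts both beta flags as a comma-separated value in a single `anthropic-beta` header, which under HTTP is equivalent to repeating the header as in the curl call.

```python
import json
import urllib.request

BETA_FLAGS = [
    "extended-cache-ttl-2025-04-11",
    "fine-grained-tool-streaming-2025-05-14",
]


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a Messages API request with both beta flags."""
    return urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            # Assumption: comma-joined flags in one header, not two headers.
            "anthropic-beta": ",".join(BETA_FLAGS),
        },
    )


if __name__ == "__main__":
    # Placeholder payload; the real body comes from my.json.
    req = build_request({"stream": True}, api_key="sk-placeholder")
    print(req.get_method(), req.full_url)
    print(req.header_items())
```

Sending the request with `urllib.request.urlopen(req)` and reading the response line by line would then yield the server-sent events.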
Related to gitlab-org/gitlab#570575 (closed)
Edited by Dylan Griffith