Enable Tool Calling for gpt-oss-120b
Origin of issue
- Jetstream2 user support ticket ATS-20529
- John Chen has also requested this feature in the inference-service channel.
Problem to be addressed or Feature to be implemented
Enabling tool calling for Llama 4 Scout opened new possibilities for our users. With the addition of gpt-oss-120b to our inference service, we can further expand what users can do: it is a flexible reasoning model that, when tool calling is enabled, can better decide when to use (or not use) custom tools to accomplish a task.
The vLLM documentation describes how to enable tool-calling functionality for gpt-oss-120b.
Checklist of tasks to resolve the issue
- Attempt to restart vLLM / gpt-oss-120b with the following engine arguments: `--tool-call-parser openai --enable-auto-tool-choice`
- If the server starts, test a tool-call prompt.
- Update the inference service documentation to indicate how to use the API with tool/function calling.
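The restart step above might look like the following launch command. This is a sketch, not the service's actual deployment configuration: the model path `openai/gpt-oss-120b` is an assumption, and only the two tool-calling flags come from this ticket.

```shell
# Hypothetical vLLM launch command; the model path is an assumption.
# The two flags below enable automatic tool choice and select the
# "openai" tool-call parser for gpt-oss models.
vllm serve openai/gpt-oss-120b \
  --enable-auto-tool-choice \
  --tool-call-parser openai
```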
If this works, the issue is complete and we are done (for the moment). If it doesn't work, consider an alternative approach.
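For the testing and documentation steps, a minimal sketch of the request body a user would POST to the service's OpenAI-compatible `/v1/chat/completions` endpoint once tool calling is enabled. The tool name, schema, and model identifier here are illustrative assumptions, not part of the service.

```python
import json

def build_tool_call_request(prompt: str) -> dict:
    """Build a chat-completions payload that offers the model one
    (hypothetical) tool and lets it decide whether to call it."""
    return {
        "model": "openai/gpt-oss-120b",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # illustrative example tool
                    "description": "Get the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        # "auto" lets the model choose whether to invoke the tool.
        "tool_choice": "auto",
    }

payload = build_tool_call_request("What's the weather in Tucson right now?")
print(json.dumps(payload, indent=2))
```

If the flags are working, the response's message should contain a `tool_calls` entry naming the offered function instead of (or alongside) plain text content.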
Edited by Chris Martin