Enable Tool Calling for gpt-oss-120b

Origin of issue

  • Jetstream2 user support ticket ATS-20529
  • John Chen has also requested this feature in the inference-service channel.

Problem to be addressed or Feature to be implemented

Enabling tool calling for Llama 4 Scout opened new possibilities for our users. Adding gpt-oss-120b to our inference service expands those possibilities further: it is a flexible reasoning model that, when tool calling is enabled, can decide for itself when to use (or not use) custom tools to accomplish a task.

The vLLM documentation describes how to enable tool calling for gpt-oss-120b.

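Putting those engine arguments together, the relaunch might look like the following sketch; the model identifier `openai/gpt-oss-120b` and port `8000` are illustrative assumptions, not our deployment's actual values.

```shell
# Hypothetical relaunch of the inference server with tool calling enabled.
# Model path and port are placeholders; substitute the deployment's values.
vllm serve openai/gpt-oss-120b \
  --port 8000 \
  --tool-call-parser openai \
  --enable-auto-tool-choice
```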
Checklist of tasks to resolve the issue

  • Attempt to restart vLLM / gpt-oss-120b with the following engine arguments
    • --tool-call-parser openai
    • --enable-auto-tool-choice
  • If the server starts, test a tool call prompt.
  • Update inference service documentation indicating how to use the API with tool/function calling.

If this works, the issue is complete (for the moment). If it doesn't, consider an alternative approach.
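The tool-call test prompt in the checklist could be scripted against vLLM's OpenAI-compatible endpoint. This is a minimal sketch, assuming the server runs at `http://localhost:8000` and offering a hypothetical `get_weather` tool; the endpoint URL, model name, and tool are illustrative assumptions.

```python
# Hypothetical smoke test: offer one tool and check whether the model calls it.
# The endpoint URL, model name, and get_weather tool schema are assumptions.
import json
import urllib.request

# Tool schema the model may choose to invoke when auto tool choice is enabled.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def build_payload(prompt: str) -> dict:
    """Build a chat-completions request body that offers the tool."""
    return {
        "model": "openai/gpt-oss-120b",
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
    }

def send(payload: dict,
         url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    """POST the request body to the vLLM OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    result = send(build_payload("What's the weather in Chicago right now?"))
    # With --enable-auto-tool-choice, we expect a tool_calls entry here
    # naming get_weather with a JSON-encoded {"city": ...} argument.
    print(result["choices"][0]["message"].get("tool_calls"))
```

If the response's `tool_calls` is populated (rather than a plain text answer), tool calling is working end to end.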

Edited by Chris Martin