Enable Tool Calling for gpt-oss-120b

Origin of issue

  • Jetstream2 user support ticket ATS-20529
  • John Chen has also requested this feature in the inference-service channel.

Problem to be addressed or Feature to be implemented

Enabling tool calling for Llama 4 Scout opened new possibilities for our users. Adding gpt-oss-120b to our inference service expands those possibilities further: it is a flexible reasoning model that, when tool calling is enabled, can decide for itself when to use (or not use) custom tools to accomplish a task.

The vLLM documentation describes how to enable tool calling for gpt-oss-120b.

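Putting those engine arguments together, the relaunch might look like the following sketch; the model identifier `openai/gpt-oss-120b` and port `8000` are illustrative assumptions, not our deployment's actual values.

```shell
# Hypothetical relaunch of the inference server with tool calling enabled.
# Model path and port are placeholders; substitute the deployment's values.
vllm serve openai/gpt-oss-120b \
  --port 8000 \
  --tool-call-parser openai \
  --enable-auto-tool-choice
```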
Checklist of tasks to resolve the issue

  • Attempt to restart vLLM / gpt-oss-120b with the following engine arguments
    • --tool-call-parser openai
    • --enable-auto-tool-choice
  • If the server starts, test a tool call prompt.
  • Update inference service documentation indicating how to use the API with tool/function calling.

If this works, the issue is complete (for the moment). If it doesn't, consider an alternative approach.
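The tool-call test prompt in the checklist could be scripted against vLLM's OpenAI-compatible endpoint. This is a minimal sketch, assuming the server runs at `http://localhost:8000` and offering a hypothetical `get_weather` tool; the endpoint URL, model name, and tool are illustrative assumptions.

```python
# Hypothetical smoke test: offer one tool and check whether the model calls it.
# The endpoint URL, model name, and get_weather tool schema are assumptions.
import json
import urllib.request

# Tool schema the model may choose to invoke when auto tool choice is enabled.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def build_payload(prompt: str) -> dict:
    """Build a chat-completions request body that offers the tool."""
    return {
        "model": "openai/gpt-oss-120b",
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
    }

def send(payload: dict,
         url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    """POST the request body to the vLLM OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    result = send(build_payload("What's the weather in Chicago right now?"))
    # With --enable-auto-tool-choice, we expect a tool_calls entry here
    # naming get_weather with a JSON-encoded {"city": ...} argument.
    print(result["choices"][0]["message"].get("tool_calls"))
```

If the response's `tool_calls` is populated (rather than a plain text answer), tool calling is working end to end.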

Edited by Chris Martin