Execute chat requests via new endpoint in AI Gateway
## Problem
In Implement a ReAct prompt on AI Gateway (#452204 - closed) we're implementing a new AI Gateway endpoint for sending chat questions. It's needed so we can filter chat tools on the Gateway. With that, we no longer need to assemble the chat prompt in the Rails monolith, and we can simplify the code.
## Proposal
- Create a new `SingleActionExecutor` that will send requests to the LLM via a new chat Gateway endpoint. This executor should implement the same decision loop as `ZeroShot::Executor`.
- The current `AiGateway::Client` has a hardcoded chat endpoint that should be changed so the new endpoint can be targeted (see the sketch after this list).
- The new executor should be behind a feature flag, so we can release it incrementally.
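One possible direction for the `AiGateway::Client` change is to make the endpoint a parameter instead of a constant baked into the request path. This is only a sketch; the constant names, paths, and `complete` signature below are illustrative, not the existing client API:

```ruby
# Sketch only: names and endpoint paths are illustrative, not the real client API.
module Gitlab
  module Llm
    module AiGateway
      class Client
        # Assumed paths: the currently hardcoded chat endpoint and the new ReAct chat endpoint.
        CHAT_ENDPOINT = '/v1/chat/completions'
        REACT_CHAT_ENDPOINT = '/v1/chat/agent'

        # Accept the endpoint as an argument so SingleActionExecutor can target
        # the new Gateway endpoint while ZeroShot::Executor keeps using the old one.
        def complete(prompt:, endpoint: CHAT_ENDPOINT, **options)
          # Build and send the HTTP request to the selected endpoint (omitted here).
        end
      end
    end
  end
end
```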
How it should look:
```ruby
# ee/lib/gitlab/llm/completions/chat.rb
if Feature.enabled?(:chat_react_prompt_on_gateway_experiment)
  Gitlab::Llm::Chain::Agents::SingleAction::Executor.new(
    user_input: prompt_message.content,
    context: context,
    response_handler: response_handler,
    stream_response_handler: stream_response_handler
  ).execute
else
  Gitlab::Llm::Chain::Agents::ZeroShot::Executor.new(
    user_input: prompt_message.content,
    tools: tools,
    context: context,
    response_handler: response_handler,
    stream_response_handler: stream_response_handler
  ).execute
end
```
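The flag gating shown above can then be rolled out incrementally (for example via ChatOps or percentage rollout). A minimal Rails console sketch using the flag name from the check above:

```ruby
# Rails console sketch for the rollout, using the flag name from the check above.
Feature.enable(:chat_react_prompt_on_gateway_experiment)   # switch chat to the new executor
Feature.disable(:chat_react_prompt_on_gateway_experiment)  # roll back to ZeroShot::Executor
```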
```ruby
# single_action_executor
def execute
  MAX_ITERATIONS.times do
    # Ask the LLM (via the new Gateway chat endpoint) for the next step.
    request = execute_streamed_request

    # The Gateway either returns a final answer or a thought/action pair.
    return request[:final_answer] if request[:final_answer]

    answer = Answer.from(thought: request[:thought], action: request[:action], tools: tools, context: context)

    options[:agent_scratchpad] << "\nThought: #{answer.suggestions}"
    options[:agent_scratchpad] << answer.content.to_s

    tool_class = answer.tool
    picked_tool_action(tool_class)

    tool = tool_class.new(
      context: context,
      options: {
        input: user_input,
        suggestions: options[:agent_scratchpad]
      },
      stream_response_handler: stream_response_handler
    )

    tool_answer = tool.execute

    return tool_answer if tool_answer.is_final?

    # Feed the tool's observation back into the scratchpad for the next iteration.
    options[:agent_scratchpad] << "Observation: #{tool_answer.content}\n"
  end

  Answer.default_final_answer(context: context)
end
```
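The `execute_streamed_request` call above is where the new Gateway endpoint comes in. A minimal sketch, assuming a hypothetical `ai_client.stream_chat` helper and response shape; the real endpoint, client method, and payload are being defined in #452204 and are not confirmed here:

```ruby
# Hypothetical sketch only: the client method, payload keys, and response shape
# are illustrative, not the actual AI Gateway contract.
def execute_streamed_request
  response = ai_client.stream_chat( # assumed helper that calls the new chat endpoint
    question: user_input,
    agent_scratchpad: options[:agent_scratchpad],
    unavailable_resources: unavailable_resources # assumption: lets the Gateway filter chat tools
  ) do |chunk|
    # Stream partial output back to the user as it arrives.
    stream_response_handler&.execute(response: chunk)
  end

  # Assumed response shape: either a final answer, or a thought/action pair
  # that the decision loop turns into a tool invocation.
  {
    final_answer: response[:final_answer],
    thought: response[:thought],
    action: response[:action]
  }
end
```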
Note: there are a lot of opportunities for refactoring and reusing the code.