Draft: feat: poc for routing prompts based on percentage
## What does this merge request do and why?
This MR demonstrates the following functionality:
- Reads `config.yml` from the root of the project. The file is expected to be of this format:

  ```yaml
  ---
  routers:
    default:
      type: percentage
      models:
        - model: claude-3-5-sonnet@20240620
          provider: anthropic
          router_params:
            percentage: 40
        - model: vertex_ai/claude-3-5-sonnet@20240620
          provider: litellm
          router_params:
            percentage: 60
  ```
- This config defines a router that routes either to the `claude-3-5-sonnet@20240620` model of the Anthropic API or to `vertex_ai/claude-3-5-sonnet@20240620` via LiteLLM, based on chance: 40% of traffic goes to the former, 60% to the latter (see the sketch below).
It's an example of a solution that could be applied to "Implement a multi-provider strategy to reduce d..." (gitlab-org&14873).
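
For illustration, here's a minimal sketch of how reading the config and the percentage-based selection could work. It assumes PyYAML and the config format above; the function and variable names are hypothetical, not the MR's actual code.

```python
import random
from pathlib import Path

import yaml  # PyYAML


def load_routers(path: str = "config.yml") -> dict:
    """Read the router config from the project root."""
    return yaml.safe_load(Path(path).read_text())["routers"]


def select_model(router: dict) -> dict:
    """For a `percentage` router, pick a model entry with probability
    proportional to its `router_params.percentage`."""
    models = router["models"]
    weights = [m["router_params"]["percentage"] for m in models]
    return random.choices(models, weights=weights, k=1)[0]


if __name__ == "__main__":
    chosen = select_model(load_routers()["default"])
    # With the config above: ~40% anthropic, ~60% litellm
    print(chosen["provider"], chosen["model"])
```

Weighted random selection is stateless, so the 40/60 split holds only in aggregate over many requests.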
## Verify
- Run the AI Gateway and visit http://localhost:5052/docs
- Perform a request to the `v1/prompts` endpoint with the prompt ID `chat/explain_code` and the following params (see the example request after this list):
  ```json
  {
    "inputs": {
      "input": "Please explain this Code",
      "selected_text": "console.log('hello');",
      "file_content": "console.log('hello');",
      "language_info": "JavaScript"
    },
    "stream": false
  }
  ```
- View the logs. Sometimes the output is:

  ```plaintext
  <dependency_injector.providers.Factory(<dependency_injector.providers.Dependency(<class 'object'>) at 0x174700d00, container name: "ContainerApplication.pkg_prompts.models.lite_llm_chat_fn">) at 0x174701120>
  vertex_ai/claude-3-5-sonnet@20240620
  ```
  And sometimes it's:

  ```plaintext
  <dependency_injector.providers.Factory(<dependency_injector.providers.Dependency(<class 'object'>) at 0x174700c40, container name: "ContainerApplication.pkg_prompts.models.anthropic_claude_chat_fn">) at 0x174700be0>
  claude-3-5-sonnet@20240620
  ```

  Over many requests, roughly 60% of them should log the LiteLLM factory and 40% the Anthropic one, matching the configured percentages.
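
As a convenience, the request above can be issued from a script. This is a sketch that assumes the prompt ID is appended to the `v1/prompts` path and that no extra auth headers are needed locally; check http://localhost:5052/docs for the exact contract.

```python
# A hedged example request; the exact path and auth requirements may
# differ in your local setup (verify against /docs).
import requests

payload = {
    "inputs": {
        "input": "Please explain this Code",
        "selected_text": "console.log('hello');",
        "file_content": "console.log('hello');",
        "language_info": "JavaScript",
    },
    "stream": False,
}

resp = requests.post(
    "http://localhost:5052/v1/prompts/chat/explain_code",  # assumed route shape
    json=payload,
)
print(resp.status_code, resp.text)
```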