Draft: feat: poc for routing prompts based on percentage
## What does this merge request do and why?
This MR demonstrates the following functionality:
- Reads `config.yml` from the root of the project. The file is expected to be of this format:

  ```yaml
  ---
  routers:
    default:
      type: percentage
      models:
        - model: claude-3-5-sonnet@20240620
          provider: anthropic
          router_params:
            percentage: 40
        - model: vertex_ai/claude-3-5-sonnet@20240620
          provider: litellm
          router_params:
            percentage: 60
  ```
- This config defines a router that routes either to the `claude-3-5-sonnet@20240620` model of the Anthropic API or to `vertex_ai/claude-3-5-sonnet@20240620` via LiteLLM, based on chance: 40% of traffic goes to the former, 60% to the latter (see the sketch below).
It's an example of a solution that could be applied to "Implement a multi-provider strategy to reduce d..." (gitlab-org&14873).
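
For illustration, here's a minimal sketch of how reading the config and the percentage-based selection could work. It assumes PyYAML and the config format above; the function and variable names are hypothetical, not the MR's actual code.

```python
import random
from pathlib import Path

import yaml  # PyYAML


def load_routers(path: str = "config.yml") -> dict:
    """Read the router config from the project root."""
    return yaml.safe_load(Path(path).read_text())["routers"]


def select_model(router: dict) -> dict:
    """For a `percentage` router, pick a model entry with probability
    proportional to its `router_params.percentage`."""
    models = router["models"]
    weights = [m["router_params"]["percentage"] for m in models]
    return random.choices(models, weights=weights, k=1)[0]


if __name__ == "__main__":
    chosen = select_model(load_routers()["default"])
    # With the config above: ~40% anthropic, ~60% litellm
    print(chosen["provider"], chosen["model"])
```

Weighted random selection is stateless, so the 40/60 split holds only in aggregate over many requests.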
## Verify
- Run the AI Gateway and visit http://localhost:5052/docs
- Perform a request to the `v1/prompts` endpoint with the prompt ID `chat/explain_code` and the following params (see the example request after this list):
  ```json
  {
    "inputs": {
      "input": "Please explain this Code",
      "selected_text": "console.log('hello');",
      "file_content": "console.log('hello');",
      "language_info": "JavaScript"
    },
    "stream": false
  }
  ```
- View the logs. Sometimes the output is:

  ```plaintext
  <dependency_injector.providers.Factory(<dependency_injector.providers.Dependency(<class 'object'>) at 0x174700d00, container name: "ContainerApplication.pkg_prompts.models.lite_llm_chat_fn">) at 0x174701120>
  vertex_ai/claude-3-5-sonnet@20240620
  ```
  And sometimes it's:

  ```plaintext
  <dependency_injector.providers.Factory(<dependency_injector.providers.Dependency(<class 'object'>) at 0x174700c40, container name: "ContainerApplication.pkg_prompts.models.anthropic_claude_chat_fn">) at 0x174700be0>
  claude-3-5-sonnet@20240620
  ```

  Over many requests, roughly 60% of them should log the LiteLLM factory and 40% the Anthropic one, matching the configured percentages.
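
As a convenience, the request above can be issued from a script. This is a sketch that assumes the prompt ID is appended to the `v1/prompts` path and that no extra auth headers are needed locally; check http://localhost:5052/docs for the exact contract.

```python
# A hedged example request; the exact path and auth requirements may
# differ in your local setup (verify against /docs).
import requests

payload = {
    "inputs": {
        "input": "Please explain this Code",
        "selected_text": "console.log('hello');",
        "file_content": "console.log('hello');",
        "language_info": "JavaScript",
    },
    "stream": False,
}

resp = requests.post(
    "http://localhost:5052/v1/prompts/chat/explain_code",  # assumed route shape
    json=payload,
)
print(resp.status_code, resp.text)
```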