Pass Anthropic parameters to AI Gateway

Notice: This MR is built on top of Ai Gateway client for Duo Chat (!138274 - merged)

What does this MR do and why?

This is a follow-up to Ai Gateway client for Duo Chat (!138274 - merged) and Build a client for AI Gateway to connect duo chat (#431563 - closed).

In Llm::Chain::Requests::Anthropic, we pass the stop_sequences and temperature parameters to Anthropic. This MR sends the same parameters to the AI Gateway through the AI Gateway client.
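
For reference, these are the Anthropic sampling parameters in question, shown as an option hash (a hedged summary; the values are illustrative, and the key names match what the tests below send):

# Parameters forwarded to the AI Gateway (illustrative values)
options = {
  model: 'claude-instant-1',        # Anthropic model to use
  temperature: 0.5,                 # sampling randomness, 0.0-1.0
  max_tokens_to_sample: 1024,       # upper bound on generated tokens
  stop_sequences: ["\n\nHuman"]     # strings that end generation early
}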

This is a high-priority MR for https://gitlab.com/groups/gitlab-org/-/epics/10585+ and Supporting GitLab Duo (chat) for SM and Dedicated (&11251 - closed).

AI Gateway counterparts:

Support Anthropic params in v1 chat agent API (gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!495 - merged)
Support more Anthropic models in v1/chat/agent ... (gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!499 - merged)

Closes Pass `stop_sequences` to AI Gateway `v1/agent/c... (#434925 - closed)

Screenshots or screen recordings

Test 1: with Gitlab::Llm::AiGateway::Client

Send a request to the AI Gateway from the GitLab Rails console:

user = User.first
options = { model: 'claude-instant-1', temperature: 0.5, max_tokens_to_sample: 1024, stop_sequences: ["\n\nHuman", "Test:"] }
Gitlab::Llm::AiGateway::Client.new(user).stream(prompt: "\n\nHuman: Can you sing a song?\n\nAssistant:", **options)

Confirm that the model, max_tokens_to_sample, stop_sequences, and temperature params are passed to the Anthropic client in the AI Gateway:

{
    "correlation_id": "25849b398c6e402f80f9ec2d6082c932",
    "logger": "anthropic._base_client",
    "level": "debug",
    "type": "mlops",
    "stage": "main",
    "timestamp": "2023-12-13T05:16:10.106478Z",
    "message": "Request options: {'method': 'post', 'url': '/v1/complete', 'timeout': Timeout(connect=5.0, read=30.0, write=30.0, pool=30.0), 'files': None, 'json_data': {'max_tokens_to_sample': 1024, 'model': 'claude-instant-1', 'prompt': '\\n\\nHuman: Can you sing a song?\\n\\nAssistant:', 'stop_sequences': ['\\n\\nHuman', 'Test:'], 'stream': True, 'temperature': 0.5}}"
}
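
As a usage note, the streamed response can also be consumed chunk by chunk. A minimal sketch, assuming stream yields response fragments to a block (the block form is an assumption, not shown in this MR):

# Assumption: #stream yields each completion fragment to the block.
Gitlab::Llm::AiGateway::Client.new(user).stream(prompt: "\n\nHuman: Can you sing a song?\n\nAssistant:", **options) do |chunk|
  print chunk
end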

Test 2: with Gitlab::Llm::Chain::Requests::AiGateway

Send a request to the AI Gateway via Gitlab::Llm::Chain::Requests::AiGateway:

user = User.first
prompt = { prompt: "\n\nHuman: Hi, How are you?\n\nAssistant:", options: { model: 'claude-instant-1.1', temperature: 0.6, max_tokens_to_sample: 1024, stop_sequences: ["\n\nHuman", "Hoge:"] } }
Gitlab::Llm::Chain::Requests::AiGateway.new(user).request(prompt)

Confirm that the model, max_tokens_to_sample, stop_sequences, and temperature params are passed to the Anthropic client in the AI Gateway:

{
    "correlation_id": "0b202483fca94043b867c51f7365c5c5",
    "logger": "anthropic._base_client",
    "level": "debug",
    "type": "mlops",
    "stage": "main",
    "timestamp": "2023-12-13T05:40:07.159732Z",
    "message": "Request options: {'method': 'post', 'url': '/v1/complete', 'timeout': Timeout(connect=5.0, read=30.0, write=30.0, pool=30.0), 'files': None, 'json_data': {'max_tokens_to_sample': 1024, 'model': 'claude-instant-1.1', 'prompt': '\\n\\nHuman: Hi, How are you?\\n\\nAssistant:', 'stop_sequences': ['\\n\\nHuman', 'Hoge:'], 'stream': True, 'temperature': 0.6}}"
}
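
Note that the explicit options here override the defaults used in Test 3 below. A hypothetical sketch of how request could merge the two before calling the client (ai_client and default_options are made-up names; the real implementation may differ):

# Hypothetical sketch only; not the actual implementation.
def request(prompt)
  ai_client.stream(
    prompt: prompt[:prompt],
    **default_options.merge(prompt[:options] || {})
  )
end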

Test 3: with Gitlab::Llm::Chain::Requests::AiGateway (Default parameters)

Send a request to the AI Gateway via Gitlab::Llm::Chain::Requests::AiGateway, without passing any options:

user = User.first
prompt = { prompt: "\n\nHuman: Hi, How are you?\n\nAssistant:" }
Gitlab::Llm::Chain::Requests::AiGateway.new(user).request(prompt)

Confirm that the default parameters are passed to the Anthropic client in the AI Gateway:

{
    "correlation_id": "5a2baf1cafd844ee83c9e21a7d4bc780",
    "logger": "anthropic._base_client",
    "level": "debug",
    "type": "mlops",
    "stage": "main",
    "timestamp": "2023-12-13T05:42:57.334905Z",
    "message": "Request options: {'method': 'post', 'url': '/v1/complete', 'timeout': Timeout(connect=5.0, read=30.0, write=30.0, pool=30.0), 'files': None, 'json_data': {'max_tokens_to_sample': 2048, 'model': 'claude-2.0', 'prompt': '\\n\\nHuman: Hi, How are you?\\n\\nAssistant:', 'stop_sequences': ['\\n\\nHuman', 'Observation:'], 'stream': True, 'temperature': 0.1}}"
}
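
The defaults visible in this log could live in a single constant. A hypothetical sketch with the values taken from the log above (the constant name is made up):

# Hypothetical constant; values match the debug log above.
DEFAULT_OPTIONS = {
  model: 'claude-2.0',
  temperature: 0.1,
  max_tokens_to_sample: 2048,
  stop_sequences: ["\n\nHuman", "Observation:"]
}.freeze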

How to set up and validate locally

1. Run the AI Gateway locally with debug logging enabled.
2. From a GitLab Rails console, run one of the snippets in the test scenarios above.
3. In the AI Gateway logs, confirm that the anthropic._base_client debug entry contains the model, max_tokens_to_sample, stop_sequences, and temperature values you passed (or the defaults when no options are given).

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
