
Draft: Use Anthropic & Vertex AI proxy endpoints

Shinya Maeda requested to merge anthropic-vertexai-ai-gateway-support into master

What does this MR do and why?

This MR demonstrates how to use the Anthropic and Vertex AI proxy endpoints introduced in AI Gateway ADR 002: Exposing proxy endpoints to AI providers. It also contains a documentation update on how to test with OIDC.

This MR requires feat: use OR expression for the required scope ... (gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!814 - merged) in AI Gateway.

Ref: Refactor current Anthropic and Vertex client (#458207 - closed)
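
For orientation, below is a minimal sketch of the kind of request the updated clients send to the AI Gateway proxy instead of calling the providers directly. The proxy path, header names, and the AI_GATEWAY_URL / AI_GATEWAY_TOKEN environment variables are assumptions for illustration only; the actual clients resolve the endpoint and obtain the OIDC token internally (see the console examples below).

require 'net/http'
require 'json'
require 'uri'

# Assumed local AI Gateway address; the real clients resolve this internally.
ai_gateway_url = ENV.fetch('AI_GATEWAY_URL', 'http://localhost:5052')
# Assumed proxy path for the Anthropic completions API, following the ADR 002 naming.
uri = URI("#{ai_gateway_url}/v1/proxy/anthropic/v1/complete")

request = Net::HTTP::Post.new(uri)
request['Content-Type'] = 'application/json'
# Placeholder for the OIDC token the GitLab instance obtains for the gateway.
request['Authorization'] = "Bearer #{ENV['AI_GATEWAY_TOKEN']}"
# Hypothetical header carrying the unit primitive used for the scope check.
request['X-Gitlab-Unit-Primitive'] = 'explain_vulnerability'
request.body = {
  prompt: "\n\nHuman: What is a security vulnerability?\n\nAssistant:",
  model: 'claude-2.1',
  max_tokens_to_sample: 1024
}.to_json

response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
  http.request(request)
end

puts JSON.parse(response.body)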

MR acceptance checklist

Anthropic Example:

[1] pry(main)> Gitlab::Llm::Anthropic::Client.new(User.first, unit_primitive: 'explain_vulnerability').complete(prompt: "\n\nHuman: What is a security vulnerability?\n\nAssistant:")
=> {"type"=>"completion",
 "id"=>"compl_01STe9ZxKyVAkieJfrNiQZsC",
 "completion"=>
  " A security vulnerability is a weakness in a system that allows an attacker to violate the system's intended security policy. Some examples of security vulnerabilities include:\n\n- Buffer overflows - This allows attackers to overwrite memory areas, allowing them to execute malicious code or crash the system.\n\n- SQL injection - This allows attackers to insert malicious SQL code into input fields for execution, allowing them to access unauthorized data.\n\n- Cross-site scripting (XSS) - This allows attackers to inject client-side scripts into web pages viewed by other users, letting them bypass access controls.\n\n- Missing access controls - This allows attackers access to resources or functionality that should be restricted.\n\n- Weak passwords - Easy to guess passwords allow attackers access to systems with a valid username and password. \n\n- Unpatched software - Running unpatched or outdated software that has publicly known vulnerabilities provides an opening for attackers.\n\nThe common factor is that vulnerabilities allow attackers to compromise confidentiality, integrity, or availability in a way that undermines security policies - whether to gain unauthorized access, escalate user privileges, deny service, or expose private information. Identifying and remediating vulnerabilities is crucial for good security practices.",
 "stop_reason"=>"stop_sequence",
 "model"=>"claude-2.1",
 "stop"=>"\n\nHuman:",
 "log_id"=>"compl_01STe9ZxKyVAkieJfrNiQZsC"}

Vertex AI Example:

[1] pry(main)> Gitlab::Llm::VertexAi::Client.new(User.first, unit_primitive: 'explain_vulnerability').chat(content: "Hi, how are you?")
=> {"predictions"=>
  [{"candidates"=>[{"content"=>" I'm doing great, thanks for asking! How can I help you today?", "author"=>"1"}],
    "groundingMetadata"=>[{}],
    "safetyAttributes"=>[{"scores"=>[], "blocked"=>false, "categories"=>[]}],
    "citationMetadata"=>[{"citations"=>[]}]}],
 "metadata"=>{"tokenMetadata"=>{"inputTokenCount"=>{"totalBillableCharacters"=>13, "totalTokens"=>6}, "outputTokenCount"=>{"totalBillableCharacters"=>50, "totalTokens"=>17}}}}

[3] pry(main)> Gitlab::Llm::VertexAi::Client.new(User.first, unit_primitive: 'explain_vulnerability').messages_chat(content: [{"author": "user", "content": "Hi, how are you?"}])
=> {"predictions"=>
 [{"safetyAttributes"=>
    [{"blocked"=>false,
      "scores"=>[0.1, 0.3],
      "safetyRatings"=>
       [{"probabilityScore"=>0.2, "severityScore"=>0.1, "category"=>"Dangerous Content", "severity"=>"NEGLIGIBLE"},
        {"probabilityScore"=>0.1, "category"=>"Harassment", "severity"=>"NEGLIGIBLE", "severityScore"=>0},
        {"severityScore"=>0, "probabilityScore"=>0, "severity"=>"NEGLIGIBLE", "category"=>"Hate Speech"},
        {"category"=>"Sexually Explicit", "severityScore"=>0.1, "severity"=>"NEGLIGIBLE", "probabilityScore"=>0.3}],
      "categories"=>["Insult", "Sexual"]}],
   "groundingMetadata"=>[{}],
   "citationMetadata"=>[{"citations"=>[]}],
   "candidates"=>[{"content"=>" I'm doing great, thanks for asking! How about you?", "author"=>"1"}]}],
"metadata"=>{"tokenMetadata"=>{"outputTokenCount"=>{"totalBillableCharacters"=>42, "totalTokens"=>14}, "inputTokenCount"=>{"totalBillableCharacters"=>13, "totalTokens"=>6}}}}


[4] pry(main)> Gitlab::Llm::VertexAi::Client.new(User.first, unit_primitive: 'explain_vulnerability').text(content: "Hi, how are you?")
=> {"predictions"=>
  [{"content"=>" As an AI language model, I don't have personal feelings or emotions, but I'm here to assist you with any questions or information you may need. How can I help you today?",
    "citationMetadata"=>{"citations"=>[]},
    "safetyAttributes"=>
     {"scores"=>[0.1, 0.2, 0.2, 0.1],
      "categories"=>["Finance", "Health", "Religion & Belief", "Sexual"],
      "blocked"=>false,
      "safetyRatings"=>
       [{"severityScore"=>0, "probabilityScore"=>0, "category"=>"Dangerous Content", "severity"=>"NEGLIGIBLE"},
        {"severityScore"=>0, "category"=>"Harassment", "severity"=>"NEGLIGIBLE", "probabilityScore"=>0},
        {"severity"=>"NEGLIGIBLE", "probabilityScore"=>0, "category"=>"Hate Speech", "severityScore"=>0},
        {"severityScore"=>0.1, "probabilityScore"=>0.1, "category"=>"Sexually Explicit", "severity"=>"NEGLIGIBLE"}]}}],
 "metadata"=>{"tokenMetadata"=>{"outputTokenCount"=>{"totalBillableCharacters"=>138, "totalTokens"=>40}, "inputTokenCount"=>{"totalBillableCharacters"=>13, "totalTokens"=>6}}}}

[5] pry(main)> Gitlab::Llm::VertexAi::Client.new(User.first, unit_primitive: 'explain_vulnerability').code(content: "print('Hello")
=> {"predictions"=>[{"citationMetadata"=>{"citations"=>[]}, "safetyAttributes"=>{"categories"=>[], "scores"=>[], "blocked"=>false}, "content"=>" world!')", "score"=>-2.214948892593384}],
"metadata"=>{"tokenMetadata"=>{"outputTokenCount"=>{"totalBillableCharacters"=>8, "totalTokens"=>2}, "inputTokenCount"=>{"totalBillableCharacters"=>12, "totalTokens"=>3}}}}

[6] pry(main)> Gitlab::Llm::VertexAi::Client.new(User.first, unit_primitive: 'explain_vulnerability').code_completion(content: {"prefix": "print('Hello", "suffix": ""})
=> {"predictions"=>[{"safetyAttributes"=>{"blocked"=>false, "scores"=>[], "categories"=>[]}, "content"=>" World !')", "score"=>-3.305152654647827, "citationMetadata"=>{"citations"=>[]}}],
 "metadata"=>{"tokenMetadata"=>{"outputTokenCount"=>{"totalBillableCharacters"=>8, "totalTokens"=>6}, "inputTokenCount"=>{"totalBillableCharacters"=>12, "totalTokens"=>3}}}}

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Before After

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.
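
A possible flow, assuming a local AI Gateway running through GDK on its default port (variable names, port, and commands below are assumptions; adjust to your setup):

1. Start GDK and a local AI Gateway instance (see the AI Gateway development setup docs), for example with gdk start.
2. Point the GitLab instance at the local gateway before starting Rails, for example export AI_GATEWAY_URL=http://localhost:5052.
3. Open a Rails console from the gitlab directory: bundle exec rails console.
4. Run the Anthropic and Vertex AI client calls shown in the examples above and confirm that the responses come back through the proxy endpoints rather than directly from the providers.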

