Use Vertex AI proxy endpoints in VertexAI::Client

NOTE: This is a high-priority MR for the deadline.

What does this merge request do and why?

This MR makes VertexAI::Client send its requests through AI Gateway's Vertex AI proxy endpoints. See "AI Gateway ADR 002: Exposing proxy endpoints to AI providers" for an overview of the changes.

This change is behind the use_ai_gateway_proxy feature flag, which is disabled by default.
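
As a rough illustration of what the flag switches, the sketch below picks between a direct Vertex AI URL and the AI Gateway proxy URL. The proxy path prefix `/v1/proxy/vertex-ai` is taken from the access log in this MR. The method name `predict_url`, its parameters, and the direct-call branch are assumptions for illustration only, not the actual VertexAI::Client code.

```ruby
# Hypothetical sketch: not the actual VertexAI::Client implementation.
# Path prefix observed in the AI Gateway access log for this MR.
PROXY_PREFIX = "/v1/proxy/vertex-ai"

# Builds the :predict URL for a model, either through the AI Gateway proxy
# or directly against Vertex AI. All parameter names are assumptions.
def predict_url(use_proxy:, gateway_base:, project:, location:, model:)
  vertex_path = "/v1/projects/#{project}/locations/#{location}" \
                "/publishers/google/models/#{model}:predict"

  if use_proxy
    # Flag enabled: send the unchanged Vertex AI path to the proxy endpoint.
    "#{gateway_base}#{PROXY_PREFIX}#{vertex_path}"
  else
    # Flag disabled (assumed behavior): call the regional Vertex AI host directly.
    "https://#{location}-aiplatform.googleapis.com#{vertex_path}"
  end
end

puts predict_url(use_proxy: true, gateway_base: "http://localhost",
                 project: "PROJECT", location: "LOCATION", model: "codechat-bison")
```

With `use_proxy: true` and the placeholder values above, the result has the same shape as the "url" field in the access log: the Vertex AI path is appended unchanged to the proxy prefix.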

The main goal of these endpoints is to enable the independent AI features in self-managed instances within the proposed timeline. See the issue and this issue for more information.

Screenshots or screen recordings

Test VertexAi::Client from the Rails console:

[6] pry(main)>, unit_primitive: 'explain_vulnerability').chat(content: "Hi, how are you?")
  User Load (0.4ms)  SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT 1
=> {"predictions"=>
  [{"safetyAttributes"=>[{"categories"=>[], "scores"=>[], "blocked"=>false}],
    "candidates"=>[{"content"=>" I'm doing great, thanks for asking! How can I help you today?", "author"=>"1"}]}],
 "metadata"=>{"tokenMetadata"=>{"outputTokenCount"=>{"totalBillableCharacters"=>50, "totalTokens"=>17}, "inputTokenCount"=>{"totalBillableCharacters"=>13, "totalTokens"=>6}}}}

AI Gateway log

Access request log:

    "url": "http://localhost/v1/proxy/vertex-ai/v1/projects/PROJECT/locations/LOCATION/publishers/google/models/codechat-bison:predict",
    "path": "/v1/proxy/vertex-ai/v1/projects/PROJECT/locations/LOCATION/publishers/google/models/codechat-bison%3Apredict",
    "status_code": 200,
    "method": "POST",
    "correlation_id": "34c3281ddd5e013305244f94c24d3166",
    "http_version": "1.1",
    "client_ip": "",
    "client_port": 45488,
    "duration_s": 4.9186735000002955,
    "duration_request": -1,
    "cpu_s": 0.06469550999999996,
    "user_agent": "Ruby",
    "gitlab_instance_id": "ba75b213-4fd4-4311-8631-0ac7a1bd3247",
    "gitlab_global_user_id": "Cv4L37An7TsFzTjzy4yCixBZwUsK8+TCQYl7EYHVN8c=",
    "gitlab_host_name": "gdk.test",
    "gitlab_saas_duo_pro_namespace_ids": null,
    "gitlab_saas_namespace_ids": null,
    "gitlab_realm": "saas",
    "auth_duration_s": 0.7781890819969703,
    "meta.feature_category": "vulnerability_management",
    "meta.unit_primitive": "explain_vulnerability",
    "logger": "api.access",
    "level": "info",
    "type": "mlops",
    "stage": "main",
    "timestamp": "2024-05-16T06:44:23.841030Z",
    "message": " - \"POST /v1/proxy/vertex-ai/v1/projects/PROJECT/locations/LOCATION/publishers/google/models/codechat-bison%3Apredict HTTP/1.1\" 200"

Proxy request log:

    "correlation_id": "34c3281ddd5e013305244f94c24d3166",
    "logger": "httpx",
    "level": "info",
    "type": "mlops",
    "stage": "main",
    "timestamp": "2024-05-16T06:44:23.838178Z",
    "message": "HTTP Request: POST \"HTTP/1.1 200 OK\""
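
When validating locally, one way to confirm that a request really went through the proxy is to compare the two log entries above. The sketch below assumes the AI Gateway logs are available as JSON objects with the field names shown in the excerpts. The helper name `proxied_request_ok?` and the trimmed sample entries are hypothetical.

```ruby
require "json"

# Hypothetical helper: checks that an access log entry and a proxy log entry
# describe the same successful proxied request. Field names come from the
# log excerpts above; the method name and return shape are assumptions.
def proxied_request_ok?(access_entry, proxy_entry, unit_primitive:)
  access_entry["status_code"] == 200 &&
    access_entry["path"].start_with?("/v1/proxy/vertex-ai/") &&
    access_entry["meta.unit_primitive"] == unit_primitive &&
    access_entry["correlation_id"] == proxy_entry["correlation_id"]
end

# Trimmed copies of the two entries shown above, written as JSON objects.
access = JSON.parse(<<~JSON)
  {"path": "/v1/proxy/vertex-ai/v1/projects/PROJECT/locations/LOCATION/publishers/google/models/codechat-bison%3Apredict",
   "status_code": 200,
   "correlation_id": "34c3281ddd5e013305244f94c24d3166",
   "meta.unit_primitive": "explain_vulnerability",
   "logger": "api.access"}
JSON
proxy = JSON.parse(<<~JSON)
  {"correlation_id": "34c3281ddd5e013305244f94c24d3166", "logger": "httpx"}
JSON

puts proxied_request_ok?(access, proxy, unit_primitive: "explain_vulnerability")
```

The shared correlation_id is what ties the outgoing httpx request to the incoming access log entry, so a mismatch there suggests the two lines belong to different requests.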

How to set up and validate locally

  1. Check out "feat: use OR expression for the required scope ..." (gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!814 - merged) in AI Gateway.
  2. Check out this MR in GitLab-Rails.
  3. Enable the feature flag in the Rails console: ::Feature.enable(:use_ai_gateway_proxy).
  4. Follow the "Optional: Test with OIDC authentication" section.
