
Log all Vertex AI API errors

Tan Le requested to merge 314-track-vertex-ai-error-status-codes into main

What does this merge request do and why?

Before this change, we only captured 400 Bad Request and 500 Internal Server Error responses from the Vertex AI API under the model_exception.* key in the log. Because other error codes, such as 429 Too Many Requests, were not captured explicitly, they were propagated as 500 Internal Server Error from the AI Gateway. As a result:

  • Troubleshooting is hard, because we need to extract the status code from exception.message.
  • The log is verbose, with stack traces raised deep in the Google API Core library.
  • End users see 500 Internal Server Error on all unhandled third-party requests.
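
To illustrate the first point: when the upstream status code only appears as a prefix of exception.message, anyone reading the logs has to parse it out. The following is a minimal sketch of that workaround, assuming the message starts with a three-digit status code when one exists. The helper name status_code_from_message is hypothetical and is not part of the AI Gateway.

```python
import re
from typing import Optional

# Matches a leading three-digit HTTP status code,
# e.g. "503 DNS resolution failed for ...".
_STATUS_PREFIX = re.compile(r"^(\d{3})\b")


def status_code_from_message(message: str) -> Optional[int]:
    """Return the leading HTTP status code in a message, or None if absent.

    Hypothetical helper, for illustration only.
    """
    match = _STATUS_PREFIX.match(message.strip())
    return int(match.group(1)) if match else None
```

Errors without an HTTP status code yield None here, which is one reason a dedicated log key is easier to query than free-form message text.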

This MR makes code completions and generations return 200 with an empty response, while logging all Vertex AI API errors under the model_exception.* key in the log. These errors include:

  • Errors that have HTTP status codes, such as 400, 503 (code).
  • Errors that do not have HTTP status codes, such as retry errors (code).

This also makes the API interface consistent with how Anthropic responses are handled (code).
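
The behavior described above can be sketched roughly as follows. This is not the code from this MR: UpstreamAPIError, Completion, and complete are hypothetical stand-ins, and the sketch assumes an upstream error exposes an optional code attribute that is None when no HTTP status applies. Only the two log field names are taken from the log entry shown in the validation steps.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

log = logging.getLogger("codesuggestions")


class UpstreamAPIError(Exception):
    """Hypothetical stand-in for a model API error.

    `code` is None for errors without an HTTP status code.
    """

    def __init__(self, message: str, code: Optional[int] = None):
        super().__init__(message)
        self.code = code


@dataclass
class Completion:
    """Hypothetical response object: empty text and 200 by default."""

    text: str = ""
    status_code: int = 200


def complete(call_model: Callable[[], str]) -> Completion:
    """Call the model; on an upstream error, log it and return an empty 200."""
    try:
        return Completion(text=call_model())
    except UpstreamAPIError as exc:
        # Record the error under dedicated keys instead of a raw backtrace.
        log.error(
            "model exception",
            extra={
                "model_exception_message": str(exc),
                "model_exception_status_code": exc.code,
            },
        )
        return Completion()
```

Catching a single base error type, rather than one handler per status code, is what lets errors such as 429 or 503 stop surfacing as 500 in this sketch.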

How to set up and validate locally

  1. Check out this merge request's branch.
  2. Build a local Docker image and confirm that the build succeeds.
    docker buildx build --platform linux/amd64 -t ai-gateway:test .
  3. Run a local service on Docker.
    docker run --platform linux/amd64 --rm \
      -p 5052:5052 \
      -e AUTH_BYPASS_EXTERNAL=true \
      -e VERTEX_API_ENDPOINT="bomberman.gdk.test" \
      -v $PWD:/app -it ai-gateway:test
  4. Send a cURL request to the /v2/completions endpoint.
    $ curl --request POST \
      --url http://codesuggestions.gdk.test:5052/v2/completions \
      --header 'Content-Type: application/json' \
      --header 'X-Gitlab-Authentication-Type: oidc' \
      --header 'authorization: Bearer jwt' \
      --data '{
      "prompt_version": 1,
      "project_path": "gitlab-org/gitlab",
      "project_id": 278964,
      "current_file": {
        "file_name": "main.py",
        "content_above_cursor": "# complete this world\n",
        "content_below_cursor": ""
      }
    }'
  5. Observe a log entry with the correct attributes (formatted for legibility).
    {
      "status_code": 200,
      ...
      "model_exception_message": "503 Vertex Model API error: dns resolution failed for bomberman.gdk.test:443: c-ares status is not ares_success qtype=a name=bomberman.gdk.test is_balancer=0: domain name not found",
      "model_exception_status_code": 503,
      ...
    }
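
The check in the last step can also be done mechanically. The following is a minimal sketch, assuming each log entry is a single JSON object with the field names shown above. The helper is_handled_model_error is hypothetical and not part of the repository.

```python
import json


def is_handled_model_error(log_line: str, expected_upstream_status: int) -> bool:
    """Return True if a JSON log line shows a handled upstream model error.

    "Handled" here means the gateway answered 200 while recording the
    upstream status code and a non-empty message under model_exception_*.
    """
    entry = json.loads(log_line)
    return (
        entry.get("status_code") == 200
        and entry.get("model_exception_status_code") == expected_upstream_status
        and bool(entry.get("model_exception_message"))
    )
```

For the request in this walkthrough, the expected upstream status is 503, because the configured Vertex API endpoint does not resolve.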

Before this MR, the above request resulted in a 500 status code and the following payload.

{
  "status_code": 500,
  ...
  "exception": {
    "message": "503 DNS resolution failed for bomberman.gdk.test:443: C-ares status is not ARES_SUCCESS qtype=A name=bomberman.gdk.test is_balancer=0: Domain name not found",
    "backtrace": "Traceback (most recent call last):\n  File \"/opt/venv/ai-gateway-9TtSrW0h-py3.9/lib/python3.9/site-packages/google/api_core/grpc_helpers_async.py\", <snip> google.api_core.exceptions.ServiceUnavailable: 503 DNS resolution failed for bomberman.gdk.test:443: C-ares status is not ARES_SUCCESS qtype=A name=bomberman.gdk.test is_balancer=0: Domain name not found\n"
  },
  ...
}

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.

Relates to #314

Edited by Tan Le
