
fix(dependency): unnecessary clients are initialized at application boot

What does this merge request do and why?

This MR fixes a long-standing issue in AI Gateway. It currently initializes the third-party model clients at application boot, so we have to set up VertexAI and Anthropic credentials before launching the server; otherwise it doesn't boot. It's quite annoying that developers need to get a VertexAI credential even when their feature only uses Anthropic.

The main issue here is the use of the Resource provider. This provider is actually intended for managing resources that need to be terminated explicitly (like a daemon). Its documentation describes it as follows:

Resource providers help to initialize and configure logging, event loop, thread or process pool, etc.

Since our clients are initialized in the same process as the FastAPI server, their threads and sub-processes are terminated along with the main server when it shuts down. Hence, we don't need to terminate them explicitly.

Instead, we use the Singleton provider. Singleton instances are initialized only once and reused for all subsequent calls. A client isn't initialized until a request explicitly calls for it, which is similar to how LiteLLM initializes its clients.
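To illustrate the behavior described above, here is a minimal plain-Python sketch of "initialize on first call, then reuse". It does not use dependency_injector and is not the code in this MR; `LazySingleton`, `MissingCredentialsError`, and the two factory functions are hypothetical names made up for the example.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazySingleton(Generic[T]):
    """Hypothetical stand-in for a singleton provider: build once, on first call."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._instance: Optional[T] = None
        self._created = False

    def __call__(self) -> T:
        # The factory runs only on the first call; later calls reuse the instance.
        if not self._created:
            self._instance = self._factory()
            self._created = True
        return self._instance  # type: ignore[return-value]


class MissingCredentialsError(RuntimeError):
    """Hypothetical error standing in for a missing-credentials failure."""


def make_vertex_client() -> object:
    # Simulates a client constructor that needs credentials that are absent.
    raise MissingCredentialsError("default credentials were not found")


def make_anthropic_client() -> dict:
    # Simulates a client that can be constructed without extra setup.
    return {"provider": "anthropic"}


# "Boot": declaring the providers constructs nothing, so this step cannot fail.
vertex_client = LazySingleton(make_vertex_client)
anthropic_client = LazySingleton(make_anthropic_client)
```

In this sketch, calling `anthropic_client()` works and always returns the same object, while `vertex_client()` raises only at the point where a caller actually asks for it. An eagerly initialized resource would instead raise during boot, which is the problem this MR addresses.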

Related to:

Contributes to Streamline installation of AI Gateway to air-ga... (gitlab-org/gitlab#471542 - closed)

How to set up and validate locally

  1. Log out of gcloud:

    gcloud auth application-default revoke
  2. Start AI Gateway. localhost:5052/docs should be accessible.

Request to Anthropic:

curl -X 'POST' \
  'http://0.0.0.0:5052/v1/chat/agent' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "prompt_components": [
    {
      "type": "string",
      "metadata": {
        "source": "string",
        "version": "string"
      },
      "payload": {
        "content": "\n\nHuman: Hi, How are you?\n\nAssistant:",
        "provider": "anthropic",
        "model": "claude-2.1",
        "params": {
          "stop_sequences": [
            "\n\nHuman",
            "Observation:"
          ],
          "temperature": 0.2,
          "max_tokens_to_sample": 2048
        },
        "model_endpoint": "string",
        "model_api_key": "string"
      }
    }
  ],
  "stream": false
}'

Response:

{"response":" I'm doing well, thanks for asking!","metadata":{"provider":"anthropic","model":"claude-2.1","timestamp":1720760783}}

Request to VertexAI (expected to fail, because the gcloud credentials were revoked in step 1):

curl -X 'POST' \
  'http://0.0.0.0:5052/v2/code/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "project_path": "string",
  "project_id": 0,
  "current_file": {
    "file_name": "string",
    "language_identifier": "string",
    "content_above_cursor": "print",
    "content_below_cursor": ""
  },
  "model_provider": "vertex-ai",
  "model_endpoint": "string",
  "model_api_key": "string",
  "model_name": "code-gecko@002",
  "telemetry": [],
  "stream": false,
  "choices_count": 0,
  "context": [],
  "prompt_version": 1
}'

Response:

500: Internal Server Error

Backtrace:

{
  "status_code": null,
  "exception_class": "DefaultCredentialsError",
  "backtrace": "Traceback (most recent call last):\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/anyio/streams/memory.py\", line 97, in receive\n    return self.receive_nowait()\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/anyio/streams/memory.py\", line 92, in receive_nowait\n    raise WouldBlock\nanyio.WouldBlock\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/base.py\", line 159, in call_next\n    message = await recv_stream.receive()\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/anyio/streams/memory.py\", line 112, in receive\n    raise EndOfStream\nanyio.EndOfStream\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/ai_gateway/api/middleware.py\", line 97, in dispatch\n    response = await call_next(request)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/base.py\", line 165, in call_next\n    raise app_exc\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/base.py\", line 151, in coro\n    await self.app(scope, receive_or_disconnect, send_no_error)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/authentication.py\", line 49, in __call__\n    await self.app(scope, receive, send)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/base.py\", line 189, in __call__\n    with collapse_excgroups():\n  File 
\"/home/shinya/.asdf/installs/python/3.10.14/lib/python3.10/contextlib.py\", line 153, in __exit__\n    self.gen.throw(typ, value, traceback)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/_utils.py\", line 93, in collapse_excgroups\n    raise exc\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/base.py\", line 191, in __call__\n    response = await self.dispatch_func(request, call_next)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/ai_gateway/api/middleware.py\", line 304, in dispatch\n    return await call_next(request)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/base.py\", line 165, in call_next\n    raise app_exc\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/base.py\", line 151, in coro\n    await self.app(scope, receive_or_disconnect, send_no_error)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/middleware/exceptions.py\", line 65, in __call__\n    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py\", line 64, in wrapped_app\n    raise exc\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py\", line 53, in wrapped_app\n    await app(scope, receive, sender)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/routing.py\", line 756, in __call__\n    await self.middleware_stack(scope, receive, send)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/routing.py\", line 
776, in app\n    await route.handle(scope, receive, send)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/routing.py\", line 297, in handle\n    await self.app(scope, receive, send)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/routing.py\", line 77, in app\n    await wrap_app_handling_exceptions(app, request)(scope, receive, send)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py\", line 64, in wrapped_app\n    raise exc\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/_exception_handler.py\", line 53, in wrapped_app\n    await app(scope, receive, sender)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/starlette/routing.py\", line 72, in app\n    response = await func(request)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/fastapi/routing.py\", line 278, in app\n    raw_response = await run_endpoint_function(\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/fastapi/routing.py\", line 191, in run_endpoint_function\n    return await dependant.call(**values)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/ai_gateway/api/feature_category.py\", line 35, in wrapper\n    return await func(*args, **kwargs)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/ai_gateway/api/v2/code/completions.py\", line 130, in completions\n    code_completions = completions_legacy_factory()\n  File \"src/dependency_injector/providers.pyx\", line 225, in dependency_injector.providers.Provider.__call__\n  File \"src/dependency_injector/providers.pyx\", line 2689, in dependency_injector.providers.Factory._provide\n  File 
\"src/dependency_injector/providers.pxd\", line 650, in dependency_injector.providers.__factory_call\n  File \"src/dependency_injector/providers.pxd\", line 577, in dependency_injector.providers.__call\n  File \"src/dependency_injector/providers.pxd\", line 445, in dependency_injector.providers.__provide_keyword_args\n  File \"src/dependency_injector/providers.pxd\", line 365, in dependency_injector.providers.__get_value\n  File \"src/dependency_injector/providers.pyx\", line 225, in dependency_injector.providers.Provider.__call__\n  File \"src/dependency_injector/providers.pyx\", line 2689, in dependency_injector.providers.Factory._provide\n  File \"src/dependency_injector/providers.pxd\", line 650, in dependency_injector.providers.__factory_call\n  File \"src/dependency_injector/providers.pxd\", line 577, in dependency_injector.providers.__call\n  File \"src/dependency_injector/providers.pxd\", line 445, in dependency_injector.providers.__provide_keyword_args\n  File \"src/dependency_injector/providers.pxd\", line 365, in dependency_injector.providers.__get_value\n  File \"src/dependency_injector/providers.pyx\", line 225, in dependency_injector.providers.Provider.__call__\n  File \"src/dependency_injector/providers.pyx\", line 2689, in dependency_injector.providers.Factory._provide\n  File \"src/dependency_injector/providers.pxd\", line 650, in dependency_injector.providers.__factory_call\n  File \"src/dependency_injector/providers.pxd\", line 608, in dependency_injector.providers.__call\n  File \"src/dependency_injector/providers.pyx\", line 811, in dependency_injector.providers.Dependency.__call__\n  File \"src/dependency_injector/providers.pyx\", line 811, in dependency_injector.providers.Dependency.__call__\n  File \"src/dependency_injector/providers.pyx\", line 225, in dependency_injector.providers.Provider.__call__\n  File \"src/dependency_injector/providers.pyx\", line 4246, in dependency_injector.providers.Selector._provide\n  File 
\"src/dependency_injector/providers.pyx\", line 225, in dependency_injector.providers.Provider.__call__\n  File \"src/dependency_injector/providers.pyx\", line 2689, in dependency_injector.providers.Factory._provide\n  File \"src/dependency_injector/providers.pxd\", line 650, in dependency_injector.providers.__factory_call\n  File \"src/dependency_injector/providers.pxd\", line 577, in dependency_injector.providers.__call\n  File \"src/dependency_injector/providers.pxd\", line 463, in dependency_injector.providers.__provide_keyword_args\n  File \"src/dependency_injector/providers.pxd\", line 365, in dependency_injector.providers.__get_value\n  File \"src/dependency_injector/providers.pyx\", line 225, in dependency_injector.providers.Provider.__call__\n  File \"src/dependency_injector/providers.pyx\", line 3049, in dependency_injector.providers.Singleton._provide\n  File \"src/dependency_injector/providers.pxd\", line 650, in dependency_injector.providers.__factory_call\n  File \"src/dependency_injector/providers.pxd\", line 608, in dependency_injector.providers.__call\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/ai_gateway/models/container.py\", line 31, in _init_vertex_grpc_client\n    return grpc_connect_vertex({\"api_endpoint\": endpoint})\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/ai_gateway/models/base.py\", line 107, in grpc_connect_vertex\n    return PredictionServiceAsyncClient(client_options=client_options)\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/google/cloud/aiplatform_v1/services/prediction_service/async_client.py\", line 274, in __init__\n    self._client = PredictionServiceClient(\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/google/cloud/aiplatform_v1/services/prediction_service/client.py\", line 708, in __init__\n    self._transport = transport_init(\n  File 
\"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/google/cloud/aiplatform_v1/services/prediction_service/transports/grpc_asyncio.py\", line 203, in __init__\n    super().__init__(\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/google/cloud/aiplatform_v1/services/prediction_service/transports/base.py\", line 106, in __init__\n    credentials, _ = google.auth.default(\n  File \"/home/shinya/gitlab-development-kit/gitlab-ai-gateway/.venv/lib/python3.10/site-packages/google/auth/_default.py\", line 691, in default\n    raise exceptions.DefaultCredentialsError(_CLOUD_SDK_MISSING_CREDENTIALS)\ngoogle.auth.exceptions.DefaultCredentialsError: Your default credentials were not found. To set up Application Default Credentials, see https://cloud.google.com/docs/authentication/external/set-up-adc for more information.\n",
  "correlation_id": "897899436ac64ddbb128d37514ae4314",
  "extra": {},
  "logger": "exceptions",
  "level": "error",
  "type": "mlops",
  "stage": "main",
  "timestamp": "2024-07-12T05:07:46.960526Z",
  "message": "Your default credentials were not found. To set up Application Default Credentials, see https://cloud.google.com/docs/authentication/external/set-up-adc for more information."
}
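The two manual requests above could also be scripted. The following is a rough sketch using only the Python standard library; it assumes the gateway listens on localhost:5052 as in step 2, and `build_request`, `status_of`, and `run_checks` are hypothetical helper names, not part of AI Gateway. The payloads are copied from the curl commands.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:5052"  # assumed local AI Gateway address

ANTHROPIC_PAYLOAD = {
    "prompt_components": [{
        "type": "string",
        "metadata": {"source": "string", "version": "string"},
        "payload": {
            "content": "\n\nHuman: Hi, How are you?\n\nAssistant:",
            "provider": "anthropic",
            "model": "claude-2.1",
            "params": {
                "stop_sequences": ["\n\nHuman", "Observation:"],
                "temperature": 0.2,
                "max_tokens_to_sample": 2048,
            },
            "model_endpoint": "string",
            "model_api_key": "string",
        },
    }],
    "stream": False,
}

VERTEX_PAYLOAD = {
    "project_path": "string",
    "project_id": 0,
    "current_file": {
        "file_name": "string",
        "language_identifier": "string",
        "content_above_cursor": "print",
        "content_below_cursor": "",
    },
    "model_provider": "vertex-ai",
    "model_endpoint": "string",
    "model_api_key": "string",
    "model_name": "code-gecko@002",
    "telemetry": [],
    "stream": False,
    "choices_count": 0,
    "context": [],
    "prompt_version": 1,
}


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request equivalent to the curl commands."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def status_of(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status, including error statuses."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def run_checks() -> None:
    # Requires a running gateway; prints the status code of each request.
    print("anthropic:", status_of(build_request("/v1/chat/agent", ANTHROPIC_PAYLOAD)))
    print("vertex-ai:", status_of(build_request("/v2/code/completions", VERTEX_PAYLOAD)))
```

With the gcloud credentials revoked and the server running, calling `run_checks()` should show a successful status for the Anthropic request and 500 for the VertexAI request, matching the responses above.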

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Sean Carroll
