
Log details of model requests

Shinya Maeda requested to merge instrument-model-metrics into main

What does this merge request do and why?

We're currently investigating a production issue where Duo Chat is not working with the AI Gateway. We've confirmed that GitLab-Rails can send requests to the AI Gateway; however, the AI Gateway returns the error prompt must end with "\n\nAssistant:" turn, which indicates that the prompt is malformed.

It's odd that the same request on the staging environment doesn't encounter the same error. For further investigation, we need to know the actual prompt that is passed to the 3rd-party model in the AI Gateway.

Note that the user prompt is already logged in GitLab-Rails, which you can find in Kibana (example); however, we should also log it in the AI Gateway as the SSOT. See https://gitlab.com/gitlab-org/gitlab/-/issues/432688+ and https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/357+ for more information.

This is part of the effort for Improve observability of Duo Chat with v1/agent... (#371 - closed).
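
For context, entries like the samples further down could be produced with structlog-style structured logging. Here is a minimal sketch; the helper name log_model_request and its call site are hypothetical and only illustrate how the fields in the sample logs map to keyword arguments (correlation_id is assumed to be bound by request middleware):

import structlog

log = structlog.get_logger("model_requests")

def log_model_request(prompt, model_engine, model_name, **model_options):
    # correlation_id is assumed to already be bound to the logger context by
    # the request middleware, so it is not passed explicitly here.
    log.info(
        "requesting to a model",
        prompt=prompt,
        model_engine=model_engine,
        model_name=model_name,
        model_options=model_options,
    )

# Example call mirroring the v1/chat/agent sample entry below:
log_model_request(
    "\n\nHuman: Hi, How are you?\n\nAssistant:",
    model_engine="anthropic",
    model_name="claude-2.1",
    max_tokens_to_sample=2048,
    stop_sequences=["\n\nHuman", "Observation:"],
    temperature=0.2,
)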

How to set up and validate locally

Visit the OpenAPI playground at http://0.0.0.0:5052/docs and send a request to each endpoint. Here are sample logs:

v1/chat/agent

{
    "prompt": "\n\nHuman: Hi, How are you?\n\nAssistant:",
    "correlation_id": "766d7a32db0949ea834d95e1b7e59f02",
    "model_options": {
        "timeout": "Timeout(connect=5.0, read=30.0, write=30.0, pool=30.0)",
        "max_tokens_to_sample": 2048,
        "stop_sequences": [
            "\n\nHuman",
            "Observation:"
        ],
        "temperature": 0.2,
        "top_k": "NOT_GIVEN",
        "top_p": "NOT_GIVEN"
    },
    "model_engine": "anthropic",
    "model_name": "claude-2.1",
    "logger": "model_requests",
    "level": "info",
    "type": "mlops",
    "stage": "main",
    "timestamp": "2023-12-19T09:52:12.642660Z",
    "message": "requesting to a model"
}

v2/code/completions

{
    "prompt": {
        "prefix": "This code has a filename of string\nprint",
        "suffix": ""
    },
    "correlation_id": "956f44cdeba94df8962f4795e6a31de5",
    "model_options": {
        "temperature": 0.2,
        "maxOutputTokens": 64,
        "topP": 0.95,
        "topK": 40,
        "stopSequences": [
            "\n\n"
        ]
    },
    "model_engine": "vertex-ai",
    "model_name": "code-gecko@002",
    "logger": "model_requests",
    "level": "info",
    "type": "mlops",
    "stage": "main",
    "timestamp": "2023-12-19T09:51:06.892946Z",
    "message": "requesting to a model"
}
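
With these fields logged, checking the production hypothesis locally becomes straightforward. Below is a minimal sketch of a sanity check, assuming the entries are written as JSON lines to a file (the file name is a placeholder):

import json

# Scan structured log output for malformed chat prompts.
with open("ai_gateway.log") as log_file:
    for line in log_file:
        entry = json.loads(line)
        if entry.get("logger") != "model_requests":
            continue
        prompt = entry["prompt"]
        # Chat (anthropic) prompts are plain strings; code completion prompts
        # are prefix/suffix objects, so skip those here.
        if isinstance(prompt, str) and not prompt.endswith("\n\nAssistant:"):
            print(f"Malformed prompt for correlation_id {entry['correlation_id']}")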

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
