Track code suggestion events with Snowplow

Andras Herczeg requested to merge 192-track-snowplow-events into main

What does this MR do and why?

This MR creates a Snowplow event for each incoming code suggestion request and forwards it to the Snowplow server.
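
To make the shape of the event concrete, here is a minimal sketch of the tracker-protocol parameters for one suggestion request, built with the standard library only. The field values (`se`, `code_suggestions`, `suggestions_requested`, `gitlab_ai_gateway`, `gl`) and both schema URIs are taken from the sample event in the validation steps. The function name `build_structured_event` is hypothetical, and the code in this MR presumably goes through the Snowplow Python tracker rather than building the payload by hand.

```python
import base64
import json

# Schema URIs as they appear in the decoded sample payload.
CONTEXTS_SCHEMA = "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1"
CODE_SUGGESTIONS_SCHEMA = "iglu:com.gitlab/code_suggestions_context/jsonschema/1-0-0"


def build_structured_event(context_data: dict) -> dict:
    """Hypothetical helper: build the parameters of one structured event.

    Illustration only; it mirrors the fields visible in Snowplow Micro
    and is not the gateway's actual implementation.
    """
    # Wrap the custom context in the standard contexts envelope.
    envelope = {
        "schema": CONTEXTS_SCHEMA,
        "data": [{"schema": CODE_SUGGESTIONS_SCHEMA, "data": context_data}],
    }
    # Assumption: cx is URL-safe Base64, per the Snowplow tracker protocol.
    cx = base64.urlsafe_b64encode(json.dumps(envelope).encode("utf-8")).decode("ascii")
    return {
        "e": "se",                         # structured event
        "se_ca": "code_suggestions",       # category
        "se_ac": "suggestions_requested",  # action
        "aid": "gitlab_ai_gateway",        # application ID
        "tna": "gl",                       # tracker namespace
        "cx": cx,                          # encoded context payload
    }


params = build_structured_event(
    {"prefix_length": 15, "suffix_length": 56, "language": "python"}
)
print(params["se_ac"], params["cx"][:16])
```

The tracker adds the remaining fields seen in Snowplow Micro (`eid`, `dtm`, `stm`, `tv`, `p`) when it sends the event, so they are left out here.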

How to validate and test locally

We need to set up Snowplow Micro locally to inspect the events emitted by Model Gateway.

  1. Clone the Snowplow Micro repo
  2. Run ./snowplow-micro.sh
  3. In another terminal session, build and run a Model Gateway container
    $ docker buildx build --platform linux/amd64 -t code-suggestions-api:dev .
    $ docker run --platform linux/amd64 --rm -p 5999:5052 \
        -e AUTH_BYPASS_EXTERNAL=true -e SNOWPLOW_ENABLED=true -e SNOWPLOW_ENDPOINT=http://host.docker.internal:9090 \
        -v $PWD:/app -it code-suggestions-api:dev
  4. Make a cURL request to the suggestion endpoint
    curl --request POST \
      --url http://codesuggestions.gdk.test:5999/v2/completions \
      --header 'Content-Type: application/json' \
      --header 'X-Gitlab-Global-User-Id: 2' \
      --header 'X-Gitlab-Realm: saas' \
      --header 'authorization: Bearer <PAT>' \
      --header 'runway: true' \
      --data '{
      "prompt_version": 1,
      "project_path": "gitlab-org/gitlab",
      "project_id": 278964,
      "current_file": {
        "file_name": "main.py",
        "content_above_cursor": "def sum(a, b):",
        "content_below_cursor": "\n    return a + b\n\ndef subtract(a, b):\n    return a - b\n"
      },
      "telemetry": [
        {
          "model_engine": "vertex-ai",
          "model_name": "code-gecko",
          "lang": "python",
          "requests": 1,
          "accepts": 1,
          "errors": 0
        }
      ]
    }'
  5. Verify the events from Snowplow Micro via http://localhost:9090/micro/good
    {
      "api": {
        "vendor": "com.snowplowanalytics.snowplow",
        "version": "tp2"
      },
      "parameters": {
        "e": "se",
        "eid": "986e8615-b45c-4fb1-904b-0bb9df094f47",
        "aid": "gitlab_ai_gateway",
        "cx": "eyJzY2hlbWEiOiAiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cvY29udGV4dHMvanNvbnNjaGVtYS8xLTAtMSIsICJkYXRhIjogW3sic2NoZW1hIjogImlnbHU6Y29tLmdpdGxhYi9jb2RlX3N1Z2dlc3Rpb25zX2NvbnRleHQvanNvbnNjaGVtYS8xLTAtMCIsICJkYXRhIjogeyJyZXF1ZXN0X2NvdW50cyI6IFt7InJlcXVlc3RzIjogMSwgImVycm9ycyI6IDAsICJhY2NlcHRzIjogMSwgImxhbmciOiAicHl0aG9uIiwgIm1vZGVsX2VuZ2luZSI6ICJ2ZXJ0ZXgtYWkiLCAibW9kZWxfbmFtZSI6ICJjb2RlLWdlY2tvIn1dLCAicHJlZml4X2xlbmd0aCI6IDE1LCAic3VmZml4X2xlbmd0aCI6IDU2LCAibGFuZ3VhZ2UiOiAicHl0aG9uIiwgInVzZXJfYWdlbnQiOiAiaW5zb21uaWEvMjAyMy40LjAiLCAiZ2l0bGFiX3JlYWxtIjogIiJ9fV19",
        "tna": "gl",
        "stm": "1692282235000",
        "tv": "py-1.0.1",
        "se_ac": "suggestions_requested",
        "se_ca": "code_suggestions",
        "p": "pc",
        "dtm": "1692282235536"
      },
      "contentType": "application/json",
      "source": {
        "name": "snowplow-micro-1.7.2-stdout$",
        "encoding": "UTF-8",
        "hostname": "host.docker.internal"
      },
      "context": {
        "timestamp": "2023-08-17T14:23:55.562Z",
        "ipAddress": "172.17.0.1",
        "useragent": "python-requests/2.31.0",
        "refererUri": null,
        "headers": [
          "Timeout-Access: <function1>",
          "Host: host.docker.internal:9090",
          "User-Agent: python-requests/2.31.0",
          "Accept-Encoding: gzip, deflate",
          "Accept: */*",
          "Connection: keep-alive",
          "application/json"
        ],
        "userId": "0cdeb3ff-6f94-4b7c-b549-f54589df7343"
      }
    }
  6. Base64-decode the context payload cx
    echo -n "eyJzY2hlbWEiOiAiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cvY29udGV4dHMvanNvbnNjaGVtYS8xLTAtMSIsICJkYXRhIjogW3sic2NoZW1hIjogImlnbHU6Y29tLmdpdGxhYi9jb2RlX3N1Z2dlc3Rpb25zX2NvbnRleHQvanNvbnNjaGVtYS8xLTAtMCIsICJkYXRhIjogeyJyZXF1ZXN0X2NvdW50cyI6IFt7InJlcXVlc3RzIjogMSwgImVycm9ycyI6IDAsICJhY2NlcHRzIjogMSwgImxhbmciOiAicHl0aG9uIiwgIm1vZGVsX2VuZ2luZSI6ICJ2ZXJ0ZXgtYWkiLCAibW9kZWxfbmFtZSI6ICJjb2RlLWdlY2tvIn1dLCAicHJlZml4X2xlbmd0aCI6IDE1LCAic3VmZml4X2xlbmd0aCI6IDU2LCAibGFuZ3VhZ2UiOiAicHl0aG9uIiwgInVzZXJfYWdlbnQiOiAiaW5zb21uaWEvMjAyMy40LjAiLCAiZ2l0bGFiX3JlYWxtIjogIiJ9fV19" | base64 -d
    {"schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1", "data": [{"schema": "iglu:com.gitlab/code_suggestions_context/jsonschema/1-0-0", "data": {"request_counts": [{"requests": 1, "errors": 0, "accepts": 1, "lang": "python", "model_engine": "vertex-ai", "model_name": "code-gecko"}], "prefix_length": 15, "suffix_length": 56, "language": "python", "user_agent": "insomnia/2023.4.0", "gitlab_realm": ""}}]}
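
Steps 5 and 6 can also be scripted. The sketch below assumes a raw event shaped like the JSON in step 5 (a dict with a `parameters.cx` field) and assumes `cx` is URL-safe Base64, as the Snowplow tracker protocol describes. The helper names `decode_cx` and `code_suggestions_contexts` are hypothetical and not part of this MR.

```python
import base64
import json


def decode_cx(cx: str) -> dict:
    """Decode a Snowplow cx parameter into the contexts envelope."""
    # Restore padding in case the tracker stripped trailing '=' characters.
    padded = cx + "=" * (-len(cx) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def code_suggestions_contexts(raw_event: dict) -> list:
    """Return the data of each code_suggestions_context entity in a raw event."""
    envelope = decode_cx(raw_event["parameters"]["cx"])
    return [
        entity["data"]
        for entity in envelope["data"]
        if entity["schema"].startswith("iglu:com.gitlab/code_suggestions_context/")
    ]


# Small stand-in event for illustration; in practice, paste the event
# returned by Snowplow Micro in step 5.
sample_envelope = {
    "schema": "iglu:com.snowplowanalytics.snowplow/contexts/jsonschema/1-0-1",
    "data": [
        {
            "schema": "iglu:com.gitlab/code_suggestions_context/jsonschema/1-0-0",
            "data": {"prefix_length": 15, "suffix_length": 56, "language": "python"},
        }
    ],
}
sample_event = {
    "parameters": {
        "cx": base64.urlsafe_b64encode(json.dumps(sample_envelope).encode()).decode()
    }
}

for context in code_suggestions_contexts(sample_event):
    print(context["language"], context["prefix_length"], context["suffix_length"])
```

Plain `base64 -d` works for the sample above because its payload happens to contain no `-` or `_` characters. The URL-safe decoder also covers payloads that do.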

Relates to #192 (closed)

Edited by Tan Le
