A/B/C Testing for Code Suggestion

Overview

As a follow-up after the MVC, we would like to run A/B/C testing: route X% of traffic to each of the models in production and collect telemetry to compare the suggestion acceptance rate of each model's predictions.

GitLab Native
Text-bison-001
code-bison
code-gecko

We want to analyze the results based on a few metrics, the primary one being suggestion acceptance rate.
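As a rough sketch of how the primary metric could be computed from per-model counters: the `ModelCounters` structure and function names below are hypothetical, and excluding failed requests from the denominator is an assumption, not something this issue specifies.

```python
from dataclasses import dataclass


@dataclass
class ModelCounters:
    """Counters collected for one model (hypothetical structure)."""
    requests: int = 0
    accepts: int = 0
    errors: int = 0


def acceptance_rate(counters: ModelCounters) -> float:
    """Accepted suggestions divided by requests that did not fail.

    Assumption: failed requests are left out of the denominator because
    they never produced a suggestion the user could accept.
    """
    served = counters.requests - counters.errors
    if served <= 0:
        return 0.0
    return counters.accepts / served


def compare_models(per_model: dict[str, ModelCounters]) -> list[tuple[str, float]]:
    """Return (model, acceptance rate) pairs, highest rate first."""
    rates = [(name, acceptance_rate(c)) for name, c in per_model.items()]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)
```

If the analysis should instead divide by total requests, only the `served` line would need to change.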

Current state

We've implemented telemetry metrics in !159 (merged) (server side) and gitlab-org/gitlab-vscode-extension!825 (merged) (client side) through HTTP headers, counting total requests, failed requests, and accepted suggestions. Model identification is not currently included in the telemetry data.

Design

sequenceDiagram
    actor U as User
    participant C as Client
    participant M as Model Gateway

    U->>C: ...typing
    C->>M: Code Suggestion
    M->>C: Code Suggestion Response
    Note over C,M: engine: codegen, name: ensemble
    U->>C: Accepts suggestion
    U->>C: ...typing
    C->>M: Code Suggestion
    Note over C,M: engine: codegen, name: ensemble, accepts: 1
    Note right of M: Log accept count per model

To accommodate model data, we're moving away from HTTP headers and will include telemetry data in the request and response bodies instead. The API server will send the model name and engine to the VSCode client; the client will accumulate accepts/requests/errors counters keyed by this data and then send them back to the server for Prometheus metrics and logging. For this, we'll add a model field with model identifiers to the API responses. See the example below:

{
  // ... other response fields
  "model": {
      "engine": "gitlab-native",
      "name": "codegen-v2-1.0.0"
  }
}

On the VSCode client we'll add an optional telemetry request field that corresponds to an array of telemetry counters tagged with model identifiers. See example below:

{
  // ... other request fields
  "telemetry": [{
      "model_engine": "gitlab-native",
      "model_name": "codegen-v2-1.0.0",
      "requests": 1,
      "accepts": 1,
      "errors": 0
    }, {
      "model_engine": "code-bison",
      "model_name": "v2-1.0.0",
      "requests": 1,
      "accepts": 1,
      "errors": 0
    }]
}
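The accumulate-then-send behaviour described above could look roughly like the following. This is a minimal sketch of the logic only, written in Python for illustration; the `TelemetryAccumulator` class and its methods are hypothetical and do not reflect the actual extension code.

```python
from collections import defaultdict

COUNTER_NAMES = ("requests", "accepts", "errors")


class TelemetryAccumulator:
    """Accumulates per-model counters between requests (hypothetical helper)."""

    def __init__(self) -> None:
        # Keyed by (engine, name), as reported in the response's "model" field.
        self._counters = defaultdict(lambda: dict.fromkeys(COUNTER_NAMES, 0))

    def record(self, model: dict, counter: str) -> None:
        """Increment 'requests', 'accepts' or 'errors' for the given model."""
        if counter not in COUNTER_NAMES:
            raise ValueError(f"unknown counter: {counter}")
        self._counters[(model["engine"], model["name"])][counter] += 1

    def flush(self) -> list[dict]:
        """Build the telemetry array for the next request and reset the counters."""
        payload = [
            {"model_engine": engine, "model_name": name, **counts}
            for (engine, name), counts in self._counters.items()
        ]
        self._counters.clear()
        return payload
```

Resetting the counters on `flush()` means each request reports only what happened since the previous one, so the server can add the values to its own totals without double counting.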

Implementation

Send model data from codesuggestions API to client !183 (merged)
Consume client model data for telemetry !186 (merged) (backwards compatible with the headers format)
Have the VSCode extension send back model data on its telemetry gitlab-org/gitlab-vscode-extension!851 (merged) (should be deployed last)
Implement traffic splitting by project, so that a given project consistently receives completions from the same model.
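One possible way to get consistent per-project routing is to hash the project ID into a weighted bucket. The sketch below assumes a numeric project ID and an even 25% split; the `TRAFFIC_SPLIT` table, the model labels in it, and the `pick_model` function are illustrative only, since the actual percentages and routing mechanism are not decided here.

```python
import hashlib

# Hypothetical traffic split in percent; the real values of X are not specified.
TRAFFIC_SPLIT = [
    ("gitlab-native", 25),
    ("text-bison-001", 25),
    ("code-bison", 25),
    ("code-gecko", 25),
]


def pick_model(project_id: int, split=TRAFFIC_SPLIT) -> str:
    """Map a project to a model deterministically, respecting the weights."""
    total = sum(weight for _, weight in split)
    if total <= 0:
        raise ValueError("traffic split must contain a positive weight")

    # A stable hash (unlike Python's built-in hash()) gives the same bucket
    # across processes and restarts, so a project always hits the same model.
    digest = hashlib.sha256(str(project_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total

    upper = 0
    for model, weight in split:
        upper += weight
        if bucket < upper:
            return model
    raise AssertionError("unreachable: bucket is always below the total weight")
```

Note that changing the weights reassigns some projects to a different model, so the split would ideally stay fixed for the duration of an experiment.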

Edited Jul 18, 2023 by Alejandro Rodríguez