A/B/C Testing for Code Suggestion
Overview
As a post-MVC follow-up, we would like to run A/B/C testing: route X% of traffic to each of the models in production and collect telemetry to compare the suggestion acceptance rate of each model's predictions. The candidate models are:
- GitLab Native
- Text-bison-001
- code-bison
- code-gecko
We want to analyze the results using a few metrics, the primary one being suggestion acceptance rate.
Current state
We've implemented telemetry metrics in !159 (merged) (server side) and gitlab-org/gitlab-vscode-extension!825 (merged) (client side) through HTTP headers, counting total requests, failed requests, and accepted suggestions. Model identification is not currently included in the telemetry data.
Design
sequenceDiagram
actor U as User
participant C as Client
participant M as Model Gateway
U->>C: ...typing
C->>M: Code Suggestion
M->>C: Code Suggestion Response
Note over C,M: engine: codegen, name: ensemble
U->>C: Accepts suggestion
U->>C: ...typing
C->>M: Code Suggestion
Note over C,M: engine: codegen, name: ensemble, accepts: 1
Note right of M: Log accept count per model
To accommodate model data, we're moving away from HTTP headers and will include telemetry data in the request and response bodies. The API server will send the model name and engine to the VSCode client. The client will accumulate accepts/requests/errors counters per model on its side and send them back to the server for Prometheus metrics and logging. For this we'll add a model field to the API responses with model identifiers. See the example below:
{
// ... other response fields
"model": {
"engine": "gitlab-native",
"name": "codegen-v2-1.0.0"
}
}
On the VSCode client we'll add an optional telemetry request field containing an array of telemetry counters tagged with model identifiers. See the example below:
{
// ... other request fields
"telemetry": [{
"model_engine": "gitlab-native",
"model_name": "codegen-v2-1.0.0",
"requests": 1,
"accepts": 1,
"errors": 0
}, {
"model_engine": "code-bison",
"model_name": "v2-1.0.0",
"requests": 1,
"accepts": 1,
"errors": 0
}]
}
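The counter flow described above can be sketched as follows. This is a minimal illustration of the logic only, not the actual implementation: the real client is the VSCode extension (TypeScript), and the names `TelemetryAccumulator`, `record`, `flush`, and `acceptance_rates` are hypothetical. It assumes the client keys its counters by the engine and name received in the model field of each response, and that the server derives the acceptance rate as accepts divided by requests.

```python
from collections import defaultdict


class TelemetryAccumulator:
    """Hypothetical client-side tally of counters, keyed by (engine, name)."""

    def __init__(self):
        self._counters = defaultdict(
            lambda: {"requests": 0, "accepts": 0, "errors": 0}
        )

    def record(self, model, event):
        # `model` is the "model" object taken from an API response;
        # `event` must be one of "requests", "accepts", "errors".
        key = (model["engine"], model["name"])
        self._counters[key][event] += 1

    def flush(self):
        """Build the `telemetry` array for the next request, then reset."""
        payload = [
            {"model_engine": engine, "model_name": name, **counts}
            for (engine, name), counts in self._counters.items()
        ]
        self._counters.clear()
        return payload


def acceptance_rates(telemetry):
    """Server-side view: accepts / requests per model (assumed definition)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [accepts, requests]
    for entry in telemetry:
        key = (entry["model_engine"], entry["model_name"])
        totals[key][0] += entry["accepts"]
        totals[key][1] += entry["requests"]
    # Models with no requests are skipped to avoid dividing by zero.
    return {key: a / r for key, (a, r) in totals.items() if r > 0}
```

Resetting the counters on every flush means each request carries only the events recorded since the previous one, so the server can add the values to its own totals without double counting.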
Implementation
- Send model data from codesuggestions API to client !183 (merged)
- Consume client model data for telemetry !186 (merged) (backwards compatible with the headers format)
- Have the VSCode extension send back model data on its telemetry gitlab-org/gitlab-vscode-extension!851 (merged) (should be deployed last)
- Implement traffic routing for the split at the project level, so that one project consistently receives completions from the same model.
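One possible way to get a consistent per-project split is to hash the project identifier into a bucket and map buckets to models by weight. The sketch below is an assumption, not the chosen design: the function `model_for_project`, the use of SHA-256, and the equal 25% weights are all illustrative, since the actual percentages ("X%") and routing mechanism are not specified here.

```python
import hashlib

# Hypothetical split. The real percentages are not defined in this document.
MODEL_WEIGHTS = [
    ("gitlab-native", 25),
    ("text-bison-001", 25),
    ("code-bison", 25),
    ("code-gecko", 25),
]


def model_for_project(project_id, weights=MODEL_WEIGHTS):
    """Deterministically map a project to a model according to its weight."""
    total = sum(weight for _, weight in weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # A stable hash (unlike Python's built-in hash()) gives the same bucket
    # across processes and restarts.
    digest = hashlib.sha256(str(project_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for name, weight in weights:
        cumulative += weight
        if bucket < cumulative:
            return name
    raise ValueError("weights must be non-negative")
```

Because the assignment depends only on the project identifier and the weight table, every request from a given project reaches the same model as long as the weights stay unchanged. Changing the weights would reassign some projects.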