feat: instrument all model requests (!763) · Merge requests · GitLab.org / ModelOps / AI Assisted (formerly Applied ML) / Code Suggestions / AI Gateway

Bob Van Landuyt requested to merge bvl/instrument-inferences into main Apr 26, 2024

feat: instrument all model requests

This reimplements some of the metrics that are currently in the TextGenModelInstrumentator from instrumentators/base.py, but without coupling them to code-suggestions. We're supposed to use this instrumentator for all calls to models, external or otherwise, to get metrics for them.

This adds 2 new metrics for inferences:

model_inferences_total: A counter that gets incremented for every inference.
inference_request_duration_seconds: A histogram observing the duration of an inference.

Both of these have 2 new labels:

error: yes or no. This will allow us to measure error rates, and exclude failing inferences from the apdex for services.
streaming: yes or no. This will allow us to measure the apdex for streaming inferences differently.
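As a rough illustration of how the two metrics and their labels fit together, here is a minimal sketch. It is not the code from this MR: the watch_inference helper is hypothetical, and plain in-memory dictionaries stand in for a real metrics client so the example runs on the standard library alone.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# In-memory stand-ins for the two metrics, keyed by (error, streaming).
# A real service would register a counter and a histogram with its metrics client.
model_inferences_total = defaultdict(int)
inference_request_duration_seconds = defaultdict(list)


def _label(flag: bool) -> str:
    """Render a boolean as the yes/no label value described above."""
    return "yes" if flag else "no"


@contextmanager
def watch_inference(streaming: bool = False):
    """Hypothetical helper: time one inference and record it under both labels."""
    start = time.perf_counter()
    error = False
    try:
        yield
    except Exception:
        error = True
        raise
    finally:
        labels = (_label(error), _label(streaming))
        model_inferences_total[labels] += 1
        inference_request_duration_seconds[labels].append(
            time.perf_counter() - start
        )
```

With this shape, an error rate is the count under error=yes divided by the total, and an apdex query can select only the error=no observations, split by the streaming label.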

This also implements the new error callback for streaming implementations (for now only Anthropic).
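Streaming needs a callback because a failure can surface while the response is being consumed, after the initial request has already returned. The sketch below shows one way such a callback could be wired in. It is an assumption for illustration only: stream_with_error_callback and on_error are hypothetical names, and the example is synchronous for brevity, whereas the real client code may be asynchronous.

```python
from typing import Callable, Iterable, Iterator


def stream_with_error_callback(
    chunks: Iterable[str], on_error: Callable[[], None]
) -> Iterator[str]:
    """Hypothetical wrapper: yield chunks, call on_error if the stream fails midway."""
    try:
        for chunk in chunks:
            yield chunk
    except Exception:
        # Let the instrumentation mark this inference as failed, then re-raise
        # so the caller still sees the original exception.
        on_error()
        raise


def failing_stream() -> Iterator[str]:
    """Example stream that delivers one chunk and then fails."""
    yield "partial"
    raise ValueError("connection dropped")


errors = []
received = []
try:
    for chunk in stream_with_error_callback(failing_stream(), lambda: errors.append(True)):
        received.append(chunk)
except ValueError:
    pass
```

Here the callback fires exactly once, after the first chunk has already been delivered, which is the case a request-level try/except alone would miss.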

For #441 (closed)
