feat: instrument all model requests

Bob Van Landuyt requested to merge bvl/instrument-inferences into main

This reimplements some of the metrics that currently live in the TextGenModelInstrumentator from instrumentators/base.py, but without coupling them to code-suggestions. The new instrumentator is meant to be used for all calls to models, external or otherwise, so that we get metrics for each of them.

This adds 2 new metrics for inferences:

  • model_inferences_total: A counter that gets incremented for every inference.
  • inference_request_duration_seconds: A histogram observing the duration of an inference.

Both of these have 2 new labels:

  • error: yes or no. This will allow us to measure error rates, and exclude failing inferences from the apdex for services.
  • streaming: yes or no. This will allow us to measure the apdex for streaming inferences differently.
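
As a rough illustration of how the two metrics and their labels fit together, here is a minimal sketch. It uses plain in-memory stand-ins rather than a real metrics client, and `watch_inference` is a hypothetical name, not the API added in this merge request.

```python
import time
from collections import Counter, defaultdict
from contextlib import contextmanager

# In-memory stand-ins for the two metrics, keyed by (error, streaming) labels.
# Assumption: the real implementation would use a metrics client instead.
model_inferences_total = Counter()
inference_request_duration_seconds = defaultdict(list)


@contextmanager
def watch_inference(streaming=False):
    """Record one inference: increment the counter and observe its duration."""
    start = time.perf_counter()
    error = "no"
    try:
        yield
    except Exception:
        error = "yes"
        raise
    finally:
        labels = (error, "yes" if streaming else "no")
        model_inferences_total[labels] += 1
        inference_request_duration_seconds[labels].append(
            time.perf_counter() - start
        )


with watch_inference():
    pass  # a successful model call would go here

try:
    with watch_inference():
        raise RuntimeError("model unavailable")  # a failing model call
except RuntimeError:
    pass
```

Because failures carry `error="yes"`, an error rate is the ratio of the two counter series, and an apdex calculation can restrict itself to the durations recorded under `error="no"`.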

This also implements the new error callback for streaming implementations (for now only Anthropic).
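
A streaming call cannot be wrapped like a single request, because a failure can surface after the first chunks have already been returned. The sketch below shows one way an error callback could handle that. `StreamWatcher`, `register_error`, `finish`, and `stream_chunks` are hypothetical names chosen for illustration, and the counter is again an in-memory stand-in.

```python
from collections import Counter

# Stand-in for the counter metric, keyed by (error, streaming) labels.
model_inferences_total = Counter()


class StreamWatcher:
    """Hypothetical watcher handed to a streaming client."""

    def __init__(self):
        self.error = "no"

    def register_error(self):
        # Called by the streaming code when the stream fails part-way.
        self.error = "yes"

    def finish(self):
        # Called once, when the stream ends for any reason.
        model_inferences_total[(self.error, "yes")] += 1


def stream_chunks(chunks, watcher):
    """Yield chunks, reporting failures through the watcher's callback."""
    try:
        for chunk in chunks:
            if chunk is None:  # stand-in for a provider error mid-stream
                raise RuntimeError("stream interrupted")
            yield chunk
    except Exception:
        watcher.register_error()
        raise
    finally:
        watcher.finish()


received = []
try:
    for chunk in stream_chunks(["a", None, "b"], StreamWatcher()):
        received.append(chunk)
except RuntimeError:
    pass
```

In this sketch the inference is counted only when the generator finishes, so an error raised after the first chunk is still labelled `error="yes"`.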

For #441 (closed)
