Instrument calls to models (!501) · Merge requests · GitLab.org / ModelOps / AI Assisted (formerly Applied ML) / Code Suggestions / AI Gateway

Bob Van Landuyt requested to merge bvl/track-concurrent-requests into main Dec 13, 2023

Instrument calls to models

This adds an instrumentator that can be called around requests to different model engines. The first metric implemented here is a gauge counting the number of requests in flight.

For regular calls, the caller just needs to wrap the inference in a ModelRequestInstrumentator.watch call.

The instrumentator also supports streaming calls to models: in this case, the caller is responsible for calling finish() after the response has been completely consumed.
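To illustrate the two call patterns described above, here is a minimal sketch, not the code from this merge request. Only the names ModelRequestInstrumentator, watch and finish() come from the description; the constructor, the `stream` flag, the `_Watcher` helper and the in-memory `InFlightGauge` (a stand-in for whatever metrics gauge the project actually uses) are assumptions made for illustration.

```python
import threading
from contextlib import contextmanager


class InFlightGauge:
    """Hypothetical in-memory stand-in for a metrics gauge."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def inc(self):
        with self._lock:
            self.value += 1

    def dec(self):
        with self._lock:
            self.value -= 1


class ModelRequestInstrumentator:
    # Class name, watch() and finish() are taken from the description;
    # everything else here is an assumed shape, not the real implementation.
    class _Watcher:
        def __init__(self, gauge):
            self._gauge = gauge
            self._finished = False

        def finish(self):
            # Idempotent, so a repeated call cannot push the gauge below zero.
            if not self._finished:
                self._finished = True
                self._gauge.dec()

    def __init__(self, gauge):
        self.gauge = gauge

    @contextmanager
    def watch(self, stream=False):
        watcher = self._Watcher(self.gauge)
        self.gauge.inc()
        try:
            yield watcher
        except Exception:
            # A failed request is no longer in flight.
            watcher.finish()
            raise
        else:
            # Streaming callers call finish() themselves once the
            # response has been fully consumed.
            if not stream:
                watcher.finish()


gauge = InFlightGauge()
instrumentator = ModelRequestInstrumentator(gauge)

# Regular call: the request counts as in flight only inside the block.
with instrumentator.watch():
    during_regular_call = gauge.value


def stream_tokens(watcher):
    # Hypothetical streaming response; finish() runs once it is exhausted.
    try:
        yield from ["a", "b"]
    finally:
        watcher.finish()


# Streaming call: the gauge stays incremented after the block exits.
with instrumentator.watch(stream=True) as watcher:
    tokens = stream_tokens(watcher)
after_stream_block = gauge.value
consumed = list(tokens)
after_stream_consumed = gauge.value
```

In this sketch the regular call is counted only while the `with` block runs, while the streaming call stays counted until the generator has been drained and finish() is invoked.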

For gitlab-com/runbooks#143 (closed)

Also the groundwork for #371 (closed)

Edited Dec 15, 2023 by Bob Van Landuyt
