register aggregated suggestions to `register_model_output_length`
Problem
Currently, each suggestion overwrites the dict attribute, so only the last suggestion's tokens are counted in the Prometheus instrumentator.
Proposal
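Register the aggregated suggestions to `register_model_output_length`, so that the output of every suggestion in a response is counted rather than only the last one.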
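The difference between the current and the proposed behaviour can be illustrated with a minimal sketch. It does not reproduce the project's actual code: the two context classes, the `model_output_length` key, and the use of character length as a stand-in for token count are all assumptions made for illustration. Only the method name `register_model_output_length` comes from this issue.

```python
from collections import defaultdict


class OverwritingContext:
    """Hypothetical stand-in for the current behaviour: each call replaces the value."""

    def __init__(self):
        self.data = {}

    def register_model_output_length(self, model_output):
        # Assignment discards whatever an earlier suggestion registered.
        self.data["model_output_length"] = len(model_output)


class AggregatingContext:
    """Hypothetical stand-in for the proposal: each call adds to a running total."""

    def __init__(self):
        self.data = defaultdict(int)

    def register_model_output_length(self, model_output):
        # Accumulate so that every suggestion contributes to the metric.
        self.data["model_output_length"] += len(model_output)


# Three suggestions returned for a single request (illustrative values only).
suggestions = ["def foo():", "    return 1", "print(foo())"]

overwriting = OverwritingContext()
aggregating = AggregatingContext()
for suggestion in suggestions:
    overwriting.register_model_output_length(suggestion)
    aggregating.register_model_output_length(suggestion)

print(overwriting.data["model_output_length"])  # length of the last suggestion only
print(aggregating.data["model_output_length"])  # combined length of all suggestions
```

An alternative with the same effect would be to sum the lengths at the call site and register the total once, which leaves the method itself unchanged.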
Auto-gen
The following discussion from !822 (merged) should be addressed:
- @shinya.maeda started a discussion: Suggestion: Similarly to the token count aggregation for the `tokens_per_user_request_response`, we should register aggregated suggestions to this method. Currently, each suggestion overwrites the dict attribute, hence only the last suggestion is counted as token in the Prometheus instrumentator. @jfypk Would you mind following-up this part in a separate MR?