Validate and regulate the response format of the completion services
Problem
We recently had an SLO violation on the llm_completion metric because one of the completion services changed its response format. We temporarily fixed it by adjusting the service class itself (see !136404 (merged)). However, nothing stops us from unknowingly changing the response format again in the future, which would trigger the same incident.
Proposal
Regulate the response from the completion services. One approach is to enforce that every completion service returns a ServiceResponse.
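As a rough illustration of what enforcing this could look like, the sketch below wraps each service in a shared `execute` method that rejects any return value that is not a `ServiceResponse`. Everything here is an assumption made for illustration: `BaseCompletionService`, `InvalidResponseError`, `perform`, `GoodService`, and `BadService` are hypothetical names, and the `ServiceResponse` class is a simplified stand-in defined only so the example runs on its own, not the class from the codebase.

```ruby
# Simplified stand-in for ServiceResponse so this sketch is self-contained.
# (Assumption: the real class exposes a similar success/error interface.)
class ServiceResponse
  attr_reader :status, :payload, :message

  def self.success(payload: {})
    new(status: :success, payload: payload)
  end

  def self.error(message:, payload: {})
    new(status: :error, message: message, payload: payload)
  end

  def initialize(status:, payload: {}, message: nil)
    @status = status
    @payload = payload
    @message = message
  end

  def success?
    status == :success
  end

  def error?
    status == :error
  end
end

# Hypothetical base class: subclasses implement #perform, and #execute
# raises if the result is anything other than a ServiceResponse.
class BaseCompletionService
  InvalidResponseError = Class.new(StandardError)

  def execute
    response = perform
    return response if response.is_a?(ServiceResponse)

    raise InvalidResponseError,
      "#{self.class.name}#perform must return a ServiceResponse, got #{response.class}"
  end

  private

  def perform
    raise NotImplementedError
  end
end

# Hypothetical service that follows the contract.
class GoodService < BaseCompletionService
  private

  def perform
    ServiceResponse.success(payload: { content: 'hello' })
  end
end

# Hypothetical service that returns a bare Hash, i.e. a changed format.
class BadService < BaseCompletionService
  private

  def perform
    { content: 'hello' }
  end
end
```

With a check like this, a service that starts returning a different shape fails loudly at the call site (and in its specs) instead of surfacing later as a metric regression. Whether to raise, log, or convert the value into an error response is a design choice this sketch does not settle.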