Codestral for Chat

This issue is for evaluating all Chat use cases with Codestral models.

Model

Platforms

  • vLLM
  • AWS Bedrock
  • Azure AI

Details

  • The problem we are currently facing is described in the following support ticket: https://support.gitlab.com/hc/en-us/requests/638594
  • The problem is on the AI-Gateway side: it receives an error because it sends a message sequence that the model does not accept. I have attached the content of the exchange and the error logs for reference, and I have reproduced the same behavior with models hosted by Mistral.
  • I spoke with the Mistral team, and if I understood them correctly, they believe the issue is that the gateway finishes the conversation with the assistant role when it should use the user role for the last message. While I don't have deep expertise in this area, my own research supports this conclusion: ending a conversation with the assistant role is not supported by Mistral models.
  • Feedback from Mistral: "We tried using Codestral as a chat model with Randstad, using the "custom" model template. I debugged the LLM call, and the fix is actually quite simple to implement if you want to support Codestral: just add a `prefix: true` in the body of the last assistant message. Our API requires this if you'd like to prefill the assistant's answer with specific tokens." A request sketch illustrating this fix follows this list.
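
A minimal sketch of the fix Mistral describes, assuming an OpenAI-style chat completions endpoint as exposed by Mistral's hosted API; the endpoint URL, model name, and message contents below are illustrative and would need adapting for each platform (vLLM, AWS Bedrock, Azure AI):

```python
import requests

# Illustrative values; substitute the real endpoint and credentials.
API_URL = "https://api.mistral.ai/v1/chat/completions"
API_KEY = "..."  # placeholder

payload = {
    "model": "codestral-latest",  # assumed model identifier
    "messages": [
        {"role": "user", "content": "Explain what this function does."},
        # Today the gateway ends the conversation with a plain assistant
        # message, which Mistral rejects. Per Mistral's feedback, marking
        # the last assistant message with "prefix": true tells the API it
        # is a prefill of the assistant's answer, not a completed turn.
        {"role": "assistant", "content": "This function", "prefix": True},
    ],
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```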

Definition of Done

  • Codestral can be used to support all GA Chat features on all supported platforms
  • Examine individual inputs and outputs that scored poorly (1-2 scores); look for and document any patterns of either poor feature performance or poor LLM-judge calibration. Iterate on the model prompt to eliminate patterns of poor performance.
  • Achieve less than 20% poor answers (defined as 1s and 2s from an LLM judge, or less than 0.8 cosine similarity; see the similarity sketch after this list) using each supported model, for the areas where we have supporting validation datasets.
  • The traffic light system for self-hosted models has been updated to include quality scores, and the documentation has been updated to reflect any changes
    • The workbook for ER showing those scores has been linked either here or in a comment
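
For reference, a minimal sketch of the 0.8 cosine-similarity check, assuming model answers and reference answers have already been embedded as vectors; the helper names are illustrative, not part of the existing evaluation tooling:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_poor_answer(answer_vec: list[float], reference_vec: list[float]) -> bool:
    """An answer counts as poor when its similarity to the reference
    answer in the validation dataset falls below 0.8."""
    return cosine_similarity(answer_vec, reference_vec) < 0.8
```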