Add OpenAI GPT OSS Model on Evaluation Runner & Run Evals

Release Note

description: GitLab Duo Self-Hosted now supports additional models, including the open source OpenAI GPT OSS models and Anthropic Claude 4. OpenAI GPT OSS 20B and 120B are supported for use with GitLab Duo Self-Hosted on vLLM, Azure OpenAI, and AWS Bedrock. Claude 4 is supported on AWS Bedrock. Provide feedback on these models in issues #560016 and #550190 (closed).

documentation: 'https://docs.gitlab.com/administration/gitlab_duo_self_hosted/supported_models_and_hardware_requirements/'

Details

This issue tracks adding the newly released OpenAI GPT OSS models to Evaluation Runner.

Models

License

  • Apache 2.0

Definition of Done

  • The above GPT OSS models have been added to Evaluation Runner
  • GitLab developers can run evaluations on their features against the GPT OSS models
  • Each model has been run against the available evaluation datasets in Evaluation Runner (ER)
  • The identified bug #563341 has been remediated
  • The traffic light system for self-hosted models has been updated to include scores, and the documentation has been updated to reflect any changes
Edited by Susie Bitters