Skip to content

Add Llama 4 Maverick to Evaluation Runner

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

In order to support the validation of Self-Hosted models on Evaluation Runner as part of Self-Hosted platformization, we need to enable feature teams to test supported and relevant models.

Meta Llama 4 Maverick is a possible candidate within our supported model families for quality performance with Agentic Flows, with a score of 53% on SWE-Bench Verified. (For context, Claude Opus 4 got a 72% score on SWE-bench while Claude Haiku got a 40% score).

Llama 4 Maverick is released under the LLAMA 4 COMMUNITY LICENSE AGREEMENT - Meta license

Llama 4 Maverick is available on Fireworks.ai.

Model

Supported platforms
  • vLLM
  • Fireworks

Definition of Done

Edited by Susie Bitters