Document Hardware Requirements for Self-Hosting Our Supported Models

Summary

Document hardware requirements for customers who wish to self-host AI models. This should include recommendations for CPU, GPU, memory, and storage.

Background

Related to: Internal Slack discussion and Internal Slack discussion

Customers frequently request guidance on the hardware specifications needed to self-host our models effectively. As domain experts, we must provide clear, actionable recommendations to help them deploy our models in their environments with optimal performance.

Inference matrix for reference

Definition of done

  • A detailed guidance document outlining the hardware requirements for self-hosting our supported models, with a specific focus on vLLM and the Mistral series (for GA)
  • Updated documentation within the relevant sections of our customer-facing docs.
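
As a possible starting point for the GPU memory section of the guidance document, here is a rough sizing sketch. It only applies the standard arithmetic that weight memory is roughly parameter count times bytes per parameter. The function names, the 7e9 parameter count, and the 20% overhead for KV cache and activations are illustrative assumptions, not measured values for any of our supported models; real figures should come from benchmarking.

```python
# Rough GPU memory sizing sketch. All names and default values here are
# hypothetical placeholders for illustration, not measured requirements.

GIB = 1024 ** 3  # bytes in one gibibyte


def estimate_weight_memory_gib(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Memory needed for model weights alone, in GiB.

    bytes_per_param is 2.0 for 16-bit weights (FP16/BF16), 1.0 for 8-bit,
    and 0.5 for 4-bit quantized weights.
    """
    return num_params * bytes_per_param / GIB


def estimate_gpu_memory_gib(
    num_params: float,
    bytes_per_param: float = 2.0,
    overhead_fraction: float = 0.2,  # ASSUMPTION: placeholder for KV cache, activations, runtime
) -> float:
    """Weights plus a flat overhead fraction, in GiB.

    The true overhead depends on context length, batch size, and the serving
    engine, so this value must be replaced with measured numbers.
    """
    weights = estimate_weight_memory_gib(num_params, bytes_per_param)
    return weights * (1.0 + overhead_fraction)


if __name__ == "__main__":
    # Illustrative 7-billion-parameter model with 16-bit weights.
    params = 7e9
    print(f"weights only: {estimate_weight_memory_gib(params):.1f} GiB")
    print(f"with assumed overhead: {estimate_gpu_memory_gib(params):.1f} GiB")
```

A flat overhead fraction keeps the sketch simple. The final document would likely need per-model figures that account for context length and concurrency.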
Edited by Susie Bitters