Document Hardware Requirements for Self-Hosting Our Supported Models
Summary
Document hardware requirements for customers who wish to self-host AI models. This should include recommendations for CPU, GPU, memory, and storage.
Background
Related to: two internal Slack discussions
Customers frequently request guidance on the hardware specifications needed to self-host our models effectively. As domain experts, we must provide clear, actionable recommendations to help them deploy our models in their environments with optimal performance.
Inference matrix for reference
Definition of done
- A detailed guidance document outlining the hardware requirements for self-hosting our supported models, with a specific focus on vLLM and the Mistral series (for GA).
- Updated documentation within the relevant sections of our customer-facing docs.
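As a possible starting point for the GPU memory part of the guidance, here is a rough sizing sketch. It rests on two standard approximations: weight memory is parameter count times bytes per parameter, and KV cache memory is 2 (keys and values) times layers, KV heads, head dimension, tokens, and bytes per value. The function names and the example model shape (7e9 parameters, 32 layers, 8 KV heads, head dimension 128) are illustrative assumptions, not measured figures for any supported model. The sketch ignores activation memory, runtime overhead, and any vLLM-specific allocation behaviour, so real requirements will be higher.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GiB.

    bytes_per_param=2 corresponds to 16-bit weights (fp16/bf16).
    """
    return num_params * bytes_per_param / GIB


def kv_cache_gib(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,
) -> float:
    """Approximate KV cache size in GiB for a given context and batch size.

    The leading factor of 2 accounts for storing both keys and values.
    """
    values = 2 * num_layers * num_kv_heads * head_dim * context_tokens * batch_size
    return values * bytes_per_value / GIB


if __name__ == "__main__":
    # Hypothetical model shape, chosen only to illustrate the arithmetic.
    weights = weight_memory_gib(7e9)
    cache = kv_cache_gib(num_layers=32, num_kv_heads=8, head_dim=128,
                         context_tokens=8192)
    print(f"weights: {weights:.2f} GiB, KV cache: {cache:.2f} GiB")
```

The final document would need to replace these assumed values with the actual configuration of each supported model and validate the estimates against measurements on real deployments.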
Edited by Susie Bitters