Document serving LLMs with Ollama
What does this MR do and why?
This MR adds documentation on serving large language models (LLMs) locally. It recommends several serving frameworks and highlights Ollama, an open-source framework for running LLMs locally. The Ollama setup process is explained: installing the tool, pulling a specific model, and starting the server. An example Ollama API call is provided for reference, along with instructions for resolving port conflicts if necessary.
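As a rough illustration of the documented flow: after installing Ollama, a model is pulled with `ollama pull <model>` and the server is started with `ollama serve`, which by default listens on port 11434. The sketch below shows a minimal, non-streaming call to the server's `/api/generate` endpoint using only the Python standard library; the model name `mistral` and the prompt are placeholder assumptions, not values from the docs in this MR.

```python
import json
import urllib.request

# Default local Ollama endpoint; assumes `ollama serve` is already running.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama REST API."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Placeholder model/prompt; requires `ollama pull mistral` beforehand.
    req = build_request("mistral", "Why is the sky blue?")
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])
```

If port 11434 is already taken, Ollama can be bound to a different address by setting the `OLLAMA_HOST` environment variable before running `ollama serve` (and pointing the client URL at the same address).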
Related to #455309 (closed)