```
mistral-small:24b             8039dd90c113    14 GB     7 weeks ago
phi4:latest                   ac896e5b8b34    9.1 GB    7 weeks ago
```
I would recommend using `llama3.1:8b` ([link](https://ai.meta.com/blog/meta-llama-3-1/)) for this experiment, especially if your machine has little RAM. It is quite compact and works reasonably well[^1]. Additionally, some models, like `gemma3:12b`, do not support embeddings, so you need to choose your model carefully.
Get `llama3.1:8b` with:
```bash
ollama pull llama3.1:8b
```
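
Since embedding support varies between models, it can be worth verifying it before building the rest of the pipeline. Below is a minimal sketch of such a check, assuming the `ollama` Python package and a locally running Ollama server; the test prompt is just a placeholder.

```python
import ollama

# Request an embedding for a short test string. Models without embedding
# support will return an error here, so this doubles as a capability check.
response = ollama.embeddings(model="llama3.1:8b", prompt="hello world")

# The embedding is a list of floats; its length is the dimension of the
# vector space the model maps text into.
print(len(response["embedding"]))
```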
## Vector storage
Chroma DB is a vector database designed and optimized for storing and searching **vector embeddings**, which are crucial in RAG. Vector embeddings are numerical representations of data (text, images, audio, etc.). They capture the *meaning*, or *semantic* information, of the data, and are generated by machine learning models (often transformer models like those used in LLMs). These models are trained to map similar data points to nearby points in a high-dimensional vector space. That is precisely what we need to do with our context data for it to be useful to the LLM, and that is what we use Chroma DB for.
So, we need to initialize the Chroma DB client with persistent storage and get it ready to embed our context data. We persist the storage to disk, in the directory `chroma_db/`.
```python
import chromadb

# Configure ChromaDB
client = chromadb.PersistentClient(path="chroma_db/")
```
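
To make the idea of "nearby vectors" concrete, here is a rough sketch of how context snippets could be embedded with Ollama, stored in a Chroma collection, and retrieved by semantic similarity. The collection name, the sample documents, and the query are made up for the example and are not part of this post's code.

```python
import chromadb
import ollama

# Reuse the persistent client and create a collection for the context data.
client = chromadb.PersistentClient(path="chroma_db/")
collection = client.get_or_create_collection(name="context")  # name is arbitrary

docs = [
    "Chroma stores vector embeddings and retrieves them by similarity.",
    "Bananas are a good source of potassium.",
]

# Embed each snippet with the local model and store text + vector together.
for i, doc in enumerate(docs):
    emb = ollama.embeddings(model="llama3.1:8b", prompt=doc)["embedding"]
    collection.add(ids=[str(i)], documents=[doc], embeddings=[emb])

# A question is embedded the same way; Chroma returns the documents whose
# vectors lie closest to it in the embedding space.
question = "How does Chroma find relevant text?"
q_emb = ollama.embeddings(model="llama3.1:8b", prompt=question)["embedding"]
print(collection.query(query_embeddings=[q_emb], n_results=1)["documents"])
```

With something like this in place, the retrieved snippets can later be pasted into the prompt sent to the LLM, which is the retrieval half of RAG.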