Change the TanukiBot's distance function

What does this MR do and why?

This is the first MR for https://gitlab.com/gitlab-org/gitlab/-/issues/410581.

It changes TanukiBot's distance function from inner_product to cosine, as recommended by the OpenAI docs.

Follow-up MR: Add index to embeddings (!122035 - closed)

Screenshots or screen recordings

After running the following commands, we got these results:

current_user = User.first
client = ::Gitlab::Llm::OpenAi::Client.new(current_user)
question = 'What is Fork?'
embeddings_result = client.embeddings(input: question)
question_embedding = embeddings_result['data'].first['embedding']
Embedding::TanukiBotMvc.neighbor_for(question_embedding, limit: 7).pluck(:id)
| Before | After |
| --- | --- |
| Using inner_product as distance function | Using cosine as distance function |
| [6909, 12665, 6913, 6910, 7125, 6912, 6825] | [6909, 12665, 6913, 6910, 7125, 6912, 6825] |
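
The identical results are what one would expect if the stored embeddings are unit length. OpenAI documents its embeddings as normalized to length 1, and for unit vectors the cosine distance and the negative inner product order neighbors the same way. The following is a minimal plain-Ruby sketch of that property, not the MR's implementation: the 3-dimensional vectors are made up, and the helper names (dot, normalize, nearest_ids, and so on) are hypothetical and do not come from the GitLab codebase.

```ruby
# Illustrative sketch only: made-up vectors, hypothetical helper names.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

def norm(a)
  Math.sqrt(dot(a, a))
end

# Scale a vector to length 1.
def normalize(a)
  n = norm(a)
  a.map { |x| x / n }
end

# Negated so that a smaller value means "closer".
def inner_product_distance(a, b)
  -dot(a, b)
end

def cosine_distance(a, b)
  1.0 - dot(a, b) / (norm(a) * norm(b))
end

# Return the ids of the `limit` closest vectors under the given distance block.
def nearest_ids(question, docs, limit:, &distance)
  docs.min_by(limit) { |_id, vec| distance.call(question, vec) }.map(&:first)
end

# Hypothetical unit-length "embeddings", keyed by record id.
docs = {
  1 => normalize([0.9, 0.1, 0.0]),
  2 => normalize([0.2, 0.8, 0.1]),
  3 => normalize([0.5, 0.5, 0.5]),
  4 => normalize([0.0, 0.2, 0.9])
}
question = normalize([0.8, 0.3, 0.1])

by_inner  = nearest_ids(question, docs, limit: 3) { |q, v| inner_product_distance(q, v) }
by_cosine = nearest_ids(question, docs, limit: 3) { |q, v| cosine_distance(q, v) }

puts by_inner.inspect   # [1, 3, 2]
puts by_cosine.inspect  # [1, 3, 2]
```

With unit vectors, cosine_distance reduces to 1 plus inner_product_distance, so the two orderings cannot differ. The rankings could diverge only if some stored vectors were not normalized.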

How to set up and validate locally

N/A

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Bojan Marjanovic