Allow vertex embeddings model to be selected

What does this MR do and why?

Part 1 of 4 of Vertex text embedding model discontinuation on ... (#521836 - closed):

  1. Allow model to be specified 👈 this MR
  2. Add text-embedding-005 model to Vertex supported models on AI Gateway: MR
  3. Add embedding_1 field to work items index: MR
  4. Use the embedding_1 field and the new model during indexing, backfill embedding_1, and switch queries to use embedding_1 and the new model: MR

References

Vertex text embedding model discontinuation on ... (#521836 - closed)

https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#model_versions

How to set up and validate locally

  1. Check out gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!2302 (merged) on AI Gateway
  2. Make sure AI Gateway is running
  3. Tail the AI Gateway logs: gdk tail gitlab-ai-gateway
  4. Generate embeddings without specifying a model and note that the URL used is POST /v1/proxy/vertex-ai/v1/projects/PROJECT/locations/LOCATION/publishers/google/models/textembedding-gecko%40003%3Apredict
     Gitlab::Llm::VertexAi::Embeddings::Text.new("something", user: nil, tracking_context: {}, unit_primitive: 'semantic_search_issue').execute
  5. Generate embeddings with another model passed in and note that the URL used is now POST /v1/proxy/vertex-ai/v1/projects/PROJECT/locations/LOCATION/publishers/google/models/text-embedding-005%3Apredict
     Gitlab::Llm::VertexAi::Embeddings::Text.new("something", user: nil, tracking_context: {}, unit_primitive: 'semantic_search_issue', model: 'text-embedding-005').execute
  6. Generate embeddings with a nonexistent model passed in and note that you get the error StandardError: Could not generate embedding: '{"detail":"Unsupported model"}'
     Gitlab::Llm::VertexAi::Embeddings::Text.new("something", user: nil, tracking_context: {}, unit_primitive: 'semantic_search_issue', model: 'non-existing').execute
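
To make the expected log output easier to recognize, here is a minimal sketch of how an optional model name could map to the proxy paths quoted in the steps above. It is not the code in this MR: `predict_path` and `DEFAULT_MODEL` are hypothetical names, the default is inferred only from the URL logged when no model is passed, and PROJECT and LOCATION are the same placeholders used in the steps.

```ruby
require 'uri'

# Assumption: the default model is the one seen in the logged URL when no
# model is passed. The constant name is hypothetical.
DEFAULT_MODEL = 'textembedding-gecko@003'

# Hypothetical helper, for illustration only: builds the path that shows up
# in the AI Gateway logs. "@" and ":" are percent-encoded as %40 and %3A.
def predict_path(model: nil, project: 'PROJECT', location: 'LOCATION')
  model ||= DEFAULT_MODEL
  endpoint = URI.encode_www_form_component("#{model}:predict")

  "/v1/proxy/vertex-ai/v1/projects/#{project}/locations/#{location}" \
    "/publishers/google/models/#{endpoint}"
end

puts predict_path                               # no model given: falls back to the default
puts predict_path(model: 'text-embedding-005')  # explicit model
```

The sketch covers only the path. Whether a model is supported is decided by AI Gateway, which is why a nonexistent model surfaces as the "Unsupported model" error rather than failing on the client side.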

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #521836 (closed)

Edited by Madelein van Niekerk
