Support for multiple embedding models
Proposal
We need to be able to configure and roll out embeddings with differing dimensions. This is useful in the following cases:
- Air-gapped customers who want to run their own embedding model
- Experimenting with and changing to a new model
The challenge is that the stored embeddings and request embeddings (e.g. from a question) must use the same model.
We need some way to keep track of the model in use. Rails must set it, and the AI Gateway (AIGW) must use it to determine which model generates the embeddings. One option is a database table:
Table name: EmbeddingVersion
id
type: code|text
dimensions
name
model
status: in_progress|active|deprecated
- Embedding requests to the AIGW take EmbeddingVersion.active.for_type(type).model as a param.
- kNN searches query the "embedding_#{EmbeddingVersion.active.for_type(type).id}" field.
- The embedding field in the index mapping should be named "embedding_#{EmbeddingVersion.active.for_type(type).id}" and its dimension should be EmbeddingVersion.active.for_type(type).dimensions.
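To make the lookup concrete, here is a minimal plain-Ruby sketch of how the active version could drive the field name, the mapping fragment, and the kNN query. It is not the proposed implementation: the in-memory EmbeddingVersionRegistry, the helper names (field_name, mapping_for, knn_query_for), and the model names are all hypothetical, and the mapping and query shapes assume Elasticsearch's dense_vector field type and knn search option.

```ruby
# Plain-Ruby stand-in for the proposed EmbeddingVersion table (no Rails needed).
EmbeddingVersion = Struct.new(:id, :type, :dimensions, :name, :model, :status, keyword_init: true) do
  # Field name used both in the index mapping and in kNN queries.
  def field_name
    "embedding_#{id}"
  end
end

# Hypothetical in-memory registry; in Rails these would be scopes on the model.
class EmbeddingVersionRegistry
  def initialize(records)
    @records = records
  end

  # Rough equivalent of EmbeddingVersion.active.for_type(type), returning one record.
  def active_for_type(type)
    matches = @records.select { |r| r.status == :active && r.type == type }
    unless matches.size == 1
      raise ArgumentError, "expected exactly one active #{type} version, found #{matches.size}"
    end
    matches.first
  end
end

# Mapping fragment for the embedding field (assumes Elasticsearch dense_vector).
def mapping_for(version)
  { version.field_name => { type: "dense_vector", dims: version.dimensions } }
end

# kNN search body against the active version's field (assumes the ES knn option).
def knn_query_for(version, query_vector, k: 10)
  unless query_vector.size == version.dimensions
    raise ArgumentError, "query vector has #{query_vector.size} dims, expected #{version.dimensions}"
  end
  { knn: { field: version.field_name, query_vector: query_vector, k: k, num_candidates: k * 10 } }
end

# Example data; the model names and the tiny dimension are illustrative only.
registry = EmbeddingVersionRegistry.new([
  EmbeddingVersion.new(id: 1, type: :text, dimensions: 768, name: "v1", model: "example-model-a", status: :deprecated),
  EmbeddingVersion.new(id: 2, type: :text, dimensions: 4, name: "v2", model: "example-model-b", status: :active)
])

active = registry.active_for_type(:text)
puts active.model                       # value passed to the AIGW as the model param
puts mapping_for(active).inspect
puts knn_query_for(active, [0.1, 0.2, 0.3, 0.4]).inspect
```

Checking the query vector length against the stored dimensions catches the case where the request embedding was generated by a different model than the stored embeddings.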
We might need a UI component in which an administrator can configure the models used for embeddings on an instance. Once we know the model and its settings, we can create the mappings and start the embedding indexing process.
The idea is that this kicks off a process:
- Create a new EmbeddingVersion record with status in_progress.
- Update the mapping in ES by adding a new field called "embedding_#{record.id}" with record.dimensions dims.
- Kick off a process similar to Backfill embeddings for gitlab project (#456918 - closed) to generate embeddings with the new model. The AIGW request should take record.model as a param.
- Once the backfill is complete, mark record as active and the previous active record as deprecated.
- Kick off a process to remove the deprecated embedding field from Elasticsearch.
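The steps above can be sketched as a small state transition in plain Ruby. This is only an illustration under stated assumptions: RolloutVersion, FakeIndex, and EmbeddingRollout are hypothetical in-memory stand-ins, the backfill is represented by a block, and how a field is actually removed from Elasticsearch is not modelled.

```ruby
# Hypothetical stand-in for an EmbeddingVersion row.
RolloutVersion = Struct.new(:id, :type, :dimensions, :model, :status, keyword_init: true)

# Hypothetical stand-in for the Elasticsearch index mapping: field name => dims.
class FakeIndex
  attr_reader :fields

  def initialize
    @fields = {}
  end

  def add_field(name, dims)
    @fields[name] = dims
  end

  def remove_field(name)
    @fields.delete(name)
  end
end

class EmbeddingRollout
  attr_reader :versions

  def initialize(index, versions = [])
    @index = index
    @versions = versions
  end

  # Runs the whole rollout; the block stands in for the backfill job.
  def run(type:, dimensions:, model:)
    # 1. Create a new record with status in_progress.
    next_id = (@versions.map(&:id).max || 0) + 1
    record = RolloutVersion.new(id: next_id, type: type, dimensions: dimensions,
                                model: model, status: :in_progress)
    @versions << record

    # 2. Add the new embedding field to the mapping.
    @index.add_field("embedding_#{record.id}", record.dimensions)

    # 3. Backfill embeddings; the caller would pass record.model to the AIGW.
    yield record

    # 4. Promote the new record and deprecate the previously active one.
    previous = @versions.find { |v| v.type == type && v.status == :active }
    record.status = :active
    previous.status = :deprecated if previous

    # 5. Remove deprecated embedding fields for this type.
    @versions.select { |v| v.type == type && v.status == :deprecated }
             .each { |v| @index.remove_field("embedding_#{v.id}") }
    record
  end
end

index = FakeIndex.new
rollout = EmbeddingRollout.new(index)
backfilled_with = []

rollout.run(type: :text, dimensions: 768, model: "example-model-a") { |r| backfilled_with << r.model }
rollout.run(type: :text, dimensions: 1024, model: "example-model-b") { |r| backfilled_with << r.model }

puts rollout.versions.map { |v| [v.id, v.status] }.inspect
puts index.fields.inspect
```

Because the new record only becomes active after the backfill block returns, a failed backfill in this sketch leaves the previous version active and searches keep using the old field.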
Out of scope: supporting different embedding models per namespace. We will assume instance-wide settings.
