PoC: Embedding storage and AI in Elastic

This issue captures findings from a proof of concept (PoC) that explores using Elasticsearch (or OpenSearch) to store embeddings in Elasticsearch rather than in PostgreSQL, with the embeddings generated by a model hosted on Elasticsearch, and to generate an answer to a question using Elasticsearch exclusively. In other words, the goal is to replace the back-and-forth process that Duo Chat currently follows to answer questions about the GitLab docs.
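To make the idea concrete, here is a minimal sketch of the two request bodies such a setup might involve on Elasticsearch 8.x: an index mapping with a `dense_vector` field, and a kNN search that asks Elasticsearch to embed the question itself through `query_vector_builder`. The index name, field names, model ID, and dimension count are hypothetical and not taken from the PoC. OpenSearch uses a different vector field type and query syntax, so this sketch does not apply there. It covers only the retrieval step, not answer generation.

```python
import json

# Hypothetical values for illustration; the PoC does not specify them.
INDEX_NAME = "gitlab-docs-poc"
MODEL_ID = "example-text-embedding-model"
EMBEDDING_DIMS = 768  # assumption: must match the output size of the deployed model


def build_index_mapping(dims=EMBEDDING_DIMS):
    """Body for creating an index that stores doc text next to its embedding."""
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def build_knn_search(question, k=5, num_candidates=50):
    """Body for a kNN search where Elasticsearch embeds the question itself."""
    if k > num_candidates:
        raise ValueError("k must not exceed num_candidates")
    return {
        "knn": {
            "field": "embedding",
            "k": k,
            "num_candidates": num_candidates,
            # The question is embedded by the model deployed on Elasticsearch,
            # so the caller never handles the query vector.
            "query_vector_builder": {
                "text_embedding": {"model_id": MODEL_ID, "model_text": question}
            },
        },
        "_source": ["content"],
    }


if __name__ == "__main__":
    print(f"PUT /{INDEX_NAME}")
    print(json.dumps(build_index_mapping(), indent=2))
    print(f"POST /{INDEX_NAME}/_search")
    print(json.dumps(build_knn_search("How do I create a merge request?"), indent=2))
```

The functions only build JSON bodies, so they can be inspected without a running cluster. Sending them would need a client or plain HTTP calls against a cluster where the embedding model is already deployed.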

Given the architecture currently used by Duo Chat:

Elastic can help with the following use cases:

Example:

The AI Gateway can potentially be used to serve models to GitLab.com, Dedicated instances, and self-managed (SM) instances.

Edited Feb 01, 2024 by Madelein van Niekerk