Add support for the embedding database

Matt Kasa requested to merge mattkasa-add-embedding-db into master

What does this MR do and why?

Adds support for a new embedding database with its own schema and migrations.

Relates to #404396

This MR is part of a series. The MRs should be reviewed and merged in the following order:

MR | Status
Add support for the embedding database (!118156 - merged) (db setup) | in review
Add the tanuki_bot model (!118195 - merged) (migration) | in review
Create initial Tanuki bot api endpoint (!117695 - merged) (api) | in dev

Context

The new database needs the pgvector extension installed and enabled, because the AI experimentation features store and search embeddings. Keeping embeddings in a separate database means pgvector does not become a dependency of the main database, either on GitLab.com or for self-managed customers.

How to set up and validate locally

  1. Check out the mattkasa-add-pgvector branch in your GDK root.
  2. Run gdk config set pgvector.enabled true.
  3. Run gdk reconfigure.
  4. Check out the mattkasa-add-embedding-db branch in your GDK root.
  5. Check out the mattkasa-add-embedding-db branch in the gitlab directory.
  6. Run gdk config set gitlab.rails.databases.embedding.enabled true.
  7. Run gdk reconfigure.
  8. Run bin/rails db:migrate.
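
For convenience, the steps above can be sketched as one shell function. This is a rough sketch, not part of the MR. The `GDK_ROOT` and `DRY_RUN` variables, the `run_in` helper, and the default `~/gitlab-development-kit` location are assumptions introduced for illustration. It also assumes the `gdk` commands are run from the GDK root and `bin/rails` from the `gitlab` directory inside it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command inside a directory, or only print it
# when DRY_RUN=1 (DRY_RUN is an assumption made for this sketch).
run_in() {
  local dir="$1"; shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '(cd %s && %s)\n' "$dir" "$*"
  else
    (cd "$dir" && "$@")
  fi
}

# Runs steps 1-8 in order and stops at the first failing command.
setup_embedding_db() {
  # Assumed default location of the GDK checkout; override with GDK_ROOT.
  local gdk_root="${GDK_ROOT:-$HOME/gitlab-development-kit}"
  local gitlab_dir="$gdk_root/gitlab"

  run_in "$gdk_root" git checkout mattkasa-add-pgvector &&                              # step 1
  run_in "$gdk_root" gdk config set pgvector.enabled true &&                            # step 2
  run_in "$gdk_root" gdk reconfigure &&                                                 # step 3
  run_in "$gdk_root" git checkout mattkasa-add-embedding-db &&                          # step 4
  run_in "$gitlab_dir" git checkout mattkasa-add-embedding-db &&                        # step 5
  run_in "$gdk_root" gdk config set gitlab.rails.databases.embedding.enabled true &&    # step 6
  run_in "$gdk_root" gdk reconfigure &&                                                 # step 7
  run_in "$gitlab_dir" bin/rails db:migrate                                             # step 8
}
```

To preview the commands without running them, call `DRY_RUN=1 setup_embedding_db`. To run the setup against a checkout in another location, call `GDK_ROOT=/path/to/gdk setup_embedding_db`.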

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Dmitry Gruzd