Draft: POC repo embeddings

Jan Provaznik requested to merge jp-repo-embed into master

What does this MR do and why?

A simple POC that creates embeddings for a repository and then finds related files (epic &14107).

Run it from the Rails console:

# generate embeddings for files in repo
CodeSuggestions::RepoEmbeddings.new(User.first).create_embeddings('/opt/gitlab-development-kit/ai-assist')

# find related files for a user's question
CodeSuggestions::RepoEmbeddings.new(User.first).related_for_questions

# find files related to another file
CodeSuggestions::RepoEmbeddings.new(User.first).related_for_files

Embedding search seems promising for user instructions. For each question below, the nested list shows the top 5 related files in the AI gateway project (https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist):

v2 endpoint for chat which accepts additional context
  ai_gateway/api/v1/chat/typing.py
  ai_gateway/api/v2/__init__.py
  ai_gateway/api/v1/chat/agent.py
  ai_gateway/api/v2/chat/agent.py
  ai_gateway/models/v2/anthropic_claude.py

check that JWT was issued by AI gateway
  ai_gateway/self_signed_jwt/token_authority.py
  docs/auth.md
  ai_gateway/auth/providers.py
  ai_gateway/self_signed_jwt/__init__.py
  scripts/tests/e2e_code_suggestions.py

authenticate user using JWT
  docs/auth.md
  ai_gateway/self_signed_jwt/token_authority.py
  ai_gateway/api/v1/code/user_access_token.py
  ai_gateway/self_signed_jwt/container.py
  ai_gateway/auth/providers.py

Where do we define completions API endpoint?
  ai_gateway/api/v2/__init__.py
  ai_gateway/api/v3/code/typing.py
  ai_gateway/api/v2/code/completions.py
  ai_gateway/api/v3/code/completions.py
  docs/api.md

Where do we authenticate user?
  docs/auth.md
  scripts/tests/e2e_code_suggestions.py
  ai_gateway/auth/user.py
  tests/code_suggestions/test_authentication.py
  conftest.py
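
The top-5 lists above are the kind of output a nearest-neighbor search over embedding vectors produces. The MR description does not show how `RepoEmbeddings` ranks files, so the following is only a minimal sketch of one common approach, assuming cosine similarity between a question embedding and per-file embeddings. The helper names (`cosine_similarity`, `top_related`), the `example/...` paths, and the toy 3-dimensional vectors are all hypothetical and not taken from this branch.

```ruby
# Hypothetical sketch: rank files by cosine similarity to a query embedding.
# Nothing here comes from the MR's code; names and vectors are illustrative.

# Cosine of the angle between two equal-length numeric vectors.
# Returns 0.0 when either vector has zero length.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Returns the paths of the `limit` files most similar to the query,
# best match first.
def top_related(query_embedding, file_embeddings, limit: 5)
  file_embeddings
    .map { |path, embedding| [path, cosine_similarity(query_embedding, embedding)] }
    .sort_by { |_path, score| -score }
    .first(limit)
    .map(&:first)
end

# Toy embeddings; real ones would come from an embedding model
# and have far more dimensions.
TOY_EMBEDDINGS = {
  'example/auth.md' => [0.9, 0.1, 0.0],
  'example/user.py' => [0.8, 0.2, 0.1],
  'example/api.md' => [0.0, 0.1, 0.9]
}.freeze

question_embedding = [1.0, 0.0, 0.0]
puts top_related(question_embedding, TOY_EMBEDDINGS, limit: 2)
```

A linear scan like this is fine for a single repository in a POC; a larger corpus would more likely rely on a vector index rather than comparing against every file.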

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Edited by Jan Provaznik