
build: add script for ingestion to vertex ai search

Shinya Maeda requested to merge vertex-ai-search-ingestion into main

Note: This is a high priority MR for Solution implementation for "users can ask docu... (gitlab-org/gitlab#451215 - closed) in %17.0

What does this merge request do and why?

This MR introduces a script, invoked as make ingest, that ingests and refreshes the GitLab documentation served by Vertex AI Search (Agent Builder). This data will be used by the documentation tool of Duo Chat.

See the doc for more information.

How to set up and validate locally

See the "Ingest GitLab documentations locally" and "Test search app in GCP console" sections of the doc. We'll add an endpoint to AI Gateway later (example).

Here is a test run result:

make ingest > ingest.log 2>&1

ingest.log

  • Execution date: Wed May 1 02:00:24 AM UTC 2024
  • Execution SHA: ac69a3c5 (latest feature branch)
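The test run above redirects both stdout and stderr into ingest.log. As a minimal sketch of the same pattern that also surfaces the exit status, the hypothetical helper below (run_logged is not part of this MR) wraps any command, captures its combined output in a log file, and reports whether it succeeded:

```shell
# Hypothetical helper, not part of this MR: run a command, capture
# stdout and stderr in a log file, and propagate the exit status.
run_logged() {
  log_file="$1"
  shift
  if "$@" > "$log_file" 2>&1; then
    echo "ok: see $log_file"
  else
    status=$?
    echo "failed with exit code $status: see $log_file" >&2
    return "$status"
  fi
}

# Example usage, assuming the make ingest target from this MR is available:
#   run_logged ingest.log make ingest
```

This keeps the full output in one file for review while still letting a caller, such as a scheduled job, fail when the ingestion command fails.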

Further reading

In the future, we'll generalize the ingestion process to any public data (#446). Because docs support is high priority, and in keeping with the keep-it-simple principle, this MR is tailored to GitLab docs.

We're also working on a daily data refresh in CI/CD pipelines in "ci: ingest gitlab docs in pipeline schedules" (!774 - merged).

Merge request checklist

Edited by Shinya Maeda
