Generalize ingestion process for any public data
Problem to solve
In Solution implementation for "users can ask docu... (gitlab-org/gitlab#451215 - closed) and Connect a service to Vertex AI Search (gitlab-com/gl-infra/scalability#3453 - closed), we introduced the Vertex AI Search Agent Builder API for the GitLab documentation. Semantic search over this data is available through the POST /v1/search/gitlab-docs API.
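As a rough illustration of how a client might call this endpoint, the sketch below builds (but does not send) a request. Only the method and path come from this issue; the base URL, the `query` field in the body, and the bearer-token header are assumptions made for the example and should be checked against the actual API schema.

```python
import json
import urllib.request

# Hypothetical base URL; the real host is not stated in this issue.
BASE_URL = "https://ai-gateway.example.com"


def build_docs_search_request(query: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/search/gitlab-docs."""
    # Assumed body shape: a single "query" field. The real schema may differ.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/search/gitlab-docs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed authentication scheme, for illustration only.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_docs_search_request("How do I create a merge request?", "<token>")
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here because the host is a placeholder.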
We can add more public data to enhance GitLab's AI-powered features. For example:
- Construct GitLab REST API calls ... Search for the APIs relevant to the user's natural-language (NL) query and construct a REST API request with an LLM.
- Construct GitLab GraphQL API calls ... Search for the APIs relevant to the user's NL query and construct a GraphQL API query with an LLM.
- Pick relevant agent tools ... Select relevant tools from thousands of agent tools.
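The use cases above share one pattern: retrieve the few entries relevant to the user's query, then hand only those to the LLM. A minimal sketch of that flow follows. The word-overlap ranking is a toy stand-in for the semantic search this issue proposes, and `ApiDoc`, `rank_by_overlap`, and `build_prompt` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiDoc:
    name: str
    description: str


def rank_by_overlap(query: str, docs: list[ApiDoc], top_k: int = 1) -> list[ApiDoc]:
    """Toy stand-in for semantic search: rank docs by words shared with the query."""
    query_words = set(query.lower().split())

    def score(doc: ApiDoc) -> int:
        return len(query_words & set(doc.description.lower().split()))

    # sorted() is stable, so ties keep their original order.
    return sorted(docs, key=score, reverse=True)[:top_k]


def build_prompt(query: str, candidates: list[ApiDoc]) -> str:
    """Assemble a prompt that shows the LLM only the retrieved candidates."""
    listing = "\n".join(f"- {doc.name}: {doc.description}" for doc in candidates)
    return f"Relevant APIs:\n{listing}\n\nUser request: {query}\nWrite the API call."


docs = [
    ApiDoc("GET /projects/:id/issues", "list issues of a project"),
    ApiDoc("GET /projects/:id/merge_requests", "list merge requests of a project"),
    ApiDoc("GET /projects/:id/pipelines", "list pipelines of a project"),
]
query = "list open issues in my project"
ranked = rank_by_overlap(query, docs)
prompt = build_prompt(query, ranked)
```

The same shape would apply to agent tools: replace the API entries with tool descriptions and pass the top-ranked tools to the model instead of the full catalog.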
Proposal
Generalize the ingestion process documented in https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/vertex-ai-search-ingestion/docs/search.md so that it can handle any public data, not only the GitLab documentation.
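One possible shape for such a generalization is sketched below: each public data set is described by a small source definition, and a single ingestion function handles all of them. Everything here is an assumption for illustration, including the `PublicDataSource` and `ingest` names, the `/v1/search/{name}` path pattern (extrapolated from `/v1/search/gitlab-docs`), and the in-memory store that stands in for the real search backend.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Document:
    id: str
    content: str


@dataclass(frozen=True)
class PublicDataSource:
    name: str                                # e.g. "gitlab-docs"
    load: Callable[[], Iterable[Document]]   # source-specific loader/parser

    @property
    def search_path(self) -> str:
        # Assumed naming pattern, extrapolated from /v1/search/gitlab-docs.
        return f"/v1/search/{self.name}"


def ingest(source: PublicDataSource, upload: Callable[[str, Document], None]) -> int:
    """Run the shared ingestion loop for one source; return the document count."""
    count = 0
    for document in source.load():
        upload(source.name, document)
        count += 1
    return count


# In-memory stand-in for the real data store, for demonstration only.
store: dict[str, list[Document]] = {}


def upload_to_memory(name: str, document: Document) -> None:
    store.setdefault(name, []).append(document)


# Hypothetical source name and sample documents.
rest_api_source = PublicDataSource(
    name="gitlab-rest-api",
    load=lambda: [
        Document("issues", "GET /projects/:id/issues"),
        Document("pipelines", "GET /projects/:id/pipelines"),
    ],
)
ingested = ingest(rest_api_source, upload_to_memory)
```

With this split, adding a new data set means writing only a loader and registering a source; the upload and endpoint logic stay shared.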
Links / references
Edited by Shinya Maeda