[Code Embeddings] Allow SM instances with self-hosted AIGW to select their own embeddings model
## Context
Some Self-Managed instances have their own [self-hosted AIGW](https://docs.gitlab.com/administration/gitlab_duo_self_hosted/), where the organization wants to choose their own embeddings model.
However, the Code Embeddings Indexing pipeline only supports the Vertex AI `text-embedding-005` model.
## Solution
1. For SM instances, do not automatically set Vertex AI `text-embedding-005` as the embeddings model.
- https://gitlab.com/gitlab-org/gitlab/-/work_items/582637+
- https://gitlab.com/gitlab-org/gitlab/-/work_items/582635+
- https://gitlab.com/gitlab-org/gitlab/-/work_items/582638+
- https://gitlab.com/gitlab-org/gitlab/-/work_items/582647+
1. Ensure that the Code Embeddings Indexing pipeline and the Semantic Search tool are not enabled if no embeddings model is set.
- https://gitlab.com/gitlab-org/gitlab/-/work_items/582786+
1. Support a new embeddings model _only_ when a customer requests it and the model passes our evaluation process.
- see https://gitlab.com/groups/gitlab-org/-/epics/20110#note_2987689509 for a step-by-step guide on supporting a new embeddings model
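The second solution item above amounts to a guard on the instance setting. A minimal sketch of that check, assuming a setting that holds the model name (all names here are illustrative, not the actual GitLab classes):

```ruby
# Hypothetical sketch: gate the indexing pipeline and Semantic Search on the
# presence of a configured embeddings model. `CodeEmbeddings::Settings` and
# `enabled?` are illustrative names, not the real GitLab implementation.
module CodeEmbeddings
  Settings = Struct.new(:embeddings_model, keyword_init: true)

  # Indexing and search are only available when a model is explicitly set.
  def self.enabled?(settings)
    !settings.embeddings_model.to_s.strip.empty?
  end
end

configured   = CodeEmbeddings::Settings.new(embeddings_model: "text-embedding-005")
unconfigured = CodeEmbeddings::Settings.new(embeddings_model: nil)

puts CodeEmbeddings.enabled?(configured)   # => true
puts CodeEmbeddings.enabled?(unconfigured) # => false
```

The point of the guard is that an SM instance which has not chosen a model simply gets no embeddings features, rather than silently falling back to `text-embedding-005`.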
## BACKGROUND: Problem and Proposed Solutions
Some Self-Managed instances may run their own [self-hosted AIGW](https://docs.gitlab.com/administration/gitlab_duo_self_hosted/), where the organization selects its own embeddings generation models.
Currently, the Code Embeddings Indexing pipeline only supports the Vertex AI `text-embedding-005` model.
Supporting other models presents multiple challenges:
1. The [ActiveContext gem expects a hard-coded list of models](https://gitlab.com/gitlab-org/gitlab/-/blob/3377d9cd8d423a0f99dfe2cc4cc91a0223f39844/gems/gitlab-active-context/lib/active_context/concerns/collection.rb#L72). Currently, this hard-coded list is defined in [`Ai::ActiveContext::Collections::Code::MODELS`](https://gitlab.com/gitlab-org/gitlab/-/blob/a634d17d55e958f24138a083e411c0665f26010b/ee/lib/ai/active_context/collections/code.rb#L16).
- Proposed solution: this should be a configurable setting so that SM users can change it without asking us to update the code.
2. For now, the only model provider we have verified to be supported by AIGW is Vertex AI. We are proxying the embeddings generation request through the `/v1/proxy/vertex-ai` endpoint.
- AIGW has a proxy for Anthropic models, but Anthropic does not provide an embeddings model
- AIGW has a proxy for OpenAI, which provides an embeddings model, but we will need to verify that we can support this with gitlab~29783025 and/or ~"group::custom models"
- For embeddings models hosted by other providers (e.g. Fireworks), we may need to add implementation in AIGW to support them
- This will be handled on a case-by-case basis depending on the Self-Managed customer
3. On Rails, we only support the `text-embedding-005` model. If a customer wants to use Vertex AI's `gemini-embedding-001`, this is not possible on Rails even when it is already possible on AIGW.
   - Proposed solution: Add classes for every embeddings model we need to support. Similar to the AIGW changes, this will have to be done on a case-by-case basis depending on the Self-Managed customer.
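The proposals in items 1 and 3 above could combine into a small registry: the hard-coded `MODELS` constant becomes a lookup keyed by a configurable setting value, with one class per supported model. This is only a sketch under assumed names (`Embeddings::Base`, `lookup`, etc.), not the actual `Ai::ActiveContext` code:

```ruby
# Hypothetical sketch: one class per supported embeddings model, resolved
# from a configurable setting value instead of a hard-coded constant.
# All class and method names here are illustrative.
module Embeddings
  class Base
    def self.model_name
      raise NotImplementedError
    end
  end

  class TextEmbedding005 < Base
    def self.model_name
      "text-embedding-005"
    end
  end

  class GeminiEmbedding001 < Base
    def self.model_name
      "gemini-embedding-001"
    end
  end

  # Registry built from the known classes; adding support for a new model
  # (per a customer request) means adding one class here.
  REGISTRY = [TextEmbedding005, GeminiEmbedding001].to_h { |klass| [klass.model_name, klass] }

  # Resolve the instance's configured model name to its implementation class.
  def self.lookup(setting_value)
    REGISTRY.fetch(setting_value) do
      raise ArgumentError, "unsupported embeddings model: #{setting_value}"
    end
  end
end

puts Embeddings.lookup("gemini-embedding-001") # => Embeddings::GeminiEmbedding001
```

With this shape, an SM admin changing the setting to a supported model name switches implementations without a code change, and an unsupported name fails loudly instead of defaulting to `text-embedding-005`.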
### Priority / Order of solution implementation
For Item 1 above: this is something we can already do now.
For Items 2 and 3: as discussed, **this will have to be done on a case-by-case basis depending on the customer**. It doesn't make sense to implement something if no customer needs it.