Add failover solution for code completion (!171210)

What does this MR do and why?

Related MR: gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!1564

Related issue: #498549 (closed)

We want a failover solution for code completion, but as of now there is only one model provider, Vertex AI:

Vertex AI: code-gecko002, codestral@2405 as the primary provider

This MR therefore adds a second model provider. If Vertex AI is down, a Claude model served by the Anthropic provider is used as the backup.

This MR:

Adds a feature flag so that we can switch the code completion provider to a backup when the primary provider is down.
Adds Anthropic as a secondary provider.

How to switch from primary to secondary provider

Enable the feature flag. The provider selection checks it, e.g.:

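        if Feature.enabled?(:incident_fail_over_completion_provider, current_user)
          # Claude, hosted by Anthropic (backup provider)
          CodeSuggestions::Prompts::CodeCompletion::Anthropic.new(params)
        else
          # existing primary provider selection (Vertex AI), not shown in this excerpt
        end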
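The switch above can be tried outside the Rails app with a small stand-alone sketch. This is not the MR's implementation: the `Feature` module below is a minimal stand-in for GitLab's feature flag API, and `PrimaryPrompt`, `BackupPrompt`, and `completion_prompt` are hypothetical names used only for illustration. Only the flag name comes from this MR.

```ruby
# Minimal stand-in for GitLab's Feature API so the sketch runs on its own.
module Feature
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.disable(name)
    @enabled[name] = false
  end

  # The actor argument is accepted but ignored in this stand-in.
  def self.enabled?(name, _actor = nil)
    @enabled.fetch(name, false)
  end
end

# Hypothetical placeholders for the real prompt builders, which live under
# CodeSuggestions::Prompts::CodeCompletion in the GitLab codebase.
PrimaryPrompt = Struct.new(:params) do
  def provider
    'vertex-ai'
  end
end

BackupPrompt = Struct.new(:params) do
  def provider
    'anthropic'
  end
end

# Returns the backup prompt builder only while the incident flag is enabled.
def completion_prompt(params, current_user = nil)
  if Feature.enabled?(:incident_fail_over_completion_provider, current_user)
    BackupPrompt.new(params)
  else
    PrimaryPrompt.new(params)
  end
end
```

With the flag off, `completion_prompt({}).provider` returns the primary provider name. After `Feature.enable(:incident_fail_over_completion_provider)` it returns the backup, and `Feature.disable` switches it back.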

How to test locally

Check out this MR and the related MR.
Before switching, code completion uses Codestral.
Switch to the backup provider in the Rails console:
[2] pry(main)> Feature.enable(:incident_fail_over_completion_provider)
Trigger code completion in the editor.
Code completion works as expected.

Why use Anthropic for code completion

Self-hosted customers already use this model: https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/main/ai_gateway/prompts/definitions/code_suggestions/completions/claude_3.yml
This is a failover solution, so it should rarely be used. As long as the performance is acceptable, I think switching should be safe.

Limitations and future work

As mentioned in issue #498549 (closed):

Eventually we'll migrate to v3, and the feature flag won't work after the migration.
We can't control failover for self-managed instances; customers will need to do it themselves.
The feature flag adds complexity to the already bloated if/else logic in v2.
Using .env (instead of a feature flag) aligns with the existing failover practice. For example, if the custom models team updates or improves the existing failover process, code completion controlled by a feature flag would not be covered by that change.

So I suggest we eventually use .env to control the failover in v3.

We can't do this now, because neither Codestral nor the Anthropic provider added here is in v3 yet, and Codestral is not yet supported by the prompt registry.
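As a rough idea of what an .env-controlled switch could look like, here is a minimal sketch in Ruby, matching the snippet earlier in this description. It is an assumption, not a design for v3: the variable name `CODE_COMPLETION_FAILOVER_ENABLED` and the `completion_provider` helper are hypothetical, and the actual v3 mechanism is not defined in this MR.

```ruby
# Hypothetical variable name; this MR does not define one.
FAILOVER_ENV_VAR = 'CODE_COMPLETION_FAILOVER_ENABLED'

# Returns the provider name to use. Accepts any hash-like environment
# (ENV by default) so the behaviour can be checked without touching the
# real process environment.
def completion_provider(env = ENV)
  raw = env.fetch(FAILOVER_ENV_VAR, 'false')
  failover = %w[1 true yes].include?(raw.strip.downcase)
  failover ? 'anthropic' : 'vertex-ai'
end
```

Calling `completion_provider` with the variable unset selects the primary provider. Setting it to `true` in the instance's .env selects the backup, and self-managed instances could control this without a feature flag.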

Future work:

Migrate Codestral from v2 to v3, using the prompt registry.
Migrate Anthropic from v2 to v3, using the prompt registry.
Add a switch solution to v3 code completion.

Edited Nov 01, 2024 by Tian Gao