Add failover solution for code completion
What does this MR do and why?
related MR gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!1564 (diffs)
related issue: #498549 (closed)
We want a failover solution for code completion, but currently there is only one model provider: Vertex AI.
- Vertex AI: code-gecko002, codestral@2405 as the primary provider
Because of this, I added a second model provider: a Claude model served by Anthropic, used as a backup in case Vertex AI is down.
This MR:
- Adds a feature flag so that we can switch the code completion provider to a backup when the primary provider is down.
- Adds Anthropic as a secondary provider.
How to switch from primary to secondary provider
Enable the feature flag. When the flag is enabled, the Anthropic prompt is selected, e.g.:
if Feature.enabled?(:incident_fail_over_completion_provider, current_user)
  # Claude hosted on Anthropic
  CodeSuggestions::Prompts::CodeCompletion::Anthropic.new(params)
end
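The switch above can be sketched in isolation as follows. This is a minimal illustration, not the code in this MR: `FeatureStub`, `VertexPrompt`, `AnthropicPrompt`, and `completion_prompt` are hypothetical stand-ins for the real `Feature` class and prompt classes, and only the flag name `:incident_fail_over_completion_provider` comes from the MR.

```ruby
# Hypothetical in-memory stand-in for GitLab's Feature class (illustration only).
module FeatureStub
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.disable(name)
    @enabled[name] = false
  end

  # The actor argument is accepted but ignored in this sketch.
  def self.enabled?(name, _actor = nil)
    @enabled.fetch(name, false)
  end
end

# Hypothetical placeholders for the primary and backup prompt classes.
VertexPrompt = Struct.new(:params)
AnthropicPrompt = Struct.new(:params)

# Picks the backup prompt when the failover flag is on, the primary otherwise.
def completion_prompt(params, current_user: nil, feature: FeatureStub)
  if feature.enabled?(:incident_fail_over_completion_provider, current_user)
    AnthropicPrompt.new(params)
  else
    VertexPrompt.new(params)
  end
end
```

With the flag disabled (the default here), `completion_prompt({})` returns the primary prompt; after `FeatureStub.enable(:incident_fail_over_completion_provider)` it returns the backup prompt.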
How to test locally
- Check out this MR and the related AI gateway MR (ai-assist!1564).
- Before switching, code completion uses codestral.
- Switch by enabling the feature flag in the Rails console:
[2] pry(main)> Feature.enable(:incident_fail_over_completion_provider)
- Trigger code completion. It works as expected.
Why use Anthropic for code completion
- Self-hosted customers already use this model: https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/main/ai_gateway/prompts/definitions/code_suggestions/completions/claude_3.yml
- This is a failover solution, so it should rarely be used. As long as the performance is acceptable, I think switching should be safe.
Limitations and Future work
As mentioned in the issue #498549 (closed):
- Eventually we'll migrate to v3, and the feature flag won't work after the migration.
- We can't control failover for self-managed instances; customers will need to do it themselves.
- The feature flag adds complexity to the already bloated if/else logic in v2.
- Using .env (instead of a feature flag) aligns with the existing failover practice. For example, if the custom-model team updates or improves the existing failover process, a feature-flag-based code completion failover won't be covered by that change.
So I suggest we eventually use .env to control the failover in v3.
We can't do this now because neither codestral nor Anthropic (here) is in v3 yet, and codestral is not yet supported by the prompt registry.
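As a rough idea of what environment-controlled failover could look like, here is a minimal sketch. It is written in Ruby only to match the snippet in this description; the variable name `CODE_COMPLETION_FAILOVER_ENABLED`, the helper `completion_provider`, and the provider symbols are all hypothetical, and the eventual v3 implementation may look quite different.

```ruby
# Hypothetical variable name; neither this MR nor the issue specifies one.
FAILOVER_ENV_KEY = "CODE_COMPLETION_FAILOVER_ENABLED"

# Returns :anthropic when the failover variable holds a truthy value,
# otherwise :vertex_ai. `env` defaults to the process environment and
# can be replaced with a plain Hash for testing.
def completion_provider(env = ENV)
  value = env.fetch(FAILOVER_ENV_KEY, "false").strip.downcase
  %w[1 true yes].include?(value) ? :anthropic : :vertex_ai
end
```

Because the decision reads only from the environment, self-managed customers could flip the same setting themselves without depending on a feature flag.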
Future work:
- Migrate codestral from v2 to v3, using the prompt registry.
- Migrate Anthropic from v2 to v3, using the prompt registry.
- Add the switch solution to v3 code completion.