Introduce Claude 3.5 Haiku in GitLab-Rails
What does this MR do and why?
This merge request introduces support for the CLAUDE 3.5 HAIKU model in our Duo Chat tools, enhancing our AI capabilities with a more efficient and cost-effective option.
Note: The following model update is placed behind a feature flag so that we can evaluate the efficacy of the model in staging. This should allow us to run evaluations in staging and measure improvements to the Duo Chat tools within our daily-evals.
Implementation Details
- Added support for CLAUDE 3.5 HAIKU in the model selection logic
- Implemented a feature flag `claude_3_5_haiku_rollout` for controlled rollout
- Updated the `model` method to handle the new model option
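The flag-gated selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: `FeatureStub` is a hypothetical stand-in for GitLab's `Feature` API so the example runs outside Rails, `ToolSketch` and its constants are made-up names, and the assumption that the previous default was `claude-3-haiku-20240307` is not stated in this description. Only the flag name and the `claude-3-5-haiku-20241022` identifier come from the MR and its example log.

```ruby
# Hypothetical stand-in for GitLab's Feature API (assumption, for illustration only).
module FeatureStub
  @enabled = Hash.new { |hash, key| hash[key] = [] }

  # Enable a flag for a single actor.
  def self.enable(flag, actor)
    @enabled[flag] << actor unless @enabled[flag].include?(actor)
  end

  # Check whether a flag is enabled for the given actor.
  def self.enabled?(flag, actor)
    @enabled[flag].include?(actor)
  end
end

# Hypothetical tool class; the real class and constant names may differ.
class ToolSketch
  CLAUDE_3_HAIKU = 'claude-3-haiku-20240307'     # assumed previous default
  CLAUDE_3_5_HAIKU = 'claude-3-5-haiku-20241022' # model name seen in the example log

  def initialize(user)
    @user = user
  end

  # Return the new model only when the rollout flag is enabled for this user.
  def model
    return CLAUDE_3_5_HAIKU if FeatureStub.enabled?(:claude_3_5_haiku_rollout, @user)

    CLAUDE_3_HAIKU
  end
end

user = :example_user
puts ToolSketch.new(user).model # flag disabled: falls back to the older model
FeatureStub.enable(:claude_3_5_haiku_rollout, user)
puts ToolSketch.new(user).model # flag enabled: uses Claude 3.5 Haiku
```

Gating on the actor rather than globally keeps the rollout reversible per user, which fits the staged evaluation described in the note above.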
References
Please include cross links to any resources that are relevant to this MR. This will give reviewers and future readers helpful context to give an efficient review of the changes introduced.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Example screenshots of LangSmith traces for a few tools that previously used CLAUDE_3_HAIKU
Tool | LangSmith Trace |
---|---|
issue_reader | |
epic_reader | |
merge_reader | |
How to set up and validate locally
- Enable the feature flag `claude_3_5_haiku_rollout`: `Feature.enable(:claude_3_5_haiku_rollout, user)`
- Run the `issue_reader` tool in Duo Chat and confirm that the new model is used
Example log
2024-11-08_17:19:18.36304 gitlab-ai-gateway : 2024-11-08 18:19:18 [info ] 127.0.0.1:52495 - "POST /v2/chat/agent HTTP/1.1" 200 client_ip=127.0.0.1 client_port=52495 content_type=application/x-ndjson; charset=utf-8 correlation_id=01JC6CK3PVQJ3HK2V16KE68NME cpu_s=0.027322999999999986 duo_chat.agent_available_tools=['epic_reader', 'issue_reader', 'merge_request_reader', 'ci_editor_assistant', 'gitlab_documentation'] duo_chat.agent_tool_action=issue_reader duration_request=0.006379127502441406 duration_s=1.9949571670149453 enabled_feature_flags=expanded_ai_logging first_chunk_duration_s=1.9941708749975078 gitlab_duo_seat_count=100 gitlab_feature_enabled_by_namespace_ids= gitlab_global_user_id=hxOxGNEDbyD789OGQ3GnDe85HzwdQxebqTqjQJQ5jzs= gitlab_host_name=127.0.0.1 gitlab_instance_id=046c9556-092c-4c75-a42d-231b15cf19f2 gitlab_language_server_version=None gitlab_realm=saas gitlab_saas_duo_pro_namespace_ids=None gitlab_version=17.6.0 http_version=1.1 meta.feature_category=duo_chat method=POST path=/v2/chat/agent request_arrived_at=2024-11-08T17:19:16.367799+00:00 response_start_duration_s=0.0024274999741464853 status_code=200 tracked_internal_events=['request_ask_issue', 'request_duo_chat'] url=http://0.0.0.0:5052/v2/chat/agent user_agent=Ruby
2024-11-08_17:19:18.38984 gitlab-ai-gateway : CRITICAL:codesuggestions:Auth is disabled, all users allowed
2024-11-08_17:19:18.39047 gitlab-ai-gateway : 2024-11-08 18:19:18 [debug ] codegen anthropic call: correlation_id=01JC6CK3PVQJ3HK2V16KE68NME max_tokens=4096 stop_sequences=['\n\nHuman', 'Observation:'] temperature=0.1 timeout=Timeout(connect=5.0, read=30.0, write=30.0, pool=30.0) top_k=NOT_GIVEN top_p=NOT_GIVEN
2024-11-08_17:19:18.39114 gitlab-ai-gateway : 2024-11-08 18:19:18 [info ] Request to LLM correlation_id=01JC6CK3PVQJ3HK2V16KE68NME request_content_json={'max_tokens': 4096, 'messages': [{'role': 'user', 'content': 'Please identify the author of #42 issue'}, {'role': 'assistant', 'content': 'The user is asking about the author of issue #42, and they are currently viewing an issue page. I should use the issue_reader tool to get more information about this specific issue.\n```json\n {\n "ResourceIdentifierType": "'}], 'model': 'claude-3-5-haiku-20241022', 'stop_sequences': ['\n\nHuman', 'Observation:'], 'stream': True, 'system': 'You can fetch information about a resource called: an issue.\nAn issue can be referenced by url or numeric IDs preceded by symbol.\nAn issue can also be referenced by a GitLab reference. A GitLab reference ends with a number preceded by the delimiter # and contains one or more /.\nResourceIdentifierType can only be one of [current, iid, url, reference].\nResourceIdentifier can be number, url. If ResourceIdentifier is not a number or a url, use "current".\nWhen you see a GitLab reference, ResourceIdentifierType should be reference.\n\nMake sure the response is a valid JSON. 
The answer should be just the JSON without any other commentary!\nReferences in the given question to the current issue can be also for example "this issue" or "that issue",\nreferencing the issue that the user currently sees.\nQuestion: (the user question)\nResponse (follow the exact JSON response):\n```json\n{\n "ResourceIdentifierType": <ResourceIdentifierType>\n "ResourceIdentifier": <ResourceIdentifier>\n}\n```\n\nExamples of issue reference identifier:\n\nQuestion: The user question or request may include https://some.host.name/some/long/path/-/issues/410692\nResponse:\n```json\n{\n "ResourceIdentifierType": "url",\n "ResourceIdentifier": "https://some.host.name/some/long/path/-/issues/410692"\n}\n```\n\nQuestion: the user question or request may include: #12312312\nResponse:\n```json\n{\n "ResourceIdentifierType": "iid",\n "ResourceIdentifier": 12312312\n}\n```\n\nQuestion: the user question or request may include long/groups/path#12312312\nResponse:\n```json\n{\n "ResourceIdentifierType": "reference",\n "ResourceIdentifier": "long/groups/path#12312312"\n}\n```\n\nQuestion: Summarize the current issue\nResponse:\n```json\n{\n "ResourceIdentifierType": "current",\n "ResourceIdentifier": "current"\n}\n```\n\nBegin!\n', 'temperature': 0.1} request_method=POST request_url=URL('https://api.anthropic.com/v1/messages') source=ai_gateway.models.base
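To check which model a request used without reading the full log line by eye, something like the following could help. It is a rough sketch that assumes the `Request to LLM` line keeps the `'model': '...'` shape shown in the example log; `model_from_log_line` is a hypothetical helper, not part of GitLab or the AI Gateway, and the sample line is shortened for readability.

```ruby
# Matches the first 'model': '<name>' pair in a "Request to LLM" log line.
MODEL_PATTERN = /'model':\s*'([^']+)'/

# Returns the model name found in the line, or nil when no model is logged.
def model_from_log_line(line)
  match = MODEL_PATTERN.match(line)
  match && match[1]
end

# Shortened sample based on the example log above.
sample = "[info ] Request to LLM request_content_json={'max_tokens': 4096, " \
         "'model': 'claude-3-5-haiku-20241022', 'stream': True}"

puts model_from_log_line(sample)
```

Because the pattern returns the first match, a prompt that itself contains the text `'model': '...'` before the real field would produce a misleading result; for the tool prompts shown here that is not the case.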