feat: removal of Claude 2_1 usage (!2423) · Merge requests · GitLab.org / ModelOps / AI Assisted (formerly Applied ML) / Code Suggestions / AI Gateway

What does this merge request do and why?

This merge request makes the prompt registry the default access point for v2/code/generations requests. Some older self-hosted GitLab instances still use the deprecated Anthropic claude_2.1 model, which will cause issues after July 25th. Defaulting to the prompt registry keeps tracking metrics accurate and avoids downtime for those older self-hosted customer instances.

This change stays backward compatible with the model and model_provider inputs while defaulting to the prompt registry.
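The routing described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code in this merge request: `resolve_route`, `Route`, `DEFAULT_PROMPT_ID`, and `RETIRED_MODELS` are hypothetical names, and the rule that a retired model falls through to the registry is an assumption drawn from the description, not from the diff.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default. The prompt id value is the one used in the validation request.
DEFAULT_PROMPT_ID = "code_suggestions/generations"

# Assumption: models treated as retired. The real list is not shown in this MR.
RETIRED_MODELS = {"claude-2.1"}


@dataclass
class Route:
    kind: str    # "registry" or "direct"
    target: str  # prompt id for "registry", model name for "direct"


def resolve_route(
    prompt_id: Optional[str] = None,
    model_provider: Optional[str] = None,
    model_name: Optional[str] = None,
) -> Route:
    """Pick where a v2/code/generations request goes (illustrative only)."""
    # An explicit prompt_id always goes through the prompt registry.
    if prompt_id:
        return Route("registry", prompt_id)
    # Backward compatibility: explicit, still-supported model inputs are honoured.
    if model_provider and model_name and model_name not in RETIRED_MODELS:
        return Route("direct", model_name)
    # Everything else, including retired models, defaults to the registry.
    return Route("registry", DEFAULT_PROMPT_ID)


print(resolve_route(model_provider="anthropic", model_name="claude-2.1"))
```

Under these assumptions, an older instance that still sends the retired model is served by the registry prompt instead of failing, while a request that names a supported model behaves as before.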

Production usage: https://log.gprd.gitlab.net/app/r/s/g2F3h

How to set up and validate locally

Curl request to the v2/code/generations endpoint

curl --request POST \
  --url http://127.0.0.1:5052/v2/code/generations \
  --header 'Content-Type: application/json' \
  --header 'accept: application/json' \
  --header 'x-gitlab-enabled-feature-flags: expanded_ai_logging' \
  --data '{
  "current_file": {
    "file_name": "main.go",
    "content_above_cursor": "package main\n\nimport \"fmt\"\n\n// Generate a function to print hello world\nfunc ",
    "content_below_cursor": "\n"
  },
  "prompt_version": 3,
  "prompt": [
    {
      "role": "user",
      "content": "Generate a hello world function in Go"
    }
  ],
  "model_provider": "anthropic",
  "model_name": "claude-3-5-sonnet-20241022"
}'

Output:
2025-05-08 10:37:16 [debug    ] codegen anthropic call:        correlation_id=01a55378a6f34d2084878343bfa8b51a max_tokens=4096 stop_sequences=['</new_code>'] temperature=0.2 timeout=Timeout(connect=5.0, read=30.0, write=30.0, pool=30.0) top_k=NOT_GIVEN top_p=NOT_GIVEN
2025-05-08 10:37:16 [info     ] Request to LLM                 api_key=None correlation_id=01a55378a6f34d2084878343bfa8b51a request_content_json={'max_tokens': 4096, 'messages': [{'role': 'user', 'content': 'Generate a hello world function in Go'}], 'model': 'claude-3-5-sonnet-20241022', 'stop_sequences': ['</new_code>'], 'stream': False, 'temperature': 0.2} request_method=POST request_url=URL('https://api.anthropic.com/v1/messages') source=ai_gateway.models.base
2025-05-08 10:37:21 [info     ] Request to LLM complete        correlation_id=01a55378a6f34d2084878343bfa8b51a duration=4.665634832999785 source=ai_gateway.instrumentators.model_requests
2025-05-08 10:37:21 [debug    ] code creation suggestion:      api_key=None correlation_id=01a55378a6f34d2084878343bfa8b51a language=go score=100000 suggestion='Here are a few ways to create a "Hello, World!" program in Go:\n\n1. Simple version:\n```go\npackage main\n\nfunc main() {\n    println("Hello, World!")\n}\n```\n\n2. Using fmt package (more common):\n```go\npackage main\n\nimport "fmt"\n\nfunc main() {\n    fmt.Println("Hello, World!")\n}\n```\n\n3. As a function that returns a string:\n```go\npackage main\n\nimport "fmt"\n\nfunc helloWorld() string {\n    return "Hello, World!"\n}\n\nfunc main() {\n    message := helloWorld()\n    fmt.Println(message)\n}\n```\n\n4. With a parameter:\n```go\npackage main\n\nimport "fmt"\n\nfunc helloWorld(name string) string {\n    return fmt.Sprintf("Hello, %s!", name)\n}\n\nfunc main() {\n    message := helloWorld("World")\n    fmt.Println(message)\n}\n```\n\nTo run any of these programs:\n1. Save the code in a file with a `.go` extension (e.g., `hello.go`)\n2. Open a terminal in the directory containing the file\n3. Run `go run hello.go`\n\nThe output will be "Hello, World!" in all cases.'
2025-05-08 10:37:21 [info     ] 127.0.0.1:56061 - "POST /v2/code/generations HTTP/1.1" 200 blocked=False client_ip=127.0.0.1 client_port=56061 content_type=application/json correlation_id=01a55378a6f34d2084878343bfa8b51a cpu_s=0.03933199999999992 duration_request=-1 duration_s=10.534173834006651 editor_lang=None enabled-instance-verbose-ai-logs=False enabled_feature_flags=expanded_ai_logging first_chunk_duration_s=10.534132834000047 gitlab_duo_seat_count=None gitlab_feature_enabled_by_namespace_ids=None gitlab_feature_enablement_type=None gitlab_global_user_id=None gitlab_host_name=None gitlab_instance_id=None gitlab_language_server_version=None gitlab_realm=None gitlab_saas_duo_pro_namespace_ids=None gitlab_version=None http_version=1.1 inference_duration_s=6.227097333990969 lang=go meta.feature_category=code_suggestions method=POST model_engine=anthropic model_name=claude-3-5-sonnet-20241022 model_output_length=931 model_output_length_stripped=740 model_output_score=100000 path=/v2/code/generations prompt_length=84 prompt_length_stripped=76 request_arrived_at=2025-05-08T16:37:10.972210+00:00 response_start_duration_s=10.534073125003488 status_code=200 url=http://127.0.0.1:5052/v2/code/generations user_agent=curl/8.7.1

Curl request to the v2/code/generations endpoint with the prompt registry

curl --request POST \
  --url http://127.0.0.1:5052/v2/code/generations \
  --header 'Content-Type: application/json' \
  --header 'accept: application/json' \
  --header 'x-gitlab-enabled-feature-flags: expanded_ai_logging' \
  --data '{
  "current_file": {
    "file_name": "main.go",
    "content_above_cursor": "package main\n\nimport \"fmt\"\n\n// Generate a function to print hello world\nfunc ",
    "content_below_cursor": "\n"
  },
  "prompt_version": 3,
  "prompt": [
    {
      "role": "user",
      "content": "Generate a hello world function in Go"
    }
  ],
  "prompt_id": "code_suggestions/generations"
}'


Output:
CRITICAL:codesuggestions:Auth is disabled, all users allowed
2025-05-08 10:40:37 [info     ] Initializing prompt registry from local yaml correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 custom_models_enabled=False default_prompts={}
2025-05-08 10:40:37 [debug    ] code creation input:           api_key=********** correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 current_file_name=main.go endpoint=None prefix='package main\n\nimport "fmt"\n\n// Generate a function to print hello world\nfunc ' prompt=[Message(role=<Role.USER: 'user'>, content='Generate a hello world function in Go')] stream=False suffix='\n'
2025-05-08 10:40:37 [info     ] Resolved prompt id             correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 prompt_id=code_suggestions/generations/base
2025-05-08 10:40:37 [info     ] Returning prompt from the registry correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 prompt_id=code_suggestions/generations/base prompt_name='Claude 3.7 Sonnet Code Generations Agent' prompt_version=^1.0.0
2025-05-08 10:40:38 [info     ] Performing LLM request         api_key=None correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 prompt="System: You are a tremendously accurate and skilled coding autocomplete agent. We want to generate new  code inside the\nfile 'main.go' based on instructions from the user.\n\nThe new code you will generate will start at the position of the cursor, which is currently indicated by the {{cursor}} tag.\nIn your process, first, review the existing code to understand its logic and format. Then, try to determine the most\nlikely new code to generate at the cursor position to fulfill the instructions.\n\nThe comment directly before the {{cursor}} position is the instruction,\nall other comments are not instructions.\n\nWhen generating the new code, please ensure the following:\n1. It is valid None code.\n2. It matches the existing code's variable, parameter and function names.\n3. It does not repeat any existing code. Do not repeat code that comes before or after the cursor tags. This includes cases where the cursor is in the middle of a word.\n4. If the cursor is in the middle of a word, it finishes the word instead of repeating code before the cursor tag.\n5. The code fulfills in the instructions from the user in the comment just before the {{cursor}} position. All other comments are not instructions.\n6. Do not add any comments that duplicates any of the already existing comments, including the comment with instructions.\n\nReturn new code enclosed in <new_code></new_code> tags. We will then insert this at the {{cursor}} position.\nIf you are not able to write code based on the given instructions return an empty result like <new_code></new_code>.\nHuman: Generate a hello world function in Go\nAI: <new_code>"
2025-05-08 10:40:38 [info     ] Request to LLM                 api_key=None correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 request_content_json={'max_tokens': 2048, 'messages': [{'role': 'user', 'content': 'Generate a hello world function in Go'}, {'role': 'assistant', 'content': '<new_code>'}], 'model': 'claude-3-7-sonnet-20250219', 'stop_sequences': ['</new_code>'], 'system': "You are a tremendously accurate and skilled coding autocomplete agent. We want to generate new  code inside the\nfile 'main.go' based on instructions from the user.\n\nThe new code you will generate will start at the position of the cursor, which is currently indicated by the {{cursor}} tag.\nIn your process, first, review the existing code to understand its logic and format. Then, try to determine the most\nlikely new code to generate at the cursor position to fulfill the instructions.\n\nThe comment directly before the {{cursor}} position is the instruction,\nall other comments are not instructions.\n\nWhen generating the new code, please ensure the following:\n1. It is valid None code.\n2. It matches the existing code's variable, parameter and function names.\n3. It does not repeat any existing code. Do not repeat code that comes before or after the cursor tags. This includes cases where the cursor is in the middle of a word.\n4. If the cursor is in the middle of a word, it finishes the word instead of repeating code before the cursor tag.\n5. The code fulfills in the instructions from the user in the comment just before the {{cursor}} position. All other comments are not instructions.\n6. Do not add any comments that duplicates any of the already existing comments, including the comment with instructions.\n\nReturn new code enclosed in <new_code></new_code> tags. We will then insert this at the {{cursor}} position.\nIf you are not able to write code based on the given instructions return an empty result like <new_code></new_code>.", 'temperature': 0.2} request_method=POST request_url=URL('https://api.anthropic.com/v1/messages') source=ai_gateway.models.base
2025-05-08 10:40:40 [info     ] Request to LLM complete        correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 duration=1.6099146670021582 source=ai_gateway.instrumentators.model_requests
2025-05-08 10:40:40 [debug    ] code creation suggestion:      api_key=None correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 language=go score=100000 suggestion='\npackage main\n\nimport "fmt"\n\nfunc main() {\n\tfmt.Println("Hello, World!")\n}\n'
2025-05-08 10:40:40 [info     ] 127.0.0.1:56273 - "POST /v2/code/generations HTTP/1.1" 200 blocked=False client_ip=127.0.0.1 client_port=56273 content_type=application/json correlation_id=ddfa13e8b74647e0a6d7112d2c36c9e4 cpu_s=0.30722000000000005 duration_request=-1 duration_s=2.4045842089981306 editor_lang=None enabled-instance-verbose-ai-logs=False enabled_feature_flags=expanded_ai_logging first_chunk_duration_s=2.4045324589969823 gitlab_duo_seat_count=None gitlab_feature_enabled_by_namespace_ids=None gitlab_feature_enablement_type=None gitlab_global_user_id=None gitlab_host_name=None gitlab_instance_id=None gitlab_language_server_version=None gitlab_realm=None gitlab_saas_duo_pro_namespace_ids=None gitlab_version=None http_version=1.1 inference_duration_s=1.6107477079931414 lang=go meta.feature_category=code_suggestions method=POST model_engine=agent model_name='Claude 3.7 Sonnet Code Generations Agent' model_output_length=74 model_output_length_stripped=61 model_output_score=100000 path=/v2/code/generations prompt_length=84 prompt_length_stripped=76 request_arrived_at=2025-05-08T16:40:37.758916+00:00 response_start_duration_s=2.4044687089917716 status_code=200 url=http://127.0.0.1:5052/v2/code/generations user_agent=curl/8.7.1
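To compare the two runs without reading the full access-log lines, the last line of each output can be reduced to a few fields. The sketch below assumes the key=value console format shown in the logs above. `parse_access_log` is a hypothetical helper, not part of AI Gateway, and the sample lines are abbreviated copies of the two access-log entries.

```python
import shlex


def parse_access_log(line: str) -> dict[str, str]:
    """Collect key=value tokens from one console access-log line.

    shlex takes care of quoted values such as the quoted model_name.
    This is only meant for the final access-log line of each run: the
    debug lines that carry prompts or suggestions contain nested quotes
    that this simple split may not handle.
    """
    fields: dict[str, str] = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


# Abbreviated copies of the two access-log lines from the runs above.
direct_run = (
    '2025-05-08 10:37:21 [info     ] 127.0.0.1:56061 - "POST /v2/code/generations HTTP/1.1" 200 '
    "duration_s=10.534173834006651 model_engine=anthropic "
    "model_name=claude-3-5-sonnet-20241022 model_output_length=931"
)
registry_run = (
    '2025-05-08 10:40:40 [info     ] 127.0.0.1:56273 - "POST /v2/code/generations HTTP/1.1" 200 '
    "duration_s=2.4045842089981306 model_engine=agent "
    "model_name='Claude 3.7 Sonnet Code Generations Agent' model_output_length=74"
)

for label, line in (("direct", direct_run), ("registry", registry_run)):
    fields = parse_access_log(line)
    print(label, fields["model_engine"], fields["model_name"], fields["model_output_length"])
```

In the logged runs, the direct request reports model_engine=anthropic with the raw model id, while the prompt registry request reports model_engine=agent with the prompt name as model_name.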

Merge request checklist

Tests added for new functionality. If not, please raise an issue to follow up.
Documentation added/updated, if needed.

Edited May 08, 2025 by Nathan Weinshenker
