Strip whitespaces including new line in prompt (!137365) · Merge requests · GitLab.org / GitLab

Tan Le requested to merge strip-newline-code-generation-template into master Nov 20, 2023

What does this MR do and why?

This fixes an issue in the code generation task with Anthropic models. A trailing newline character in the prompt template makes the model assume the newline is already part of the original code, so it does not include a leading newline in its output. We attempted to fix this in the AI Gateway as part of the original issue gitlab-org/modelops/applied-ml/code-suggestions/ai-assist#339 (closed). However, that resulted in a regression, gitlab-org/editor-extensions/gitlab-lsp#61 (closed), where newlines are included in cases where they should not be.

This change ensures that whitespace, including newlines, is stripped from the prompt so the model returns the correct output. This removes the need to always prepend a newline to the output in the AI Gateway (MR). This MR can be deployed independently of gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!464 (merged).
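A minimal sketch of the idea, assuming the prompt is assembled from a heredoc. The helper `build_prompt`, its arguments, and the shortened prompt text are hypothetical and are not taken from this MR's diff.

```ruby
# Hypothetical helper for illustration; not the code changed in this MR.
def build_prompt(existing_code, instruction)
  # A heredoc body always ends with "\n". `.strip` removes it (and any other
  # leading or trailing whitespace), so the prompt ends exactly at "<new_code>".
  <<~PROMPT.strip
    Human: You are a coding autocomplete agent.

    <existing_code>
    #{existing_code}<cursor></existing_code>

    <instruction>
    #{instruction}
    </instruction>

    Assistant: <new_code>
  PROMPT
end

prompt = build_prompt(
  "def foo:\n  # generate a function for the square root",
  "generate a function for the square root"
)
puts prompt.end_with?("Assistant: <new_code>") # prints "true"
```

Without the `.strip` call, the same heredoc would end with `<new_code>` followed by a newline.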

How to set up and validate locally

Via Anthropic API

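We can observe this bug directly via the Anthropic API, using a prompt generated from the template by the GitLab Rails API. The two requests below differ only in whether the prompt ends with a newline after `Assistant: <new_code>`.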

Before this MR

curl --request POST \
  --url https://api.anthropic.com/v1/complete \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <API_KEY>' \
  --data '{
	"model": "claude-2",
	"prompt": "Human: You are a coding autocomplete agent. We want to generate new Python code inside the\nfile '\''hello.py'\'' based on instructions from the user.\nThe existing code is provided in <existing_code></existing_code> tags.\nThe new code you will generate will start at the position of the cursor, which is currently indicated by the <cursor> XML tag.\nIn your process, first, review the existing code to understand its logic and format. Then, try to determine the most\nlikely new code to generate at the cursor position to fulfill the instructions.\n\nWhen generating the new code, please ensure the following:\n1. It is valid Python code.\n2. It matches the existing code'\''s variable, parameter and function names.\n3. It does not repeat any existing code. Do not repeat code that comes before or after the cursor tags. This includes cases where the cursor is in the middle of a word.\n\nReturn new code enclosed in <new_code></new_code> tags. We will then insert this at the <cursor> position.\nPlease consider indentation and new line correctly.\nIf you are not able to write code based on the given instructions return an empty result like <new_code></new_code>.\n\nHere are a few examples of successfully generated code by other autocomplete agents:\n\n<examples>\n<example>\n<existing_code>\nclass Project:\n  def __init__(self, name, public):\n    self.name = name\n    self.visibility = '\''PUBLIC'\'' if public\n\n  # is this project public?\n<cursor>\n\n    # print name of this project\n       </existing_code>\n\n<new_code>  def is_public(self):\n    self.visibility == '\''PUBLIC'\''</new_code>\n</example>\n\n<example>\n<existing_code>\n# get the current user'\''s name from the session data\ndef get_user(session):\n<cursor>\n\n# is the current user an admin\n  </existing_code>\n\n<new_code>username = None\nif '\''username'\'' in session:\n  username = session['\''username'\'']\nreturn username</new_code>\n</example>\n</examples>\n\n\n<existing_code>\ndef foo:\n  # 
generate a function for the square root<cursor></existing_code>\n\n\nHere are instructions provided in <instruction></instruction> tags.\n\n<instruction>\ngenerate a function for the square root\n</instruction>\n\n\nAssistant: <new_code>",
	"max_tokens_to_sample": 30,
	"stop_sequences": ["</new_code>", "\n\nHuman:"]
}'

{
  "completion": "\ndef square_root(x):\n  return x ** 0.5\n",
  "stop_reason": "stop_sequence",
  "model": "claude-2.0",
  "truncated": false,
  "stop": "</new_code>",
  "log_id": "0df502e64f211c14c5c224922e92dfd4002affff3174deef6ff6b1c8024691e9",
  "exception": null
}
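The shell quoting in the request above (`'\''` for every apostrophe) is easy to get wrong. As a hedged alternative, the same JSON body can be built in Ruby. The method name `build_complete_request` is hypothetical. The URL, headers, and body fields are copied from the cURL command, and the sketch only builds the request without sending it.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: build (but do not send) the POST request that the
# cURL command above issues against the Anthropic completion endpoint.
def build_complete_request(prompt:, api_key:)
  uri = URI("https://api.anthropic.com/v1/complete")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request["x-api-key"] = api_key
  # JSON.generate escapes quotes and newlines in the prompt for us.
  request.body = JSON.generate(
    model: "claude-2",
    prompt: prompt,
    max_tokens_to_sample: 30,
    stop_sequences: ["</new_code>", "\n\nHuman:"]
  )
  request
end

request = build_complete_request(prompt: "Human: ...\n\nAssistant: <new_code>", api_key: "<API_KEY>")
puts JSON.parse(request.body)["model"] # prints "claude-2"
```

To send it, the request can be passed to `Net::HTTP.start(host, port, use_ssl: true) { |http| http.request(request) }` with the host and port of the same URL.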

After this MR

curl --request POST \
  --url https://api.anthropic.com/v1/complete \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <API_KEY>' \
  --data '{
	"model": "claude-2",
	"prompt": "Human: You are a coding autocomplete agent. We want to generate new Python code inside the\nfile '\''hello.py'\'' based on instructions from the user.\nThe existing code is provided in <existing_code></existing_code> tags.\nThe new code you will generate will start at the position of the cursor, which is currently indicated by the <cursor> XML tag.\nIn your process, first, review the existing code to understand its logic and format. Then, try to determine the most\nlikely new code to generate at the cursor position to fulfill the instructions.\n\nWhen generating the new code, please ensure the following:\n1. It is valid Python code.\n2. It matches the existing code'\''s variable, parameter and function names.\n3. It does not repeat any existing code. Do not repeat code that comes before or after the cursor tags. This includes cases where the cursor is in the middle of a word.\n\nReturn new code enclosed in <new_code></new_code> tags. We will then insert this at the <cursor> position.\nPlease consider indentation and new line correctly.\nIf you are not able to write code based on the given instructions return an empty result like <new_code></new_code>.\n\nHere are a few examples of successfully generated code by other autocomplete agents:\n\n<examples>\n<example>\n<existing_code>\nclass Project:\n  def __init__(self, name, public):\n    self.name = name\n    self.visibility = '\''PUBLIC'\'' if public\n\n  # is this project public?\n<cursor>\n\n    # print name of this project\n       </existing_code>\n\n<new_code>  def is_public(self):\n    self.visibility == '\''PUBLIC'\''</new_code>\n</example>\n\n<example>\n<existing_code>\n# get the current user'\''s name from the session data\ndef get_user(session):\n<cursor>\n\n# is the current user an admin\n  </existing_code>\n\n<new_code>username = None\nif '\''username'\'' in session:\n  username = session['\''username'\'']\nreturn username</new_code>\n</example>\n</examples>\n\n\n<existing_code>\ndef foo:\n  # 
generate a function for the square root<cursor></existing_code>\n\n\nHere are instructions provided in <instruction></instruction> tags.\n\n<instruction>\ngenerate a function for the square root\n</instruction>\n\n\nAssistant: <new_code>\n",
	"max_tokens_to_sample": 30,
	"stop_sequences": ["</new_code>", "\n\nHuman:"]
}'

{
  "completion": "import math\n\ndef square_root(x):\n  return math.sqrt(x)\n",
  "stop_reason": "stop_sequence",
  "model": "claude-2.0",
  "truncated": false,
  "stop": "</new_code>",
  "log_id": "544a47632a2ab1dabb8325e2a79796d666a75a8f4f27e40dd690cab8cb290f35",
  "exception": null
}
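Why the leading newline matters becomes visible once a completion is placed at the cursor. The sketch below is illustrative only: `insert_at_cursor` is a hypothetical helper, and the existing code is reduced to the single comment line that precedes the `<cursor>` tag. The two completion strings are copied from the responses above.

```ruby
# Hypothetical helper: place a completion where the <cursor> tag sits.
def insert_at_cursor(existing_code, completion)
  # The block form avoids treating backslashes in the completion as
  # back-references.
  existing_code.sub("<cursor>") { completion }
end

# Simplified last line of the existing code from the requests above.
existing = "# generate a function for the square root<cursor>"

# Completions copied from the first and second response above.
first_completion = "\ndef square_root(x):\n  return x ** 0.5\n"
second_completion = "import math\n\ndef square_root(x):\n  return math.sqrt(x)\n"

# With a leading newline the new code starts on its own line.
puts insert_at_cursor(existing, first_completion).lines.first
# prints "# generate a function for the square root"

# Without it the new code is glued onto the comment line.
puts insert_at_cursor(existing, second_completion).lines.first
# prints "# generate a function for the square rootimport math"
```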

The full stack

We can verify this with the whole GitLab Rails <-> AI Gateway <-> Anthropic API setup.

Run a local instance of AI Gateway with Anthropic API
Set up AI Gateway integration with GDK.

# Add ANTHROPIC_API_KEY to `.env` file

# Build the container
docker buildx build --platform linux/amd64 \
  -t ai-gateway:dev .

# Run the container
docker run --platform linux/amd64 --rm \
  -p 5999:5052 \
  -e AUTH_BYPASS_EXTERNAL=true \
  -v $PWD:/app -it ai-gateway:dev

Update GitLab to point to the local AI Gateway

diff --git ee/lib/code_suggestions/tasks/base.rb ee/lib/code_suggestions/tasks/base.rb
index b63de2ba4c05..0097b90f64b1 100644
--- ee/lib/code_suggestions/tasks/base.rb
+++ ee/lib/code_suggestions/tasks/base.rb
@@ -3,7 +3,7 @@
 module CodeSuggestions
   module Tasks
     class Base
-      DEFAULT_CODE_SUGGESTIONS_URL = 'https://codesuggestions.gitlab.com'
+      DEFAULT_CODE_SUGGESTIONS_URL = 'http://codesuggestions.gdk.test:5999'
       AI_GATEWAY_CONTENT_SIZE = 100_000
 
       def initialize(params: {}, unsafe_passthrough_params: {})

Send a cURL request to the Code Suggestions Rails API (use a PAT with the api scope).

curl --request POST \
  --url http://gdk.test:8080/api/v4/code_suggestions/completions \
  --header 'Content-Type: application/json' \
  --header 'authorization: Bearer <PAT>' \
  --data '{
    "prompt_version": 2,
    "project_path": "awesome_project",
    "project_id": 278964,
    "current_file": {
      "file_name": "hello.go",
      "content_above_cursor": "// Generate a function to print hello world\nfunc print",
      "content_below_cursor": ""
    }
  }'

Ensure the output does not include a leading newline:

{
  "id": "id",
  "model": {
    "engine": "anthropic",
    "name": "claude-2",
    "lang": "go"
  },
  "experiments": [],
  "object": "text_completion",
  "created": 1700462919,
  "choices": [
    {
      "text": "Hello() {\n  fmt.Println(\"Hello World!\")\n}\n",
      "index": 0,
      "finish_reason": "length"
    }
  ]
}
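The check above can also be scripted. This is a sketch under stated assumptions: the helper names `first_choice_text` and `leading_newline?` are hypothetical, and the JSON is a subset of the response shown above.

```ruby
require "json"

# Return the text of the first choice from a completions response body.
def first_choice_text(response_body)
  JSON.parse(response_body).dig("choices", 0, "text")
end

# True when the suggestion begins with a line break.
def leading_newline?(text)
  text.start_with?("\n", "\r\n")
end

# Subset of the response above. The quoted heredoc keeps the JSON escapes
# (\n, \") untouched so that JSON.parse decodes them.
body = <<~'JSON'
  {
    "choices": [
      {
        "text": "Hello() {\n  fmt.Println(\"Hello World!\")\n}\n",
        "index": 0,
        "finish_reason": "length"
      }
    ]
  }
JSON

text = first_choice_text(body)
puts leading_newline?(text) # prints "false"
```

Here `content_above_cursor` ends in the middle of an identifier (`func print`), so the suggestion has to continue on the same line.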

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

I have evaluated the MR acceptance checklist for this MR.

Edited Nov 20, 2023 by Tan Le
