Consolidate variable expansions on prompt templates
Problem to solve
We have some prompt templates under the prompts/definitions directory, and they use two different variable expansion methods:
- Jinja: This is a powerful template library that allows us to script inside the template, e.g. if and for inside a template.
- LangChain's PromptTemplate: This is a straightforward variable expansion like Python's native formatter.
What's confusing here is that we're mixing two different templating styles. Let's see the ReAct template:
prompt_template:
system: |
{chat_history}
...
You have access only to the following tools:
<tools_list>
{%- for tool in tools %}
<tool>
<name>{{ tool.name }}</name>
<description>
{{ tool.description }}
</description>
{%- if tool.example %}
<example>
{{ tool.example }}
</example>
{%- endif %}
</tool>
{%- endfor %}
</tools_list>
...
{%- for tool in tools -%}
{{ tool.name }}
{%- if not loop.last %}, {% endif %}
{%- endfor -%}
]
...
{%- if current_file %}
{%- if current_file.selected_code %}
User selected code below enclosed in <code></code> tags in file {{ current_file.file_path }} to work with:
<code>
{{ current_file.data }}
</code>
{%- else %}
The current code file that user sees is {{ current_file.file_path }} and has the following content:
<content>
{{ current_file.data }}
</content>
{%- endif %}
{%- endif %}
...
{%- for tool in tools -%}
{% if tool.resource -%}
{{ tool.resource }}
{%- if not loop.last %}, {% endif %}
{%- endif %}
{%- endfor -%}.
...
{{context_content}}
Begin!
user: |
Question: {question}
assistant: |
{agent_scratchpad}
This is already cryptic, but simply put:
- The double curly bracket {{ var }} and scripting (e.g. {%- if current_file %}) belong to Jinja. The variable expansion happens before the chain execution.
- The single curly bracket {var} belongs to LangChain. The variable expansion happens during the chain execution.
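The ordering of the two expansions can be sketched with the standard library alone. This is a minimal illustration and not the actual implementation: render_phase_one is a hypothetical stand-in for the Jinja pass (it only handles plain {{ name }} variables), and str.format stands in for LangChain's single-bracket expansion.

```python
import re


def render_phase_one(template: str, values: dict[str, str]) -> str:
    """Hypothetical stand-in for the Jinja pass: expands {{ name }} only."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: values[m.group(1)], template)


# A template that mixes both styles, like the ReAct template above.
template = "{chat_history}\n{{ context_content }}\nQuestion: {question}"

# Phase 1 (before the chain runs): only double-bracket variables are expanded.
after_phase_one = render_phase_one(template, {"context_content": "some context"})

# Phase 2 (during the chain run): the remaining single-bracket variables are expanded.
final = after_phase_one.format(chat_history="(empty)", question="Hi, how are you?")
print(final)
```

After phase 1 the text still contains {chat_history} and {question}, so anything phase 1 writes into the template is parsed again in phase 2.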
Bugs on production
For this reason, if a value that Jinja expands contains {var}, LangChain's PromptTemplate treats var as a required input variable. This caused the following bug in production (https://gitlab.slack.com/archives/C06LWENL58F/p1723679062434439):
raise KeyError(
KeyError: "Input to ChatPromptTemplate is missing variables {''}. Expected: ['', 'agent_scratchpad', 'chat_history', 'question'] Received: ['chat_history', 'question', 'agent_scratchpad', 'current_file']"
This bug can be reproduced by passing {} in a value that Jinja expands. For example:
curl -X 'POST' \
'http://localhost:5052/v2/chat/agent' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"prompt": "Hi, how are you?",
"options": {
"chat_history": "string",
"agent_scratchpad": {
"agent_type": "react",
"steps": [
{
"thought": "string",
"tool": "string",
"tool_input": "string",
"observation": "string"
}
]
},
"context": {
"type": "issue",
"content": "string"
},
"current_file": {
"file_path": "main.py",
"data": "{}",
"selected_code": true
}
}
}'
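Why "data": "{}" produces an empty variable name can be shown without running the service. The sketch below assumes that the single-bracket style follows Python's format-string syntax, and it uses the standard library's string.Formatter rather than LangChain itself; format_fields and the sample text are hypothetical.

```python
from string import Formatter


def format_fields(template: str) -> list[str]:
    """Return the replacement-field names found by str.format-style parsing."""
    return [name for _, name, _, _ in Formatter().parse(template) if name is not None]


# Text as it might look after Jinja has already substituted current_file.data = "{}".
rendered_by_jinja = "<code>\n{}\n</code>\nQuestion: {question}\n{agent_scratchpad}"

fields = format_fields(rendered_by_jinja)
print(fields)  # ['', 'question', 'agent_scratchpad']
```

The empty string in the result corresponds to the '' in the KeyError above: the {} that came from user data is read as one more required variable.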
Proposal
- Consolidate variable expansions on prompt templates. Don't mix {} and {{}}.
- Use Jinja as it's more flexible with the scripting. LangChain's PromptTemplate doesn't support partial inclusion.
  - For the same reason, don't introduce near-duplicate templates like base.yml and with_mr_support/base.yml, as they quickly get out of sync. This can be easily solved with Jinja's conditional clause (i.e. if) and partial inclusion.
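A Jinja-only template might look like the sketch below. It assumes the jinja2 package is installed, and the template text is a shortened, hypothetical version of the ReAct template. Because there is only one expansion pass, curly brackets inside user data are never parsed again.

```python
from jinja2 import Template

# Every variable uses the double-bracket style; no single-bracket placeholders remain.
source = (
    "{% if current_file %}<code>{{ current_file.data }}</code>\n{% endif %}"
    "Question: {{ question }}"
)

rendered = Template(source).render(
    current_file={"data": "{}"},  # the value that triggered the production bug
    question="Hi, how are you?",
)
print(rendered)
```

The {} in current_file.data is emitted as plain text, since Jinja does not re-scan the values it substitutes.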
Alternative ideas:
- Escape the curly brackets in values passed to Jinja. This is quick but dirty, since it introduces an unnecessary dependency between Jinja and LangChain.
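For comparison, the escaping alternative could be sketched as follows. escape_format_braces is a hypothetical helper, and str.format again stands in for LangChain's single-bracket expansion, assuming it follows Python's format-string rules where {{ and }} denote literal brackets.

```python
def escape_format_braces(value: str) -> str:
    """Double the brackets so str.format-style parsing treats them as literals."""
    return value.replace("{", "{{").replace("}", "}}")


user_data = "{}"

# Simulates the text after the Jinja pass, with the user data escaped beforehand.
after_jinja = "<code>" + escape_format_braces(user_data) + "</code>\nQuestion: {question}"

# The second pass turns the doubled brackets back into the original characters.
final = after_jinja.format(question="Hi")
print(final)
```

The helper only exists to satisfy the second expansion pass, which is the coupling between the two engines that the proposal aims to avoid.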
Edited by Shinya Maeda