Remove latency from output to fix error
## What does this merge request do and why?
The evaluation run must have a single output, but we were adding a latency output as well. This MR removes the latency output to fix the error below:
```plaintext
0it [00:00, ?it/s]Error running evaluator <DynamicRunEvaluator evaluate> on run 3d67ead0-5959-4f65-adee-44170699c1a3: ValueError('Evaluator distance=<StringDistance.LEVENSHTEIN: \'levenshtein\'> only supports a single prediction key. Please ensure that the run has a single output. Or initialize with a prepare_data:\n\ndef prepare_data(run, example):\n return {\n "prediction": run.outputs[\'my_output\'],\n "reference": example.outputs[\'expected\']\n }\nevaluator = LangChainStringEvaluator(..., prepare_data=prepare_data)\n')
Traceback (most recent call last):
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/evaluation/_runner.py", line 1233, in _run_evaluators
    evaluator_response = evaluator.evaluate_run(
                         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/evaluation/evaluator.py", line 278, in evaluate_run
    result = self.func(
             ^^^^^^^^^^
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/run_helpers.py", line 568, in wrapper
    raise e
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/run_helpers.py", line 565, in wrapper
    function_result = run_container["context"].run(func, *args, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/evaluation/integrations/_langchain.py", line 253, in evaluate
    prepare_evaluator_inputs(run, example)
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/run_helpers.py", line 568, in wrapper
    raise e
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/run_helpers.py", line 565, in wrapper
    function_result = run_container["context"].run(func, *args, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/missydavies/src/eli5/eli5/codesuggestions/.venv/lib/python3.12/site-packages/langsmith/evaluation/integrations/_langchain.py", line 201, in prepare_evaluator_inputs
    raise ValueError(
ValueError: Evaluator distance=<StringDistance.LEVENSHTEIN: 'levenshtein'> only supports a single prediction key. Please ensure that the run has a single output. Or initialize with a prepare_data:

def prepare_data(run, example):
    return {
        "prediction": run.outputs['my_output'],
        "reference": example.outputs['expected']
    }
evaluator = LangChainStringEvaluator(..., prepare_data=prepare_data)
```
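For context, LangSmith's string evaluators infer the prediction from the run's outputs, which only works when the run has exactly one output key. A minimal sketch of the shape of the change; the function and key names below are illustrative, not the actual eli5 code:

```python
import time


def generate_suggestion(prompt: str) -> tuple[str, float]:
    """Hypothetical stand-in for the real model call; returns (suggestion, latency)."""
    start = time.perf_counter()
    suggestion = prompt  # placeholder completion
    return suggestion, time.perf_counter() - start


# Before: the run produced two output keys, so the string evaluator could not
# tell which one was the prediction and raised the ValueError above.
def target_before(inputs: dict) -> dict:
    suggestion, latency = generate_suggestion(inputs["prompt"])
    return {"output": suggestion, "latency": latency}


# After: a single output key, which is what the evaluator expects.
def target_after(inputs: dict) -> dict:
    suggestion, _latency = generate_suggestion(inputs["prompt"])
    return {"output": suggestion}
```

The alternative the error message suggests, keeping both keys and initializing `LangChainStringEvaluator` with a `prepare_data` hook that selects the prediction key, would also work, but dropping the unused latency output is the simpler fix.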
## How to set up and validate locally
```shell
poetry run eli5 code-suggestions evaluate \
  --dataset="code-suggestions-input-testcases-v1" \
  --source=gitlab \
  --limit=1 \
  --offset=5 \
  --evaluate-with-llm \
  --experiment-prefix=exp \
  --rate-limit=100
```
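Optionally, the Levenshtein evaluator can be sanity-checked in isolation. This is a rough sketch assuming `langchain` (and its `rapidfuzz` dependency) is available in the virtualenv:

```python
from langchain.evaluation import StringDistance, load_evaluator

# Same evaluator configuration as the failing run: Levenshtein string distance.
evaluator = load_evaluator("string_distance", distance=StringDistance.LEVENSHTEIN)

# A single prediction/reference pair; lower scores mean closer strings.
print(evaluator.evaluate_strings(prediction="def add(a, b):", reference="def add(x, y):"))
```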
## Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.