Ignore context if it is not provided (!451)

Tan Le requested to merge fix-no-context-template into main May 14, 2024

What does this merge request do and why?

Ignore context if it is not provided

This fixes an issue where the answering model got hung up on the empty context, which resulted in low-quality responses.
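
As a rough illustration of the idea (not the actual template change in this merge request), the sketch below builds a prompt and leaves out the context block when no context is supplied. The function name `build_prompt`, the `<context>` tags, and the prompt layout are all assumptions made for this example; the real templates live in the template files referenced by the eval config.

```python
from typing import Optional


def build_prompt(question: str, context: Optional[str] = None) -> str:
    """Build an answering prompt, omitting the context block when it is empty.

    Hypothetical helper for illustration only; the layout and tag names
    are assumptions, not the Prompt Library's actual template.
    """
    parts = []
    # Only mention context when there is something to show, so the model
    # is never asked to reason about an empty context section.
    if context and context.strip():
        parts.append(f"<context>\n{context.strip()}\n</context>")
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Write a function that reverses a string."))
    print("---")
    print(build_prompt("What does this do?", "def f(x): return x[::-1]"))
```

With an empty or whitespace-only context, the prompt contains only the question, which is the behaviour this merge request aims for.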

How to set up and validate locally

Ensure the GCP environment variables are set up.

Run a Duo Chat eval on the code generation dataset using the following config:

{
  "beam_config": {
    "pipeline_options": {
      "runner": "DirectRunner",
      "project": "dev-ai-research-0e2f8974",
      "region": "us-central1",
      "temp_location": "gs://prompt-library/tmp/",
      "save_main_session": false
    }
  },
  "input_source": {
    "type": "bigquery",
    "path": "dev-ai-research-0e2f8974.duo_chat.sampled_code_generation_v1"
  },
  "output_sinks": [
    {
      "type": "local",
      "path": "data/output",
      "prefix": "experiment_demo"
    }
  ],
  "throttle_sec": 1,
  "batch_size": 20,
  "eval_setup": {
    "answering_models": [
      {
        "name": "duo-chat",
        "parameters": {
          "base_url": "http://gdk.test:8080"
        },
        "prompt_template_config": {
          "templates": [
            {
              "name": "empty",
              "template_path": "data/prompts/duo_chat/answering/empty.txt.example"
            }
          ]
        }
      },
      {
        "name": "claude-2",
        "prompt_template_config": {
          "templates": [
            {
              "name": "baseline template",
              "template_path": "data/prompts/duo_chat/answering/claude-2.txt.example"
            }
          ]
        }
      }
    ],
    "metrics": [
      {
        "metric": "independent_llm_judge",
        "evaluating_models": [
          {
            "name": "text-bison-32k@latest",
            "prompt_template_config": {
              "templates": [
                {
                  "name": "claude-2",
                  "template_path": "data/prompts/duo_chat/evaluating/claude-2.txt.example"
                }
              ]
            }
          }
        ]
      },
      {
        "metric": "similarity_score"
      }
    ]
  }
}
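
Before running the pipeline, it can help to confirm that every `template_path` in the config points to an existing file. The following is a minimal sketch that assumes the config has the shape shown above; the helper names `collect_template_paths` and `missing_templates` and the file name `config.json` are hypothetical and are not part of the Prompt Library.

```python
import json
from pathlib import Path


def collect_template_paths(config: dict) -> list:
    """Return every template_path under answering and evaluating models."""
    setup = config.get("eval_setup", {})
    models = list(setup.get("answering_models", []))
    # Evaluating models are nested one level deeper, under each metric.
    for metric in setup.get("metrics", []):
        models.extend(metric.get("evaluating_models", []))
    paths = []
    for model in models:
        templates = model.get("prompt_template_config", {}).get("templates", [])
        paths.extend(t["template_path"] for t in templates)
    return paths


def missing_templates(config: dict, root: str = ".") -> list:
    """Return the template paths that do not exist relative to root."""
    return [p for p in collect_template_paths(config)
            if not (Path(root) / p).is_file()]


if __name__ == "__main__":
    # "config.json" is an assumed file name for the config shown above.
    with open("config.json", encoding="utf-8") as handle:
        print(missing_templates(json.load(handle)))
```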

Check the output to confirm that the text "Unfortunately, there is no context provided to answer" appears in the responses.
```
yq -oy '.[] | .answer_from_comparison_model' data/output/experiment_demo_20240514_194311__similarity_score-00000-of-00001.csv  | rg "Unfortunately.*context" -c
18 
```
- experiment_demo_20240514_194311__similarity_score-00000-of-00001.csv
- experiment_demo_20240514_194311__independent_llm_judge-00000-of-00001.csv
Check out this merge request's branch.
Run the same pipeline, now with the updated template.
Check the output to confirm that the text "Unfortunately, there is no context provided to answer" no longer appears in the responses.
```
yq -oy '.[] | .answer_from_comparison_model' data/output/experiment_demo_20240514_194910__similarity_score-00000-of-00001.csv  | rg "Unfortunately.*context" -c
```
- experiment_demo_20240514_194910__similarity_score-00000-of-00001.csv
- experiment_demo_20240514_194910__independent_llm_judge-00000-of-00001.csv
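
The before/after check can also be scripted without `yq` and `rg`. The sketch below counts the rows whose `answer_from_comparison_model` column matches the same `Unfortunately.*context` pattern, using only the Python standard library. The function name `count_no_context_answers` is hypothetical. Note that this counts CSV rows, while `rg -c` counts matching lines, so the two numbers can differ when an answer spans several lines.

```python
import csv
import re

# Same pattern as the rg command above.
PATTERN = re.compile(r"Unfortunately.*context")


def count_no_context_answers(csv_path: str,
                             column: str = "answer_from_comparison_model") -> int:
    """Count rows whose answer complains about missing context."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return sum(
            1
            for row in csv.DictReader(handle)
            if PATTERN.search(row.get(column) or "")
        )


if __name__ == "__main__":
    before = "data/output/experiment_demo_20240514_194311__similarity_score-00000-of-00001.csv"
    after = "data/output/experiment_demo_20240514_194910__similarity_score-00000-of-00001.csv"
    print("before:", count_no_context_answers(before))
    print("after:", count_no_context_answers(after))
```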

Merge request checklist

I've run the affected pipeline(s) to validate that nothing is broken.
Tests added for new functionality. If not, please raise an issue to follow up.
Documentation added/updated, if needed.

Edited May 14, 2024 by Tan Le
