Ignore context if it is not provided

Tan Le requested to merge fix-no-context-template into main

What does this merge request do and why?

Ignore context if it is not provided

This fixes an issue where the answering model got hung up on the empty context, which resulted in low-quality responses.
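
As a rough illustration of the idea (not the actual template change in this merge request), the sketch below omits the context section from the prompt when no context is supplied. The names `CONTEXT_SECTION`, `QUESTION_SECTION`, and `build_prompt` are hypothetical, and the wording of the template text is an assumption.

```python
from string import Template
from typing import Optional

# Hypothetical template fragments; the real prompt text lives in the
# template files referenced by "template_path" in the eval config.
CONTEXT_SECTION = Template("Use the following context to answer:\n$context\n\n")
QUESTION_SECTION = Template("Question: $question\n")


def build_prompt(question: str, context: Optional[str] = None) -> str:
    """Build a prompt, leaving out the context section when context is empty."""
    parts = []
    # Only mention context when there is non-whitespace content to show,
    # so the model is never asked to reason about an empty context block.
    if context and context.strip():
        parts.append(CONTEXT_SECTION.substitute(context=context.strip()))
    parts.append(QUESTION_SECTION.substitute(question=question))
    return "".join(parts)
```

With this shape, a call such as `build_prompt("Write a sort function")` produces a prompt containing only the question, so the model has no empty context to comment on.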

Resolves #186 (closed)

How to set up and validate locally

  1. Ensure the GCP environment variables are set up.

  2. Run a Duo Chat eval on the code generation dataset using the following config.

    {
      "beam_config": {
        "pipeline_options": {
          "runner": "DirectRunner",
          "project": "dev-ai-research-0e2f8974",
          "region": "us-central1",
          "temp_location": "gs://prompt-library/tmp/",
          "save_main_session": false
        }
      },
      "input_source": {
        "type": "bigquery",
        "path": "dev-ai-research-0e2f8974.duo_chat.sampled_code_generation_v1"
      },
      "output_sinks": [
        {
          "type": "local",
          "path": "data/output",
          "prefix": "experiment_demo"
        }
      ],
      "throttle_sec": 1,
      "batch_size": 20,
      "eval_setup": {
        "answering_models": [
          {
            "name": "duo-chat",
            "parameters": {
              "base_url": "http://gdk.test:8080"
            },
            "prompt_template_config": {
              "templates": [
                {
                  "name": "empty",
                  "template_path": "data/prompts/duo_chat/answering/empty.txt.example"
                }
              ]
            }
          },
          {
            "name": "claude-2",
            "prompt_template_config": {
              "templates": [
                {
                  "name": "baseline template",
                  "template_path": "data/prompts/duo_chat/answering/claude-2.txt.example"
                }
              ]
            }
          }
        ],
        "metrics": [
          {
            "metric": "independent_llm_judge",
            "evaluating_models": [
              {
                "name": "text-bison-32k@latest",
                "prompt_template_config": {
                  "templates": [
                    {
                      "name": "claude-2",
                      "template_path": "data/prompts/duo_chat/evaluating/claude-2.txt.example"
                    }
                  ]
                }
              }
            ]
          },
          {
            "metric": "similarity_score"
          }
        ]
      }
    }
  3. Check the output to confirm that the text "Unfortunately, there is no context provided to answer" appears in the responses.

    yq -oy '.[] | .answer_from_comparison_model' data/output/experiment_demo_20240514_194311__similarity_score-00000-of-00001.csv  | rg "Unfortunately.*context" -c
    18 
  4. Check out this merge request's branch.

  5. Run the same pipeline, now with the updated template.

  6. Check the output to confirm that the text "Unfortunately, there is no context provided to answer" no longer appears in the responses.

    yq -oy '.[] | .answer_from_comparison_model' data/output/experiment_demo_20240514_194910__similarity_score-00000-of-00001.csv  | rg "Unfortunately.*context" -c
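
If `yq` or `rg` is not available, the same check can be approximated in Python. This is a minimal sketch that assumes the output CSV has a header row containing the `answer_from_comparison_model` column used in the commands above; the helper names `count_matches` and `count_in_csv` are hypothetical. It counts matching rows, whereas `rg -c` counts matching lines, so the numbers may differ slightly when a response spans several lines.

```python
import csv
import re

# Same pattern as the rg command in the steps above.
PATTERN = re.compile(r"Unfortunately.*context")


def count_matches(rows, column="answer_from_comparison_model"):
    """Count rows whose answer column matches the 'no context' pattern."""
    return sum(1 for row in rows if PATTERN.search(row.get(column) or ""))


def count_in_csv(path, column="answer_from_comparison_model"):
    """Read a CSV file with a header row and count matching rows."""
    with open(path, newline="", encoding="utf-8") as f:
        return count_matches(csv.DictReader(f), column)
```

Calling `count_in_csv` on the output file from each run gives a before/after comparison; a count of zero on the second run indicates the phrase is gone.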

Merge request checklist

  • I've run the affected pipeline(s) to validate that nothing is broken.
  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Tan Le