Add GPT-4o model (!459) · Merge requests · GitLab.org / ModelOps / AI Model Validation and Research / AI Evaluation / Prompt Library

Tan Le requested to merge add-gpt-o-model into main May 17, 2024

What does this merge request do and why?

Add GPT-4o model support.

Model card: https://platform.openai.com/docs/models/gpt-4o
Example Duo Chat eval output on code generation sample dataset.

How to set up and validate locally

Ensure GCP environment variables are set up.
Check out this merge request's branch.

Copy the following config and place it in data/config/duochat_gpt_test.json.

{
  "beam_config": {
    "pipeline_options": {
      "runner": "DirectRunner",
      "project": "dev-ai-research-0e2f8974",
      "region": "us-central1",
      "temp_location": "gs://prompt-library/tmp/",
      "save_main_session": false
    }
  },
  "input_source": {
    "type": "bigquery",
    "path": "dev-ai-research-0e2f8974.duo_chat.sampled_code_generation_v1"
  },
  "output_sinks": [
    {
      "type": "bigquery",
      "path": "dev-ai-research-0e2f8974.duo_chat_experiments",
      "prefix": "tl_gpt_4o_code_generation"
    },
    {
      "type": "local",
      "path": "data/output",
      "prefix": "experiment"
    }
  ],
  "throttle_sec": 1,
  "batch_size": 10,
  "eval_setup": {
    "answering_models": [
      {
        "name": "gpt-4o",
        "prompt_template_config": {
          "templates": [
            {
              "name": "empty",
              "template_path": "data/prompts/duo_chat/answering/empty.txt.example"
            }
          ]
        }
      }
    ],
    "metrics": [
      {
        "metric": "independent_llm_judge",
        "evaluating_models": [
          {
            "name": "text-bison-32k@latest",
            "prompt_template_config": {
              "templates": [
                {
                  "name": "claude-2",
                  "template_path": "data/prompts/duo_chat/evaluating/claude-2.txt.example"
                }
              ]
            }
          }
        ]
      }
    ]
  }
}
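Before kicking off the pipeline, it can help to confirm that the config parses and that the prompt templates it references exist locally. The following is a minimal sketch, not part of this merge request: `summarize_config` and `missing_templates` are hypothetical helper names, and the script assumes only the JSON layout shown above and that it is run from the repository root.

```python
import json
from pathlib import Path


def summarize_config(config: dict) -> dict:
    """Collect the model names and output sinks named in an eval config."""
    eval_setup = config["eval_setup"]
    answering = [m["name"] for m in eval_setup["answering_models"]]
    evaluating = [
        m["name"]
        for metric in eval_setup["metrics"]
        for m in metric.get("evaluating_models", [])
    ]
    sinks = [(s["type"], s["path"]) for s in config["output_sinks"]]
    return {"answering": answering, "evaluating": evaluating, "sinks": sinks}


def missing_templates(config: dict, root: Path = Path(".")) -> list:
    """Return every template_path in the config that is not a file under root."""
    eval_setup = config["eval_setup"]
    models = list(eval_setup["answering_models"])
    for metric in eval_setup["metrics"]:
        models.extend(metric.get("evaluating_models", []))

    missing = []
    for model in models:
        templates = model.get("prompt_template_config", {}).get("templates", [])
        for template in templates:
            path = template["template_path"]
            if not (root / path).is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Path taken from the setup step above; adjust if the config lives elsewhere.
    config_path = Path("data/config/duochat_gpt_test.json")
    if config_path.exists():
        config = json.loads(config_path.read_text())
        print(summarize_config(config))
        print("Missing templates:", missing_templates(config))
    else:
        print(f"{config_path} not found; run this from the repository root.")
```

For the config above, the summary should list `gpt-4o` as the only answering model, and an empty "Missing templates" list indicates that both `.txt.example` template files are present.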

Run the following command to kick off the pipeline.

poetry run promptlib duo-chat eval --config-file data/config/duochat_gpt_test.json --test-run --sample-size 1

Merge request checklist

I've run the affected pipeline(s) to validate that nothing is broken.
Tests added for new functionality. If not, please raise an issue to follow up.
Documentation added/updated, if needed.

Related to #297

Edited May 17, 2024 by Tan Le
