Add Claude 3 Labrador model
What does this merge request do and why?
Relates to https://gitlab.com/gitlab-org/gitlab/-/issues/465576
gitlab-org/modelops/ai-model-validation-and-research/annoucements#23 (moved)
How to set up and validate locally
- Ensure the GCP environment variables and the Anthropic API key are set up.
- Check out this merge request's branch.
- Use the following config to evaluate the new model:
{
  "beam_config": {
    "pipeline_options": {
      "runner": "DirectRunner",
      "project": "dev-ai-research-0e2f8974",
      "region": "us-central1",
      "temp_location": "gs://prompt-library/tmp/",
      "save_main_session": false
    }
  },
  "input_source": {
    "type": "bigquery",
    "path": "dev-ai-research-0e2f8974.duo_chat.sampled_code_generation_v1"
  },
  "output_sinks": [
    {
      "type": "local",
      "path": "data/output",
      "prefix": "experiment_code_generation"
    },
    {
      "type": "bigquery",
      "path": "dev-ai-research-0e2f8974.duo_chat_experiments",
      "prefix": "tle_claude_labrador_20240618"
    }
  ],
  "throttle_sec": 0.1,
  "batch_size": 10,
  "eval_setup": {
    "answering_models": [
      {
        "name": "claude-3-labrador",
        "prompt_template_config": {
          "templates": [
            {
              "name": "claude",
              "template_path": "data/prompts/duo_chat/answering/claude-2.txt.example"
            }
          ]
        }
      },
      {
        "name": "claude-3-sonnet",
        "prompt_template_config": {
          "templates": [
            {
              "name": "claude",
              "template_path": "data/prompts/duo_chat/answering/claude-2.txt.example"
            }
          ]
        }
      }
    ],
    "metrics": [
      {
        "metric": "similarity_score"
      },
      {
        "metric": "collective_llm_judge",
        "evaluating_models": [
          {
            "name": "text-bison-32k@latest",
            "prompt_template_config": {
              "templates": [
                {
                  "name": "claude-2-collective",
                  "template_path": "data/prompts/duo_chat/evaluating/claude-2-collective-xml.txt.example"
                }
              ]
            }
          }
        ]
      }
    ]
  }
}
- Run the following command to kick off the pipeline:
poetry run promptlib duo-chat eval --config-file data/config/duo_chat/eval_code_generation.json --sample-size 1 --test-run
- https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1sdev-ai-research-0e2f8974!2sduo_chat_experiments!3stle_claude_labrador_20240618__collective_llm_judge
- https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1sdev-ai-research-0e2f8974!2sduo_chat_experiments!3stle_claude_labrador_20240618__similarity_score
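The first and third setup steps can be sanity-checked before starting the pipeline. The following is a minimal sketch, not part of this merge request: the helper names (`missing_env_vars`, `answering_model_names`, `preflight`) are hypothetical, and the environment variable names are assumptions (`ANTHROPIC_API_KEY` is the variable the Anthropic SDK reads by default, and `GOOGLE_APPLICATION_CREDENTIALS` is the usual GCP credentials variable; this project may expect different ones).

```python
import json
import os

# Assumed variable names; adjust to whatever the project actually requires.
REQUIRED_ENV_VARS = ("ANTHROPIC_API_KEY", "GOOGLE_APPLICATION_CREDENTIALS")


def missing_env_vars(names=REQUIRED_ENV_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


def answering_model_names(config):
    """List the answering model names declared under eval_setup."""
    models = config.get("eval_setup", {}).get("answering_models", [])
    return [model.get("name") for model in models]


def preflight(config_path, expected_model="claude-3-labrador", environ=None):
    """Collect readable problems; an empty list means every check passed."""
    problems = [
        f"environment variable {name} is not set"
        for name in missing_env_vars(environ=environ)
    ]
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)
    if expected_model not in answering_model_names(config):
        problems.append(f"{expected_model} is not listed in answering_models")
    return problems
```

Calling `preflight("data/config/duo_chat/eval_code_generation.json")` from the repository root returns a list of problems to resolve before running the eval command; an empty list means both checks passed.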
Merge request checklist
- I've run the affected pipeline(s) to validate that nothing is broken.
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
Edited by Mon Ray