Add remaining metrics to the ETV pipeline
What does this merge request do and why?
When the ETV pipeline was first added, it included only the independent LLM judge metric, for simplicity. This MR adds the remaining metrics: the similarity score and the collective LLM judge.
How to set up and validate locally
Pipeline results can be found under dev-ai-research-0e2f8974.duo_chat_experiments.aherczeg_etv_metrics__
Merge request checklist
- I've run the affected pipeline(s) to validate that nothing is broken.
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
Edited by Andras Herczeg