-
Collective LLM Judge 0 of 3 checklist items completed
- Merged
- 41
- Approved
-
Add option to change the BigQuery write disposition 3 of 3 checklist items completed
- Merged
- 2
- Approved
-
Revert "Merge branch 'ac/replicate-binary-qa-evaluation' into 'main'" 0 of 3 checklist items completed
- Merged
- 11
- Approved
-
Add pipeline start timestamp 1 of 3 checklist items completed
- Merged
- 3
- Approved
-
Add pipeline to run Duo Chat evaluation locally without using GitLab API 2 of 3 checklist items completed
- Merged
- 3
-
Specify each metric separately in the config 1 of 3 checklist items completed
- Merged
- 14
- 1
- Approved
-
Use md5 for stable and unique question ID 0 of 3 checklist items completed
- Merged
- 1
- 1
- Approved
-
Allow get_response to retry a few times 1 of 3 checklist items completed
- Merged
- 16
- Approved
-
Fix type hint error 0 of 3 checklist items completed
- Merged
- 9
- Approved
-
Change log msg to debug 0 of 3 checklist items completed
- Merged
- Approved
-
Add graphviz as a dependency 1 of 3 checklist items completed
- Merged
- 4
- Approved
-
Urgent: Hotfix: Use correct id for each question and added additional check for that. 0 of 3 checklist items completed
- Merged
- 3
-
Add latest to docker images 0 of 3 checklist items completed
- Merged
- 8
- Approved
-
Update duo chat howto authentication 1 of 3 checklist items completed
- Merged
- 3
- 1
- Approved
-
Fix image in the docker example 0 of 3 checklist items completed
- Merged
- 1
- 1
- Approved
updated -
Add throttle job to the comparison batches 0 of 3 checklist items completed
- Merged
- Approved
-
Fix tokenizers dependencies 0 of 3 checklist items completed
- Merged
- 2
-
Replace tokenizer in transformers with tokenizers 1 of 3 checklist items completed
- Merged
- 2
- Approved
-
Added text-bison-32k model 0 of 3 checklist items completed
- Merged
- Approved
-
Fix skipping duo chat on mbpp 0 of 3 checklist items completed
- Merged
- 13
- 1
- Approved