- Add support to evaluate with Claude 3 Haiku (1 of 3 checklist items completed)
  - Merged
  - 1
  - Approved
- Support code explanation dataset on Duo Chat eval pipeline (1 of 3 checklist items completed)
  - Merged
  - 14
  - Approved
- Update dependency pytest to v8.1.1 (0 of 1 checklist item completed)
  - Merged
  - Approved
- Update dependency google-cloud-bigquery to v3.19.0 (0 of 1 checklist item completed)
  - Merged
  - 1
  - Approved
- Update docker Docker tag to v25.0.4 (0 of 1 checklist item completed)
  - Merged
  - 1
  - Approved
- Update dependency evilmartians/lefthook to v1.6.5 (0 of 1 checklist item completed)
  - Merged
  - Approved
- Add details on GitLab PAT scopes (0 of 3 checklist items completed)
  - Merged
  - 1
  - Approved
- Add Gemini Pro 1.5 model support (1 of 3 checklist items completed)
  - Merged
  - 7
  - Approved
- Use nested dataclass to preserve all information (1 of 3 checklist items completed)
  - Merged
  - 1
  - Approved
- Add claude-3 (1 of 3 checklist items completed)
  - Merged
  - 8
  - Approved
- Log error on duplicate results instead of raising RuntimeError (1 of 3 checklist items completed)
  - Merged
  - 1
  - Approved
- Draft: Added prompt id and answer id. (0 of 3 checklist items completed)
- Update dependency poetry to v1.8.2 (0 of 1 checklist item completed)
  - Merged
  - Approved
- Update dependency pytest to v8.0.2 (0 of 1 checklist item completed)
  - Merged
  - Approved
- Update dependency pydantic to v2.6.3 (0 of 1 checklist item completed)
  - Merged
  - Approved
- Update dependency evilmartians/lefthook to v1.6.4 (0 of 1 checklist item completed)
  - Merged
  - Approved
- Ensure tqdm instance is closed on complete process (0 of 3 checklist items completed)
  - Merged
  - 2
  - Approved
- Remove repeated pairs in similarity score. (0 of 3 checklist items completed)
  - Merged
  - 13
  - Approved
- Add test coverage report in CI (1 of 3 checklist items completed)
  - Merged
  - 4
  - Approved
- Added ability to compare with ground truth (0 of 3 checklist items completed)
  - Merged
  - 1
  - Approved