Ensure tqdm instance is closed on complete process (!306) · Merge requests · GitLab.org / ModelOps / AI Model Validation and Research / AI Evaluation / Prompt Library

Tan Le requested to merge fix-open-tqdm-progress into main Mar 01, 2024

What does this merge request do and why?

Ensure the tqdm instance is closed once processing completes (via the end_bundle hook).

This fixes an issue where log messages were interleaved with the tqdm progress indicator.
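A minimal sketch of the pattern, not the actual diff from this merge request: the class name `ProgressDoFn`, the `bar_factory` parameter, and the `_default_bar` helper are hypothetical, and the hook is called `end_bundle` only to match the description above (in Apache Beam's `DoFn` the per-bundle completion method is `finish_bundle`). The idea is that closing the bar lets tqdm write its final line and newline before any later log message is emitted.

```python
import sys


def _default_bar(desc):
    # Imported lazily so the sketch can be exercised with a fake bar
    # when tqdm is not installed.
    from tqdm import tqdm

    return tqdm(desc=desc, file=sys.stderr)


class ProgressDoFn:
    """Hypothetical DoFn-style class that owns one progress bar per bundle."""

    def __init__(self, desc, bar_factory=None):
        self._desc = desc
        self._bar_factory = bar_factory or _default_bar
        self._bar = None

    def start_bundle(self):
        # Open a fresh bar at the start of each bundle.
        self._bar = self._bar_factory(self._desc)

    def process(self, element):
        if self._bar is None:
            self._bar = self._bar_factory(self._desc)
        self._bar.update(1)
        yield element

    def end_bundle(self):
        # Close the bar so its last line is finalized before
        # subsequent log output; guard against a double close.
        if self._bar is not None:
            self._bar.close()
            self._bar = None
```

Without the explicit `close()`, the bar stays open while later steps log their "Output written to ..." messages, which is how a log line can land in the middle of a progress line as in the "Before" output.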

Before

❯ poetry run promptlib duo-chat eval --test-run --sample-size 1 --config-file=data/config/duochat_eval_config.json                                                                              
Requesting answers from duo-chat: 1it [00:11, 11.56s/iINFO:promptlib.common.beam.io:Output written to BigQuery: dev-ai-research-0e2f8974:duo_chat_experiments.tl_output_sinks_log_20240301_22084
5__independent_llm_judge, 28.93s/it]                                                                                                                                                            
INFO:promptlib.common.beam.io:Output written to BigQuery: dev-ai-research-0e2f8974:duo_chat_experiments.tl_output_sinks_log_20240301_220844__similarity_score                                   
INFO:promptlib.common.beam.io:Output written to CSV: data/output/experiment_20240301_220845__independent_llm_judge-00000-of-00001.csv                                                           
INFO:promptlib.common.beam.io:Output written to CSV: data/output/experiment_20240301_220844__similarity_score-00000-of-00001.csv                                                                
Requesting answers from claude-2: 1it [01:16, 76.56s/it]                                                                                                                                        
Requesting answers from duo-chat: 1it [01:16, 76.57s/it]                                                                                                                                        
Getting evaluation from text-bison@latest: 2it [00:47, 23.71s/it]                                                                                                                               
Calculating similarity scores.: 1it [00:38, 38.86s/it]

After

❯ poetry run promptlib duo-chat eval --test-run --sample-size 2 --config-file=data/config/duochat_eval_config.json
Requesting answers from claude-2: 2it [00:30, 15.33s/it]                                                                                                                                        
Requesting answers from duo-chat: 2it [00:30, 15.33s/it]                                                                                                                                        
Getting evaluation from text-bison@latest: 4it [00:08,  2.17s/it]                                                                                                                               
Calculating similarity scores: 4it [00:03,  1.02it/s]                                                                                                                                           
INFO:promptlib.common.beam.io:Output written to BigQuery: dev-ai-research-0e2f8974:duo_chat_experiments.tl_output_sinks_log_20240302_003329__independent_llm_judge
INFO:promptlib.common.beam.io:Output written to BigQuery: dev-ai-research-0e2f8974:duo_chat_experiments.tl_output_sinks_log_20240302_003328__similarity_score
INFO:promptlib.common.beam.io:Output written to CSV: data/output/experiment_20240302_003329__independent_llm_judge-00000-of-00001.csv
INFO:promptlib.common.beam.io:Output written to CSV: data/output/experiment_20240302_003328__similarity_score-00000-of-00001.csv

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Merge request checklist

I've run the affected pipeline(s) to validate that nothing is broken.
Tests added for new functionality. If not, please raise an issue to follow up.
Documentation added/updated, if needed.

Edited Mar 01, 2024 by Tan Le
