This MR adds eval_notebook to this repository as an alternative way to evaluate QA with the HumanEval benchmark.
eval_notebook