Implement statistical stopper
bench_runner can now stop early based on the data it has collected so far: it ends the benchmarking process when the results are stable, or when the results have degraded for a number of consecutive epochs.
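
The two stop conditions could be sketched roughly as follows. This is an illustrative sketch, not the code in this change: the class name `StatisticalStopper`, the parameters `window`, `cv_threshold`, and `patience`, and the choice of coefficient of variation as the stability test are all assumptions, and the metric is assumed to be lower-is-better (for example, latency).

```python
from statistics import mean, pstdev


class StatisticalStopper:
    """Hypothetical early-stop helper; names and thresholds are illustrative."""

    def __init__(self, window=5, cv_threshold=0.02, patience=3):
        self.window = window              # number of recent epochs checked for stability
        self.cv_threshold = cv_threshold  # max coefficient of variation to count as stable
        self.patience = patience          # consecutive degraded epochs tolerated
        self.history = []
        self.degraded_streak = 0

    def update(self, value):
        """Record one epoch's result (lower is better); return a stop reason or None."""
        # An epoch counts as degraded when it is worse than the previous epoch.
        if self.history and value > self.history[-1]:
            self.degraded_streak += 1
        else:
            self.degraded_streak = 0
        self.history.append(value)

        if self.degraded_streak >= self.patience:
            return "degraded"

        # Stable when the last `window` results vary very little around their mean.
        recent = self.history[-self.window:]
        if len(recent) == self.window:
            avg = mean(recent)
            if avg != 0 and pstdev(recent) / abs(avg) < self.cv_threshold:
                return "stable"
        return None
```

A runner loop would call `update()` once per epoch and break out as soon as it returns a reason instead of `None`.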