Evaluation for Model Updates
Overview
This page documents the perplexity of each model update.
Perplexity
LOWER IS BETTER! Learn about
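For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token, which is why lower values are better. The sketch below is a minimal illustration, not the evaluation code used for these updates. The function name `perplexity` and the sample log-probabilities are assumptions made up for this example.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean(log p)) for a sequence of per-token natural-log probabilities.

    Hypothetical helper for illustration only; not the project's evaluation code.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Made-up inputs: a model that assigns probability 0.25 to every token
# versus one that assigns probability 0.5 to every token.
uniform_over_four = [math.log(0.25)] * 8
more_confident = [math.log(0.5)] * 8

ppl_uniform = perplexity(uniform_over_four)
ppl_confident = perplexity(more_confident)
```

A model that spreads its probability evenly over four candidate tokens has a perplexity of about 4, while the more confident model has a perplexity of about 2, so the second would count as the better result on this page.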
16b_v5
Perplexity update for 8th May, model id: Model step: 4.2K steps. We can see:
- Slightly worse on go, java, javascript, python
- Better on c, c#, c++, php, ruby, rust, scala, typescript
Next steps: We will continue training, with a checkpoint later in the week. Our hypothesis is further validated by these steps, which use bfloat16.
16b_v4
Perplexity update for 1st May, model id: Model step: 1.5K steps. Performance is better for rust and scala but worse for the remaining languages.
Next steps: We will continue training, with a checkpoint tomorrow. We will also continue working on the hypothesis bug.
Edited by Taylor McCaslin