Unable to reproduce paper results

Dear Mayer,

When I run your code, I don't get the results reported in the paper. I used the run_train.sh script as-is. The table below compares the two sets of results: f1 and E-F1 are within 0.02 of the paper, but F1 is 0.04 to 0.05 lower and C-F1 is 0.10 to 0.13 lower on every test set. I have also attached the outputs (predictions and evaluation results) for the three test sets.

I need to reproduce the paper's results accurately because I want to compare my own model with yours on your dataset, and also run your model on a dataset of mine.

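| Source | Embedding | Neoplasm f1 | Neoplasm F1 | Neoplasm C-F1 | Neoplasm E-F1 | Glaucoma f1 | Glaucoma F1 | Glaucoma C-F1 | Glaucoma E-F1 | Mixed f1 | Mixed F1 | Mixed C-F1 | Mixed E-F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Paper | fine-tuning SciBERT | 0.90 | 0.87 | 0.88 | 0.92 | 0.91 | 0.89 | 0.93 | 0.91 | 0.91 | 0.88 | 0.90 | 0.93 |
| My run | fine-tuning SciBERT | 0.90 | 0.83 | 0.75 | 0.91 | 0.92 | 0.84 | 0.83 | 0.92 | 0.90 | 0.83 | 0.77 | 0.91 |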
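
To make the discrepancy easier to inspect, here is a minimal Python sketch that computes the per-metric gap between my run and the paper from the numbers in the table above. The names `score_gaps`, `paper`, and `my_run` are my own and are not part of the repository; the sketch assumes each group of four columns is ordered f1, F1, C-F1, E-F1.

```python
# Scores copied from the table above.
# Assumed column order within each test set: f1, F1, C-F1, E-F1.
METRICS = ["f1", "F1", "C-F1", "E-F1"]

paper = {
    "Neoplasm": [0.90, 0.87, 0.88, 0.92],
    "Glaucoma": [0.91, 0.89, 0.93, 0.91],
    "Mixed": [0.91, 0.88, 0.90, 0.93],
}

my_run = {
    "Neoplasm": [0.90, 0.83, 0.75, 0.91],
    "Glaucoma": [0.92, 0.84, 0.83, 0.92],
    "Mixed": [0.90, 0.83, 0.77, 0.91],
}


def score_gaps(reference, observed):
    """Return {test_set: {metric: observed - reference}}, rounded to 2 decimals."""
    return {
        test_set: {
            metric: round(obs - ref, 2)
            for metric, ref, obs in zip(METRICS, reference[test_set], observed[test_set])
        }
        for test_set in reference
    }


gaps = score_gaps(paper, my_run)
for test_set, by_metric in gaps.items():
    print(test_set, by_metric)
```

A negative value means my run scored below the paper on that metric.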

sequence_tagging_predictions_neoplasm.conll

sequence_tagging_predictions_mixed.conll

sequence_tagging_predictions_glaucoma.conll

eval_results_neoplasm.txt

eval_results_mixed.txt

eval_results_glaucoma.txt