Inject `gitlab_model` data into `results/v2` for evaluation
This MR injects the `gitlab_model` data collected in the first iteration of data collection into the current production data. The MR also includes the manual script created to inject the `gitlab_model` data into the current production data.
Note that the `gitlab_model` data differs from the other models' data in two ways:
- There are some instances where there was no response from the `gitlab_model`. This is not unique to the `gitlab_model`, as the other models also often return no response. However, in the second iteration of data collection, this was addressed for all models except the `gitlab_model` by ensuring a model response is captured, which was done in this MR: Follow-up: Ensure model response capture during... (!11 - merged). This was not done for the `gitlab_model` because the Triton server had already been shut down by the time v2 data was collected.
- Each language/prompt combination is run 10 times during data collection to generate `average_duration`. In the first iteration of data collection, the individual run durations were not stored. As a result, the `gitlab_model` data won't include a `raw_durations` array.
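
For anyone consuming the merged data, the two differences above could be handled along these lines. This is only a minimal sketch, not the script included in this MR: it assumes each result is a dict-like record, the `response` field name and both helper names are hypothetical, and only `average_duration` and `raw_durations` come from the description above.

```python
from statistics import mean
from typing import Optional


def duration_summary(record: dict) -> Optional[float]:
    """Return a mean duration for one record, or None if none is available."""
    raw = record.get("raw_durations")
    if raw:
        # Records that kept the individual runs: recompute from the raw values.
        return mean(raw)
    # gitlab_model records have no raw_durations, so fall back to the
    # precomputed average_duration (may itself be missing).
    return record.get("average_duration")


def has_response(record: dict) -> bool:
    """True if the record carries a non-empty model response.

    "response" is a hypothetical field name used for illustration only.
    """
    return bool((record.get("response") or "").strip())
```

Checking `has_response` before scoring keeps empty `gitlab_model` responses from being evaluated as if they were real completions.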