Avoid mocking API calls when running SWEBench
What does this merge request do and why?
This MR removes all the stubs previously needed to run SWEBench for the software development flow. SWEBench runs now behave like other flows: they execute against the GDK, with checkpoints saved to Postgres. This is achieved by introducing a host_project parameter, which names the project that hosts all Duo Workflow DB records. The software development flow doesn't make changes in GDK (READ_WRITE_FILES, local only), so this method is safe to use until we obtain seeds for the issue-to-MR flow evals.
Notes:
We use the host_project primarily to create workflow records in GDK and to successfully start software_development flows.
The flow itself runs in the local repository without pushing any changes to the GitLab project (READ_WRITE_FILES only),
so it’s safe to use a single empty project as a stub for all projects in the SWEBench dataset.
Ideally, we would push all SWEBench projects to GDK; however, this would require writing a separate Rake task
and optimizing it for seeding many heavy projects, which hasn’t been done yet.
We treat host_project as a temporary workaround that allows us to move away from using API stubs.
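The idea in the notes above can be illustrated with a minimal sketch. It is not the code from this MR: `WorkflowRequest`, `build_workflow_request`, `checkout_root`, and the example project path are hypothetical names chosen for illustration. Only `host_project` comes from this MR, and `instance_id`, `repo`, and `problem_statement` are fields of the SWEBench dataset. The sketch shows every instance being attached to the same host project for its workflow record, while the flow itself works on a per-instance local checkout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowRequest:
    """Hypothetical payload for creating a workflow record in GDK."""

    project_path: str    # project that owns the Duo Workflow DB record
    goal: str            # task description handed to the flow
    local_repo_dir: str  # local checkout the flow reads and writes


def build_workflow_request(
    instance: dict, host_project: str, checkout_root: str
) -> WorkflowRequest:
    """Attach a SWEBench instance to the shared host project.

    The instance's own repository is never pushed to GDK; it only
    exists as a local checkout under ``checkout_root`` (an assumption
    made for this sketch).
    """
    return WorkflowRequest(
        project_path=host_project,
        goal=instance["problem_statement"],
        local_repo_dir=f"{checkout_root}/{instance['instance_id']}",
    )


# Two instances from different upstream repos share one host project.
instances = [
    {"instance_id": "django__django-11099", "repo": "django/django",
     "problem_statement": "Fix username validator regex."},
    {"instance_id": "sympy__sympy-20590", "repo": "sympy/sympy",
     "problem_statement": "Symbol instances have __dict__."},
]
requests = [
    build_workflow_request(i, "gitlab-org/swebench-host", "/tmp/swebench")
    for i in instances
]
```

Because the record's project and the working directory are decoupled, a single empty project is enough for the whole dataset, which is what makes the workaround cheap compared to seeding every SWEBench repository into GDK.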
Merge request checklist
- I've run the affected pipeline(s) to validate that nothing is broken.
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed (will be covered in a follow-up).