Investigate and fix RepositoryImportWorker job failures
Over the last 7 days, RepositoryImportWorker had 528 jobs with status fail. Reviewing the failures, we can see examples such as:
Error importing repository into ... - No such file or directory @ rb_sysopen - [FILTERED]
Error importing repository into ... - command exited with error code 2: gzip: stdin: not in gzip format
Error importing repository ... - 13:creating repository: cloning repository: exit status 128, stderr: "fatal: could not read Username for [FILTERED] terminal prompts disabled\n"
Error importing repository ... into - Rate limit for this resource has been exceeded
Glancing through the logs, the first error (no such file or directory) is the biggest offender. We should investigate these errors to see if there's a way to fix them.
Proposed solution
Errors coming from external systems shouldn't fail the Sidekiq job or count against our error budget.
We should mark the Sidekiq job as done and the import as failed where appropriate, such as for all the "external" errors that we cannot fix in our code.
Note: This should be a quick win. The visibility of errors to the user won't get worse. Failures will still be visible to the user, at least the most recent one, because project_import_state has a last_error db column that we should populate, and it will be shown to the user after the import has been marked as failed.
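
To make the proposal concrete, here is a minimal, self-contained Ruby sketch of the idea, not the actual worker code. ImportState, RepositoryImportSketch, mark_as_failed, and the EXTERNAL_PATTERNS list are all hypothetical names made up for illustration; only last_error comes from the description above. Matching on message text is an assumption too, and the real fix might classify errors by exception class instead. The "No such file or directory" error is deliberately not treated as external here, since it still needs investigation.

```ruby
# Hypothetical stand-in for project_import_state; only last_error is from the issue.
ImportState = Struct.new(:status, :last_error) do
  def mark_as_failed(message)
    self.status = :failed
    self.last_error = message
  end

  def finish
    self.status = :finished
  end
end

# Hypothetical sketch of a worker that separates external failures from our own bugs.
class RepositoryImportSketch
  # Message fragments from the examples above that we assume point to an external cause.
  EXTERNAL_PATTERNS = [
    /not in gzip format/,
    /could not read Username/,
    /Rate limit for this resource has been exceeded/
  ].freeze

  def initialize(import_state, importer)
    @import_state = import_state
    @importer = importer # any callable that performs the import and may raise
  end

  # Returns normally for external failures, so the job would count as done,
  # and re-raises everything else, so genuine bugs still fail the job.
  def perform
    @importer.call
    @import_state.finish
  rescue StandardError => e
    raise unless external?(e)

    # The import is marked as failed and the message is kept for the user.
    @import_state.mark_as_failed(e.message)
  end

  private

  def external?(error)
    EXTERNAL_PATTERNS.any? { |pattern| pattern.match?(error.message) }
  end
end
```

With this shape, a rate-limit error leaves the import state as failed with last_error populated while perform returns without raising. An unrecognized error propagates unchanged and keeps counting as a job failure.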