Project export fails with message "command exited with error code 15"
Update
This is docs-only - see comment.
Summary
Project Export worker fails with the error message
command exited with error code 15 and Unable to save [FILTERED] into [FILTERED]
The error is raised in this line when the module Gitlab::ImportExport::CommandLineUtil
is compressing the exported project files in a tarball and the tar command returns with an exit code that is different from 0 (success).
CODE 15
On Linux, 15 is the signal number of SIGTERM, the signal sent to a process to request that it terminate gracefully. The code 15 reported here therefore indicates that the process was terminated by SIGTERM.
Usually, the SIGTERM is emitted by the operating system to the running processes before shutting down the server, warning the processes that they should stop what they are doing because they will be killed soon.
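The relationship between SIGTERM and the number 15 can be checked with a small Ruby script. This is a minimal sketch, not GitLab code: the helper name terminate_and_inspect is hypothetical, and sleep 30 only stands in for a long-running tar invocation.

```ruby
require 'open3'

# Hypothetical helper: start a child process, send it SIGTERM,
# and report how the operating system says it ended.
def terminate_and_inspect(cmd)
  Open3.popen2e(*cmd) do |stdin, _out, wait_thr|
    stdin.close
    Process.kill('TERM', wait_thr.pid) # same signal a shutdown would send
    status = wait_thr.value            # Process::Status of the child
    {
      signaled: status.signaled?,      # true when a signal ended the process
      termsig: status.termsig,         # number of the terminating signal
      exitstatus: status.exitstatus    # nil when the process did not exit normally
    }
  end
end

# `sleep 30` is a stand-in for a long `tar` run (assumption for illustration).
result = terminate_and_inspect(%w[sleep 30])
puts result.inspect
puts "SIGTERM is signal number #{Signal.list['TERM']}"
```

A process ended this way has no regular exit status. Ruby reports the terminating signal number instead, while a shell would typically report 128 + 15 = 143 for the same event.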
Project export problem
To use the tar command to compress or decompress files, Import/Export uses Open3, which creates a subprocess and makes the Ruby code wait for the execution of the subprocess and its exit code.
The problem is that if a SIGTERM is emitted during the execution of the tar command, the command stops and returns an error 15 to the Ruby process, indicating that it was interrupted. However, when that happens, Import/Export marks the export process as failed and doesn't put the worker back in the Sidekiq queue. Therefore the project is never exported.
Note that the SIGTERM usually isn't a problem when it's emitted during other steps of the export process because, in those cases, the worker is put back in the queue.
Steps to reproduce
- Request a project to be exported
- Keep monitoring the logs for the moment the project is being compressed
- Stop Sidekiq
- The error "command exited with error code 15, Unable to save [FILTERED] into [FILTERED]" will appear in the exporter.log
What is the expected correct behavior?
The project export worker should be retried when a code 15 is returned by the tar command.
Relevant logs and/or screenshots
Note
This issue might affect other features using the same code to compress and decompress files.
Possible affected features:
- Group export
- Create project from a template
- GitLab Importer
Possible fixes
tar exit code 15 should be handled properly.
In most cases, I believe we can retry the worker instead of marking it as failed.
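One way to handle this could be to separate "terminated by SIGTERM" from every other failure, so that the caller can decide to retry only the former. The following is a rough sketch under assumptions, not the actual Gitlab::ImportExport::CommandLineUtil code: run_command, CommandFailedError and CommandInterruptedError are hypothetical names, and the sh one-liner only simulates an interrupted tar.

```ruby
require 'open3'

# Hypothetical error classes for illustration; they are not part of GitLab.
class CommandFailedError < StandardError; end
class CommandInterruptedError < StandardError; end

# Hypothetical helper: run a command and raise a distinct error
# when it was terminated by SIGTERM rather than failing on its own.
def run_command(*cmd)
  output, status = Open3.capture2e(*cmd)
  return output if status.success?

  if status.signaled? && status.termsig == Signal.list['TERM']
    # Interrupted from outside (for example during a shutdown): safe to retry later.
    raise CommandInterruptedError, "#{cmd.first} was terminated by SIGTERM"
  end

  # Any other non-zero result is treated as a genuine failure.
  raise CommandFailedError, "#{cmd.first} failed (#{status.inspect}): #{output}"
end

begin
  # Simulates a command that receives SIGTERM while it is running.
  run_command('sh', '-c', 'kill -TERM $$')
rescue CommandInterruptedError => e
  puts "would re-enqueue the worker: #{e.message}"
rescue CommandFailedError => e
  puts "would mark the export as failed: #{e.message}"
end
```

In a worker, the interrupted case would more likely re-enqueue the job than retry in-process, since the SIGTERM usually means the current process is about to stop.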
Refinement note
@rodrigo.tomonari investigated why retries on ProjectExportWorker were disabled. In the relevant merge request, Disable sidekiq retries for Project/Group Impor... (!35344 - merged), it was identified that retrying the worker wasn't helpful since, most of the time, the job would fail in all retries.
Assuming that this is still the case, we mark this issue as blocked by the issues related to the Parallelize ProjectExportWorker to improve its ... (&7940 - closed) epic, as we believe the error will occur less often after breaking down the ProjectExportWorker into multiple workers.