Invalid Unicode error via Direct Transfer when tar fails on the source server
Summary
A customer experienced an Invalid Unicode error (GitLab 16.9.0) when attempting to transfer a repository via the Direct Transfer method. The exception was the following:
"Invalid Unicode [5a 3f d6 b7 1f] at 1",
"exception.backtrace":
"lib/gitlab/json.rb:130:in `dump'",
"lib/gitlab/json.rb:130:in `adapter_dump'",
"lib/gitlab/json.rb:52:in `dump'",
"lib/gitlab/json_logger.rb:49:in `dump_json'",
"app/services/bulk_imports/file_download_service.rb:94:in `raise_error'",
"lib/bulk_imports/file_downloads/validations.rb:44:in `validate_size!'",
"app/services/bulk_imports/file_download_service.rb:80:in `block (2 levels) in download_file'"
<TRUNCATED>
The repository_bundle_pipeline.rb calls the download service. The validate_size! method is called while the file is downloaded in chunks, so whatever is being sent as the size fails validation and surfaces as an Invalid Unicode error. size should be an integer, so the file being downloaded appears to be corrupted in some way.
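To illustrate the kind of check described above, here is a minimal sketch of size validation during a chunked download. It is not the code in file_download_service.rb or validations.rb: the names download_in_chunks and SizeLimitExceeded and the size_limit parameter are hypothetical. The sketch builds its error message from integers only, so the message stays serializable regardless of what the chunks contain.

```ruby
require 'stringio'

# Hypothetical error class for this sketch (not a GitLab class).
class SizeLimitExceeded < StandardError; end

# Writes each chunk to +io+ and raises as soon as the running total
# exceeds +size_limit+ bytes. Returns the total number of bytes written.
def download_in_chunks(chunks, io, size_limit:)
  total = 0
  chunks.each do |chunk|
    total += chunk.bytesize
    if total > size_limit
      # Only integers go into the message, never the raw chunk bytes.
      raise SizeLimitExceeded, "File size #{total} B exceeds limit of #{size_limit} B"
    end
    io.write(chunk)
  end
  total
end

# Usage: any enumerable of strings can stand in for the HTTP response body.
buffer = StringIO.new
download_in_chunks(["abc", "def"], buffer, size_limit: 1024)
```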
When we called the project relations export API on the source server manually, the LFS objects relation failed with the following error:
no space left on device - sendfile
command exited with error code 2: tar: ./lfs_objects.tar: file is the archive; not dumped\ntar: /tmp/bulk_imports1234/lfs_objects.tar: Wrote only 4096 of 10240 bytes\ntar: Error is not recoverable: exiting now
The error itself stems from tar, which cannot finish writing because the tmpdir directory runs out of space. In this case, Dir.tmpdir is /tmp, and the customer did not account for needing so much storage in /tmp during the direct transfer process. There are a few problems with this:
- The Invalid Unicode error is not useful and is the result of a partially downloaded file.
- We shouldn't partially download the file if tar fails on the source server. The export should fail and tell the target instance before the download starts.
- We should surface a clearer, user-friendly error ("Tar failed on source server", "Export could not be completed, check logs on source server", etc.).
- We should consider checksum validation on the files being transferred for data integrity.
- We don't document the usage of tmpdir, so it's unclear that you need available /tmp space to do the direct transfer due to the creation of the tarball.
- We should consider making tmpdir configurable (this may only affect local storage environments?).
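As a rough sketch of the checksum idea in the list above: the source could publish a SHA-256 digest alongside the archive, and the target could compare it after the download. How the digest would be transmitted is left open here, and the helper names sha256_of and verify_download! are hypothetical, not existing GitLab methods.

```ruby
require 'digest'

# Returns the SHA-256 hex digest of the file at +path+.
def sha256_of(path)
  Digest::SHA256.file(path).hexdigest
end

# Raises if the downloaded file does not match the digest reported by
# the source (how the source reports it is an assumption of this sketch).
def verify_download!(path, expected_sha256)
  actual = sha256_of(path)
  return true if actual == expected_sha256

  raise ArgumentError, "Checksum mismatch for #{path}: expected #{expected_sha256}, got #{actual}"
end
```

A truncated tarball like the one in this report would fail this check with an explicit mismatch message instead of a downstream parsing error.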
What is the current bug behavior?
Direct Transfer fails with Invalid Unicode when it attempts to download an incomplete file. The tar process can fail on the source server due to space issues or other problems, and the failure is not surfaced to the user.
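For anyone debugging the space issue on a source server, the following sketch prints the directory Ruby resolves as Dir.tmpdir and the free space on its filesystem. Ruby's Dir.tmpdir consults the TMPDIR, TMP and TEMP environment variables before falling back to the system default. Whether the export process picks those up in a given installation has not been verified here. The free_kib helper is hypothetical and assumes a POSIX df on the PATH and a filesystem name without spaces.

```ruby
require 'tmpdir'
require 'open3'

# Returns the free space, in KiB, of the filesystem that holds +dir+.
# Parses the "Available" column of `df -Pk` (POSIX output format).
def free_kib(dir)
  out, status = Open3.capture2('df', '-Pk', dir)
  raise "df failed for #{dir}" unless status.success?

  # The last line holds the data row:
  # Filesystem 1024-blocks Used Available Capacity Mounted-on
  Integer(out.lines.last.split[3])
end

tmp = Dir.tmpdir
puts "#{tmp}: #{free_kib(tmp)} KiB available"
```

Comparing that number against the size of the LFS objects and repository data before starting a transfer would catch the condition reported here.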
What is the expected correct behavior?
Direct Transfer should present a clearer error, and we should fail the transfer before attempting to download the file.
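One possible shape for the fail-fast behavior, as a sketch only: run the archive command, check its exit status, and raise a descriptive error instead of leaving a partial file to be downloaded. The names ExportFailed and run_export_command! are hypothetical and do not reflect how the export service is structured today.

```ruby
require 'open3'

# Hypothetical error class for this sketch (not a GitLab class).
class ExportFailed < StandardError; end

# Runs +cmd+ and raises ExportFailed with the exit code and stderr
# if the command does not succeed. Returns true on success.
def run_export_command!(*cmd)
  _stdout, stderr, status = Open3.capture3(*cmd)
  return true if status.success?

  # scrub replaces invalid byte sequences so the message can always be logged.
  raise ExportFailed,
        "Export could not be completed (exit code #{status.exitstatus}): #{stderr.scrub.strip}"
end

# Usage (paths are placeholders):
#   run_export_command!('tar', '-cf', '/path/to/lfs_objects.tar', '-C', '/path/to/export', '.')
```

With a check like this, the "Wrote only 4096 of 10240 bytes" failure would mark the export as failed on the source, and the target could report that status rather than attempting the download.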
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
Expand for output related to GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:env:info`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)
Results of GitLab application Check
Expand for output related to the GitLab application check
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:check SANITIZE=true`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`) (we will only investigate if the tests are passing)