Job Artifacts download at inconsistent sizes
Summary
A customer reported that job artifacts download at inconsistent sizes, as if the download does not run to completion. For example, repeated downloads of the same artifact produced these files:
-rw-r--r--@ 1 REDACTED staff 9641538 Apr 6 10:58 REDACTED-sql (1).log
-rw-r--r--@ 1 REDACTED staff 8748609 Apr 6 10:58 REDACTED-sql (2).log
-rw-r--r--@ 1 REDACTED staff 10862145 Apr 6 10:58 REDACTED-sql (3).log
-rw-r--r--@ 1 REDACTED staff 11102693 Apr 6 10:58 REDACTED-sql (4).log
-rw-r--r--@ 1 REDACTED staff 7773761 Apr 6 10:58 REDACTED-sql (5).log
-rw-r--r--@ 1 REDACTED staff 5443137 Apr 6 10:58 REDACTED-sql (6).log
-rw-r--r--@ 1 REDACTED staff 4972097 Apr 6 10:58 REDACTED-sql (7).log
-rw-r--r--@ 1 REDACTED staff 7302721 Apr 6 10:58 REDACTED-sql (8).log
-rw-r--r--@ 1 REDACTED staff 12181057 Apr 6 10:58 REDACTED-sql (9).log
-rw-r--r--@ 1 REDACTED staff 15564113 Apr 6 10:58 REDACTED-sql.log
Interestingly, only some members of the GitLab Support team were able to replicate this, and our own results were also inconsistent. Downloading an artifact through the browser, I saw similarly inconsistent sizes:
-rw-r--r--@ 1 cleveland staff 102M Apr 7 15:30 file0 (1).log
-rw-r--r--@ 1 cleveland staff 99M Apr 7 15:30 file0 (2).log
-rw-r--r--@ 1 cleveland staff 93M Apr 7 15:30 file0.log
The file above should be 123MB, but none of the downloads completed. Tailing the files suggests that they are truncated:
cleveland in ~/Downloads > tail file0.log
Content Redacted
[2020-03-19T18:11%
cleveland in ~/Downloads > tail file0\ \(1\).log
Content Redacted
[2020-03-19T18:11:23.959315][DEBUG] ...
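The first `tail` output above ends mid-timestamp, and the trailing `%` is consistent with the marker that shells such as zsh print when output lacks a final newline. As a rough sketch of how to flag such files, the helper below checks whether a file's last byte is a newline. The function names are hypothetical, and a missing final newline is only a hint of truncation for line-oriented logs, not proof: a download cut exactly at a line boundary would still pass.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of GitLab): flag log files whose last
# line is incomplete, which may indicate a truncated download.

# Succeeds (exit 0) only when the file is non-empty and its last byte
# is a newline (0x0a).
ends_with_newline() {
  local file="$1"
  # An empty or missing file has no final newline.
  [ -s "$file" ] || return 1
  # Dump the last byte as hex so binary content is handled safely.
  [ "$(tail -c 1 "$file" | od -An -tx1 | tr -d ' \n')" = "0a" ]
}

# Print one status line per file given on the command line.
report_partial_lines() {
  local file
  for file in "$@"; do
    if ends_with_newline "$file"; then
      printf '%-18s %s\n' 'ends-with-newline' "$file"
    else
      printf '%-18s %s\n' 'partial-last-line' "$file"
    fi
  done
}
```

For example, `report_partial_lines ~/Downloads/file0*.log` would list each downloaded copy with its status.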
Steps to reproduce
- Log in to GitLab.com as an Admin.
- Download a job artifact directly via your browser.
- Download it multiple times and take note of the size of the file.
OR
- Download a job artifact directly using the API, for example:
curl -o ~/output.log -H "PRIVATE-TOKEN: <YOUR TOKEN>" "https://gitlab.com/api/v4/projects/PROJECT_ID/jobs/JOB_ID/artifacts/path/to/artifact/file0.log"
- Download it multiple times and take note of the size of the file.
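The API steps above can be scripted so that sizes are recorded automatically. The following is a minimal sketch, not an official tool: the function names, the `GITLAB_TOKEN` environment variable, and the `attempt_N.log` output naming are assumptions made for illustration. It also prints curl's exit status, since curl documents exit code 18 ("Partial file") for transfers where only part of the file was received.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: download one artifact URL several times and
# check whether every copy has the same byte size.
# GITLAB_TOKEN is an assumed environment variable holding a personal
# access token.

# Print the size of a file in bytes, without surrounding whitespace.
file_size() {
  wc -c < "$1" | tr -d '[:space:]'
}

# download_repeatedly URL [COUNT] [OUTDIR]
# Saves each attempt as OUTDIR/attempt_N.log and prints one line per
# attempt with curl's exit status and the resulting size.
download_repeatedly() {
  local url="$1" count="${2:-5}" outdir="${3:-.}"
  local i=1 status out bytes
  while [ "$i" -le "$count" ]; do
    out="${outdir}/attempt_${i}.log"
    curl --silent --show-error --fail \
      --header "PRIVATE-TOKEN: ${GITLAB_TOKEN:-}" \
      --output "$out" "$url"
    status=$?
    # If curl wrote nothing at all, report a size of 0.
    if [ -f "$out" ]; then bytes=$(file_size "$out"); else bytes=0; fi
    printf 'attempt=%s curl_exit=%s bytes=%s\n' "$i" "$status" "$bytes"
    i=$((i + 1))
  done
}

# Succeeds (exit 0) when all given files have the same byte size.
sizes_consistent() {
  local first="" size file
  for file in "$@"; do
    size=$(file_size "$file")
    if [ -z "$first" ]; then
      first="$size"
    elif [ "$size" != "$first" ]; then
      return 1
    fi
  done
  return 0
}
```

A run might look like `download_repeatedly "https://gitlab.com/api/v4/projects/PROJECT_ID/jobs/JOB_ID/artifacts/path/to/artifact/file0.log" 10 /tmp/artifacts`, followed by `sizes_consistent /tmp/artifacts/attempt_*.log || echo "sizes differ"`. Equal sizes alone do not prove the copies are complete, so comparing against the known artifact size is still worthwhile.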
What is the current bug behavior?
Artifacts download at inconsistent sizes. The transfer seems to be killed or to time out before the download completes.
What is the expected correct behavior?
Every download of an artifact should complete and produce a file of the same size.
Relevant logs and/or screenshots
NA
Output of checks
This bug happens on GitLab.com
Other Notes
Both GitLab Support and the customer also tried downloading via the API, which seemed more consistent than the browser. Using the API is currently the workaround that appears to work best.
ZD Ticket (Internal): https://gitlab.zendesk.com/agent/tickets/152676