Job Artifacts download at inconsistent sizes


Summary

A customer reported that job artifacts download at inconsistent sizes, as if the download stops before it completes. For example, repeated downloads of one artifact produced these files:

-rw-r--r--@ 1 REDACTED staff 9641538 Apr 6 10:58 REDACTED-sql (1).log
-rw-r--r--@ 1 REDACTED staff 8748609 Apr 6 10:58 REDACTED-sql (2).log
-rw-r--r--@ 1 REDACTED staff 10862145 Apr 6 10:58 REDACTED-sql (3).log
-rw-r--r--@ 1 REDACTED staff 11102693 Apr 6 10:58 REDACTED-sql (4).log
-rw-r--r--@ 1 REDACTED staff 7773761 Apr 6 10:58 REDACTED-sql (5).log
-rw-r--r--@ 1 REDACTED staff 5443137 Apr 6 10:58 REDACTED-sql (6).log
-rw-r--r--@ 1 REDACTED staff 4972097 Apr 6 10:58 REDACTED-sql (7).log
-rw-r--r--@ 1 REDACTED staff 7302721 Apr 6 10:58 REDACTED-sql (8).log
-rw-r--r--@ 1 REDACTED staff 12181057 Apr 6 10:58 REDACTED-sql (9).log
-rw-r--r--@ 1 REDACTED staff 15564113 Apr 6 10:58 REDACTED-sql.log

Interestingly, only some members of the GitLab Support team were able to reproduce this, and their results were also inconsistent. Downloading an artifact through my browser, I also saw inconsistent sizes:

-rw-r--r--@  1 cleveland  staff   102M Apr  7 15:30 file0 (1).log
-rw-r--r--@  1 cleveland  staff    99M Apr  7 15:30 file0 (2).log
-rw-r--r--@  1 cleveland  staff    93M Apr  7 15:30 file0.log

The file above should be 123MB, but none of the downloads completed. Tailing the files suggests that they are truncated:

cleveland in ~/Downloads  > tail file0.log
Content Redacted
[2020-03-19T18:11%
cleveland in ~/Downloads  > tail file0\ \(1\).log
Content Redacted
[2020-03-19T18:11:23.959315][DEBUG] ...
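
The trailing `%` in the first `tail` output is most likely the shell's marker for output that does not end in a newline, which fits a log cut off mid-line. A minimal sketch for checking this without reading the file by eye is below. The function names are hypothetical, and the expected byte count has to come from somewhere you trust, such as the size shown on the job page.

```shell
# Sketch only: helper names are made up for this issue.

# Succeeds (exit 0) if the last byte of file $1 is a newline.
# A log that stops mid-line is a strong hint that the download was cut short.
ends_with_newline() {
  [ "$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')" = "0a" ]
}

# Succeeds if file $1 is exactly $2 bytes long.
# wc -c pads its output on some systems, so normalise it with $(( )).
has_expected_size() {
  [ "$(( $(wc -c < "$1") ))" -eq "$2" ]
}
```

For example, `ends_with_newline ~/Downloads/file0.log || echo "ends mid-line"` would flag the first file above. A file can still be truncated exactly at a line boundary, so the size check is the more reliable of the two.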

Steps to reproduce

  1. Log in to GitLab.com as an Admin.
  2. Download a job artifact directly via your browser.
  3. Download it multiple times and take note of the size of the file.

OR

  1. Download a job artifact directly using the API, for example:
     curl -o ~/output.log -H "PRIVATE-TOKEN: <YOUR TOKEN>" "https://gitlab.com/api/v4/projects/PROJECT_ID/jobs/JOB_ID/artifacts/path/to/artifact/file0.log"
  2. Download it multiple times and take note of the size of the file.
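
The API steps above can be scripted so that the sizes are collected in one run. This is a rough sketch, not a tested reproduction: `download_repeatedly`, `print_sizes`, the `GITLAB_TOKEN` variable and the `attempt-N.log` file names are all placeholders chosen for illustration.

```shell
# Sketch only: function names, GITLAB_TOKEN and file names are placeholders.

# Download URL $1 into directory $2, $3 times (default 5).
download_repeatedly() {
  url="$1"; dir="$2"; count="${3:-5}"
  mkdir -p "$dir"
  i=1
  while [ "$i" -le "$count" ]; do
    # --fail makes curl exit non-zero on HTTP errors instead of saving the error body.
    curl --silent --show-error --fail \
      --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
      --output "${dir}/attempt-${i}.log" "$url" \
      || echo "attempt ${i}: curl exited with status $?" >&2
    i=$((i + 1))
  done
}

# Print "<bytes> <path>" for every regular file in directory $1.
print_sizes() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    printf '%d %s\n' "$(( $(wc -c < "$f") ))" "$f"
  done
}
```

Usage would look like `download_repeatedly "https://gitlab.com/api/v4/projects/PROJECT_ID/jobs/JOB_ID/artifacts/path/to/artifact/file0.log" ./attempts 10; print_sizes ./attempts`. If the bug reproduces, the byte counts in the first column differ between attempts. A non-zero curl exit status on the short attempts would also help tell a dropped connection from a response that ended cleanly.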

What is the current bug behavior?

Artifacts download at inconsistent sizes. The transfer appears to be killed or to time out before the download completes.

What is the expected correct behavior?

Every download of an artifact should complete and produce a file of the same size.
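
One way to check this expectation across a set of downloaded copies is to compare their checksums and byte counts. The sketch below uses the POSIX `cksum` utility, which is a CRC and not a cryptographic hash, so it is only meant to catch truncation or corruption. `all_identical` is a hypothetical helper name.

```shell
# Sketch only: all_identical is a made-up helper name.

# Succeeds (exit 0) if every file given as an argument has the same
# CRC and byte count as the first one; prints the first mismatch otherwise.
all_identical() {
  first=""
  for f in "$@"; do
    # Reading from stdin makes cksum print "<crc> <bytes>" without a file name.
    current="$(cksum < "$f")"
    if [ -z "$first" ]; then
      first="$current"
    elif [ "$current" != "$first" ]; then
      echo "mismatch: $f ($current) differs from $1 ($first)" >&2
      return 1
    fi
  done
  return 0
}
```

For example, `all_identical ~/Downloads/file0*.log && echo "all copies match"` should print the message once the bug is fixed. Note that identical copies are not proof of a complete download if every attempt is truncated at the same point.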

Relevant logs and/or screenshots

N/A

Output of checks

This bug happens on GitLab.com

Other Notes

Both GitLab Support and the customer also tried downloading via the API, which seemed more consistent than the browser. Downloading via the API is the workaround that currently appears to work best.

ZD Ticket (Internal): https://gitlab.zendesk.com/agent/tickets/152676
