Artifact object store timing out on S3 upload
Summary
We've recently configured the consolidated object storage settings, specifically for artifacts. We're seeing Rack timeouts when attempting to upload artifacts past a certain size, which makes it appear as if object_store['direct_upload'] is not being honoured.
For a bit of context, we run our GitLab instance on-prem but run our runners/workers on EC2, with a Transit Gateway/VPN connecting the two environments. Direct upload should therefore be advantageous: it would circumvent the extra hops and push artifacts straight from the EC2 runner to S3.
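For reference, under the older storage-specific form we would have had to enable direct upload for artifacts explicitly. A minimal sketch of that legacy gitlab.rb excerpt (keys as we understand the 13.x Omnibus settings, values are placeholders), shown only to illustrate what we expect the consolidated form to do for us implicitly:

```ruby
# Legacy, storage-specific form -- illustration only; we are using the
# consolidated form shown under "Steps to reproduce" below.
gitlab_rails['artifacts_object_store_enabled'] = true
gitlab_rails['artifacts_object_store_direct_upload'] = true
gitlab_rails['artifacts_object_store_remote_directory'] = '<S3 bucket>'
gitlab_rails['artifacts_object_store_connection'] = {
  'provider' => 'AWS',
  'region' => 'eu-west-1',
  'aws_access_key_id' => 'REDACTED',
  'aws_secret_access_key' => 'REDACTED'
}
```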
Steps to reproduce
A local GitLab EE instance with the following config excerpt:
gitlab_rails['object_store']['enabled'] = false
gitlab_rails['object_store']['proxy_download'] = false
gitlab_rails['object_store']['connection'] = {
  'provider' => 'AWS',
  'region' => 'eu-west-1',
  'aws_access_key_id' => 'REDACTED',
  'aws_secret_access_key' => 'REDACTED'
}
# Only enable artifacts for now SRE-2978
gitlab_rails['object_store']['objects']['artifacts']['bucket'] = '<S3 bucket>'
gitlab_rails['object_store']['objects']['external_diffs']['enabled'] = false
gitlab_rails['object_store']['objects']['lfs']['enabled'] = false
gitlab_rails['object_store']['objects']['lfs']['bucket'] = '<S3 bucket>'
gitlab_rails['object_store']['objects']['uploads']['enabled'] = false
gitlab_rails['object_store']['objects']['packages']['enabled'] = false
gitlab_rails['object_store']['objects']['packages']['bucket'] = '<S3 bucket>'
gitlab_rails['object_store']['objects']['dependency_proxy']['enabled'] = false
gitlab_rails['object_store']['objects']['dependency_proxy']['bucket'] = '<S3 bucket>'
gitlab_rails['object_store']['objects']['terraform_state']['enabled'] = false
gitlab_rails['object_store']['objects']['terraform_state']['bucket'] = '<S3 bucket>'
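To see what the Rails application actually resolves from this excerpt, the artifact object-store settings can be inspected from a gitlab-rails console session. A quick check along these lines (assuming the consolidated settings are mapped onto the per-object-type object_store sections as documented):

```ruby
# Paste into `sudo gitlab-rails console`.
# We expected direct_upload to come back as true under the consolidated config.
store = Gitlab.config.artifacts.object_store
pp enabled:           store['enabled'],
   direct_upload:     store['direct_upload'],
   background_upload: store['background_upload'],
   proxy_download:    store['proxy_download'],
   remote_directory:  store['remote_directory']
```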
Example Project
$ cat .gitlab-ci.yml
stages:
  - test

check-var:
  stage: test
  script:
    # writes roughly 2 GB of random data (2,000,000 blocks x 1024 bytes) as the artifact payload
    - dd if=/dev/urandom of=file count=2MB bs=1024
  artifacts:
    name: target
    when: on_success
    expire_in: 8 hours
    paths:
      - file
What is the current bug behavior?
The job fails with an error ("failed to upload artifact after retry"). The exception log is attached below.
What is the expected correct behavior?
The runner uploads to S3 directly, avoiding any Rack timeouts.
Relevant logs and/or screenshots
{
"time": "2020-10-16T12:12:16.346Z",
"severity": "INFO",
"duration_s": 59.91948,
"db_duration_s": 0.21071,
"view_duration_s": 59.70877,
"status": 500,
"method": "POST",
"path": "/api/v4/jobs/2713078/artifacts",
"params": [
{
"key": "artifact_format",
"value": "zip"
},
{
"key": "artifact_type",
"value": "archive"
},
{
"key": "expire_in",
"value": "8 hours"
},
{
"key": "file.path",
"value": ""
},
{
"key": "file.remote_id",
"value": "1602850047-25364-0003-6587-34e0ec207f267de28cf32a247a1b7904"
},
{
"key": "file.sha1",
"value": "5cae96d99a568a41408d84b4c1ed0f10e19c6857"
},
{
"key": "file.name",
"value": "target.zip"
},
{
"key": "file.remote_url",
"value": "https://<s3-bucket-url>"
},
{
"key": "file.size",
"value": "2686263404"
},
{
"key": "file.md5",
"value": "0af835e9b105d09913ce18d71d2b41bc"
},
{
"key": "file.sha256",
"value": "3e9fc9b0dac0b9d09f8ce87f6f85b07fb2564c73e7a5c44db5a193a67fee27c9"
},
{
"key": "file.sha512",
"value": "b78b277c69e164c39e63f9df141367e4f812f95654ef10ee0d3e7db4f7ecefa767b7fd4c7cab36585fe5b6b09b5f37d31ea3dd51b9d77e6c7359388f8a4661e1"
},
{
"key": "file.etag",
"value": ""
},
{
"key": "file.gitlab-workhorse-upload",
"value": "<upload blob>"
},
{
"key": "metadata.remote_id",
"value": ""
},
{
"key": "metadata.size",
"value": "1532"
},
{
"key": "metadata.name",
"value": "metadata.gz"
},
{
"key": "metadata.path",
"value": "/tmp/metadata.gz424789619"
},
{
"key": "metadata.sha256",
"value": "dacfb7eb41b13caa0670afdef49aafff8eac570988f2882db01ce89778bfe00a"
},
{
"key": "metadata.remote_url",
"value": ""
},
{
"key": "metadata.sha512",
"value": "5a7512f7f24ee6817ee23168a123c03009432eb79810de8b5411056c4a5c120da9df17a56e3a4be9fe1735e1a93d758bd2d691e95f87d8aa7aa55b6136837ee8"
},
{
"key": "metadata.md5",
"value": "43c3803651ae466f05b6b2e405d23a32"
},
{
"key": "metadata.sha1",
"value": "2a0a6f9fbc7c93f6b9e58a90cee9b00dab05248c"
},
{
"key": "metadata.gitlab-workhorse-upload",
"value": "<metadata blob>"
},
{
"key": "file",
"value": null
},
{
"key": "metadata",
"value": null
}
],
"host": "<hostname>",
"remote_ip": "<remote ip>",
"ua": "gitlab-runner 13.4.1 (13-4-stable; go1.13.8; linux/amd64)",
"route": "/api/:version/jobs/:id/artifacts",
"exception.class": "Rack::Timeout::RequestTimeoutException",
"exception.message": "Request ran for longer than 60000ms",
"exception.backtrace": [
"config/initializers/carrierwave_patch.rb:20:in `copy_to'",
"app/uploaders/object_storage.rb:378:in `store!'",
"app/services/ci/create_job_artifacts_service.rb:130:in `block in persist_artifact'",
"app/services/ci/create_job_artifacts_service.rb:129:in `persist_artifact'",
"app/services/ci/create_job_artifacts_service.rb:47:in `execute'",
"lib/api/ci/runner.rb:291:in `block (2 levels) in <class:Runner>'",
"ee/lib/gitlab/middleware/ip_restrictor.rb:14:in `block in call'",
"ee/lib/gitlab/ip_address_state.rb:10:in `with'",
"ee/lib/gitlab/middleware/ip_restrictor.rb:13:in `call'",
"lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'",
"lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'",
"lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'",
"lib/gitlab/metrics/transaction.rb:61:in `run'",
"lib/gitlab/metrics/rack_middleware.rb:16:in `call'",
"lib/gitlab/request_profiler/middleware.rb:17:in `call'",
"lib/gitlab/jira/middleware.rb:19:in `call'",
"lib/gitlab/middleware/go.rb:20:in `call'",
"lib/gitlab/etag_caching/middleware.rb:13:in `call'",
"lib/gitlab/middleware/multipart.rb:222:in `block in call'",
"lib/gitlab/middleware/multipart.rb:60:in `with_open_files'",
"lib/gitlab/middleware/multipart.rb:221:in `call'",
"lib/gitlab/middleware/read_only/controller.rb:51:in `call'",
"lib/gitlab/middleware/read_only.rb:18:in `call'",
"lib/gitlab/middleware/same_site_cookies.rb:27:in `call'",
"lib/gitlab/middleware/basic_health_check.rb:25:in `call'",
"lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'",
"lib/gitlab/middleware/request_context.rb:23:in `call'",
"config/initializers/fix_local_cache_middleware.rb:9:in `call'",
"lib/gitlab/metrics/requests_rack_middleware.rb:60:in `call'",
"lib/gitlab/middleware/release_env.rb:12:in `call'"
],
"queue_duration_s": 4.946421,
"redis_calls": 5,
"redis_duration_s": 0.002505,
"redis_read_bytes": 734,
"redis_write_bytes": 346,
"redis_cache_calls": 5,
"redis_cache_duration_s": 0.002505,
"redis_cache_read_bytes": 734,
"redis_cache_write_bytes": 346,
"correlation_id": "8jYY0AFvmAa",
"meta.user": "<gitlab user>",
"meta.project": "<gitlab project path>",
"meta.root_namespace": "<gitlab project namespace",
"meta.subscription_plan": "free",
"meta.caller_id": "/api/:version/jobs/:id/artifacts"
}
Results of GitLab environment info
# gitlab-rake gitlab:env:info
System information
System: Ubuntu 16.04
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 2.6.6p146
Gem Version: 2.7.10
Bundler Version: 1.17.3
Rake Version: 12.3.3
Redis Version: 5.0.9
Git Version: 2.28.0
Sidekiq Version: 5.2.9
Go Version: unknown
GitLab information
Version: 13.4.1-ee
Revision: 4b9c8135cd9
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 11.7
URL: https://<url>
HTTP Clone URL: https://<url>/some-group/some-project.git
SSH Clone URL: git@<url>:some-group/some-project.git
Elasticsearch: yes
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers: saml
GitLab Shell
Version: 13.7.0
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
Git: /opt/gitlab/embedded/bin/git
Results of GitLab application Check
See the attached check.txt.
Possible fixes
So far I have only found the MR/line responsible for setting direct_upload
to true by default when using the consolidated config. Note that the backtrace above points at copy_to in config/initializers/carrierwave_patch.rb, which suggests the timeout happens while Rails finalises the already direct-uploaded object within S3 (file.remote_id is set), rather than during the upload from the runner itself.
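As a stopgap while this is investigated, we may raise the Rack timeout so that large artifact requests can at least finish. A sketch of the /etc/gitlab/gitlab.rb change we are considering (assuming the GITLAB_RAILS_RACK_TIMEOUT environment variable is still honoured by the Rack::Timeout initializer in 13.4):

```ruby
# Workaround only, not a fix: allow the /api/v4/jobs/:id/artifacts request
# more than the default 60 s before Rack::Timeout aborts it.
# Apply with `gitlab-ctl reconfigure`.
gitlab_rails['env'] = {
  'GITLAB_RAILS_RACK_TIMEOUT' => '600' # seconds
}
```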