504 Gateway Timeout trying to push to registry
Summary
Pushing large layers to the registry fails with a 504 Gateway Timeout. Smaller layers repeatedly have to retry.
Steps to reproduce
- Set up a repository with a Dockerfile that produces at least one large layer, e.g. one based on httpd:alpine or ruby:2.5.1-alpine
- Run the Auto DevOps pipeline, or clone the repository locally, log in to the registry, and try to push the image
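For the local variant of the second step, a minimal sketch might look like the following. It is a simplification: it retags the stock httpd:alpine image instead of building from a Dockerfile. The registry host is taken from the logs below; the project path `web/push-test` is a hypothetical placeholder, and the `DOCKER` override is only there to allow a dry run.

```shell
#!/bin/sh
# Registry host as seen in the logs; override via the environment if needed.
REGISTRY="${REGISTRY:-registry.mydomain.com}"
# Hypothetical project path and tag, adjust to a project you can push to.
IMAGE="${IMAGE:-$REGISTRY/web/push-test/httpd:alpine}"
# Set DOCKER=echo to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

push_test() {
  # Each step only runs if the previous one succeeded.
  "$DOCKER" login "$REGISTRY" &&
  "$DOCKER" pull httpd:alpine &&
  "$DOCKER" tag httpd:alpine "$IMAGE" &&
  "$DOCKER" push "$IMAGE"
}
```

Calling `push_test` after sourcing the snippet runs the four steps in order and stops at the first one that fails.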
Configuration used
Modified values.yaml from master
diff --git a/values.yaml b/values.yaml
index 0411c099..bf9c9219 100644
--- a/values.yaml
+++ b/values.yaml
@@ -22,7 +22,7 @@ global:
enabled: false
## doc/installation/deployment.md#deploy-the-community-edition
- # edition: ee
+ edition: ce
## doc/charts/globals.md#gitlab-version
# gitlabVersion: master
@@ -34,7 +34,7 @@ global:
allowClusterRoles: true
## doc/charts/globals.md#configure-host-settings
hosts:
- domain: example.com
+ domain: mydomain.com
# hostSuffix:
https: true
externalIP:
@@ -177,15 +177,15 @@ global:
## doc/charts/globals.md#incoming-email-settings
## doc/installation/deployment.md#incoming-email
incomingEmail:
- enabled: false
- address: ""
+ enabled: true
+ address: "gitlab@mydomain.com"
host: "imap.gmail.com"
port: 993
ssl: true
startTls: false
- user: ""
+ user: "gitlab@mydomain.com"
password:
- secret: ""
+ secret: "smtp-password"
key: password
mailbox: inbox
idleTimeout: 60
@@ -254,25 +254,25 @@ global:
## doc/installation/deployment.md#outgoing-email
## Outgoing email server settings
smtp:
- enabled: false
- address: smtp.mailgun.org
- port: 2525
- user_name: ""
+ enabled: true
+ address: smtp.sendgrid.net
+ port: 587
+ user_name: "apikey"
## doc/installation/secrets.md#smtp-password
password:
- secret: ""
+ secret: "smtp-password"
key: password
# domain:
- authentication: "plain"
- starttls_auto: false
+ authentication: "login"
+ starttls_auto: true
openssl_verify_mode: "peer"
## doc/installation/deployment.md#outgoing-email
## Email persona used in email sent by GitLab
email:
- from: ''
+ from: 'gitlab@mydomain.com'
display_name: GitLab
- reply_to: ''
+ reply_to: 'gitlab@mydomain.com'
subject_suffix: ''
## Timezone for containers.
@@ -301,10 +301,10 @@ global:
## End of global
## Settings to for the Let's Encrypt ACME Issuer
-# certmanager-issuer:
+certmanager-issuer:
## The email address to register certificates requested from Let's Encrypt.
## Required if using Let's Encrypt.
- # email: email@example.com
+ email: administrator@mydomain.com
## Installation & configuration of stable/cert-manager
## See requirements.yaml for current version
@@ -431,6 +431,7 @@ gitlab-runner:
create: true
runners:
locked: false
+ privileged: true
cache:
cacheType: s3
s3BucketName: runner-cache
Environment:
- K8s v1.14.1 on bare metal, deployed via kubespray v2.10.0 on single Debian 9.9 node. All default inventory settings except: flannel instead of calico, kube_proxy_mode iptables, and helm enabled
- Rook installed via latest helm chart (v1.0.1),
  helm install --namespace rook-ceph rook-release/rook-ceph --set agent.flexVolumeDirPath=/var/lib/kubelet/volume-plugins
  with cluster-test.yml and storageclass-test.yml. This storage class was made the default.
- GitLab installed via helm (11.11)
- nginx ingress service changed from LoadBalancer to externalIP
Stream of consciousness available in my forum post: https://forum.gitlab.com/t/runner-cannot-push-all-layers-to-registry-504-gateway-timeout-self-hosted-kubernetes-cloud-native-helm-chart/26720
Current behavior
Large layers (> 100 MB) are failing to push. Upload performance is slow.
Expected behavior
Pushes should be fast and should succeed every time.
Versions
- Chart: 1.9.0
- Platform:
  - Self-hosted: kubespray v2.10.0 on Debian 9.9
- Kubernetes: (kubectl version)
  - Client: 1.14.1
  - Server: 1.14.1
- Helm: (helm version)
  - Client: 2.13.1
  - Server: 2.13.1
fury@cassiopeia:~/gitlab$ helm ls
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
gitlab 4 Tue May 28 09:31:23 2019 DEPLOYED gitlab-1.9.0 11.11.0 default
ideal-worm 1 Mon May 27 13:43:07 2019 DEPLOYED rook-ceph-v1.0.1 rook-ceph
Relevant logs
End of job output:
$ docker push "$BUILD_IMAGE_NAME"
The push refers to repository [registry.mydomain.com/web/auto-build-image/master]
0a667c142b26: Preparing
1e8ec32b2f91: Preparing
a21c0a6873db: Preparing
c895bf09456a: Preparing
968d46c1d20e: Preparing
b87598efb2f0: Preparing
f1b5933fe4b5: Preparing
b87598efb2f0: Waiting
f1b5933fe4b5: Waiting
1e8ec32b2f91: Layer already exists
968d46c1d20e: Layer already exists
b87598efb2f0: Layer already exists
a21c0a6873db: Layer already exists
f1b5933fe4b5: Layer already exists
0a667c142b26: Pushed
c895bf09456a: Retrying in 5 seconds
c895bf09456a: Retrying in 4 seconds
c895bf09456a: Retrying in 3 seconds
c895bf09456a: Retrying in 2 seconds
c895bf09456a: Retrying in 1 second
c895bf09456a: Retrying in 10 seconds
c895bf09456a: Retrying in 9 seconds
c895bf09456a: Retrying in 8 seconds
c895bf09456a: Retrying in 7 seconds
c895bf09456a: Retrying in 6 seconds
c895bf09456a: Retrying in 5 seconds
c895bf09456a: Retrying in 4 seconds
c895bf09456a: Retrying in 3 seconds
c895bf09456a: Retrying in 2 seconds
c895bf09456a: Retrying in 1 second
c895bf09456a: Retrying in 15 seconds
c895bf09456a: Retrying in 14 seconds
c895bf09456a: Retrying in 13 seconds
c895bf09456a: Retrying in 12 seconds
c895bf09456a: Retrying in 11 seconds
c895bf09456a: Retrying in 10 seconds
c895bf09456a: Retrying in 9 seconds
c895bf09456a: Retrying in 8 seconds
c895bf09456a: Retrying in 7 seconds
c895bf09456a: Retrying in 6 seconds
c895bf09456a: Retrying in 5 seconds
c895bf09456a: Retrying in 4 seconds
c895bf09456a: Retrying in 3 seconds
c895bf09456a: Retrying in 2 seconds
c895bf09456a: Retrying in 1 second
c895bf09456a: Retrying in 20 seconds
c895bf09456a: Retrying in 19 seconds
c895bf09456a: Retrying in 18 seconds
c895bf09456a: Retrying in 17 seconds
c895bf09456a: Retrying in 16 seconds
c895bf09456a: Retrying in 15 seconds
c895bf09456a: Retrying in 14 seconds
c895bf09456a: Retrying in 13 seconds
c895bf09456a: Retrying in 12 seconds
c895bf09456a: Retrying in 11 seconds
c895bf09456a: Retrying in 10 seconds
c895bf09456a: Retrying in 9 seconds
c895bf09456a: Retrying in 8 seconds
c895bf09456a: Retrying in 7 seconds
c895bf09456a: Retrying in 6 seconds
c895bf09456a: Retrying in 5 seconds
c895bf09456a: Retrying in 4 seconds
c895bf09456a: Retrying in 3 seconds
c895bf09456a: Retrying in 2 seconds
c895bf09456a: Retrying in 1 second
received unexpected HTTP status: 504 Gateway Time-out
ERROR: Job failed: command terminated with exit code 1
Trying to push some other project from the command line:
75d4c78adb87: Pushed
d8a926acf044: Pushing [================> ] 44.33MB/132.9MB
02665233a680: Pushed
ac788347f35d: Pushed
b6b132e47ed1: Pushed
de242e95f9e6: Pushing [==================================================>] 55.05MB
eab3cc012638: Pushed
eb8c19b0dfbc: Pushing [==================================================>] 45.85MB
97cee2b72194: Pushed
ebf12965380b: Pushing [==================================================>] 4.464MB
(d8a926acf044 keeps retrying, ultimately fails with 504)
registry pod log (timestamps not necessarily the same as the above attempts):
time="2019-05-28T02:19:02.052602654Z" level=error msg="client disconnected during blob PATCH" auth.user.name=fury contentLength=-1 copied=24731056 error="http: unexpected EOF reading trailer" go.version=go1.11.2 http.request.host=registry.mydomain.com http.request.id=7ab88f1b-4e47-491f-aaeb-387ce10a70ce http.request.method=PATCH http.request.remoteaddr=10.233.64.1 http.request.uri="/v2/web/auto-build-image/master/blobs/uploads/705390eb-eb88-423e-9c72-d39433f35ac4?_state=BJG7VilD0PITqxhJw36_4PnZhiYu_56crVZFlxjFkUZ7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiNzA1MzkwZWItZWI4OC00MjNlLTljNzItZDM5NDMzZjM1YWM0IiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjE2OjI5LjI3MjUyOTM1NFoifQ%3D%3D" http.request.useragent="docker/18.09.6 go/go1.10.8 git-commit/481bc77 kernel/4.9.0-9-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.6 \(linux\))" vars.name="web/auto-build-image/master" vars.uuid=705390eb-eb88-423e-9c72-d39433f35ac4
...
10.233.64.117 - - [28/May/2019:02:30:19 +0000] "PATCH /v2/web/auto-build-image/master/blobs/uploads/1c334efa-e7df-43d1-a82b-e92fc7d67de8?_state=5vY-ETO868QRnHC5SwBmCUXvFcQDNzBSUj-8cye0aTp7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiMWMzMzRlZmEtZTdkZi00M2QxLWE4MmItZTkyZmM3ZDY3ZGU4IiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjMwOjA5LjEzODczNzk0MloifQ%3D%3D HTTP/1.1" 500 89 "" "docker/18.09.6 go/go1.10.8 git-commit/481bc77 kernel/4.9.0-9-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.6 \\(linux\\))"
time="2019-05-28T02:35:59.851231763Z" level=error msg="response completed with error" auth.user.name=fury err.code=unknown err.detail="client disconnected" err.message="unknown error" go.version=go1.11.2 http.request.host=registry.mydomain.com http.request.id=d5280657-f0f2-4420-9fed-c34ab26caa04 http.request.method=PATCH http.request.remoteaddr=10.233.64.1 http.request.uri="/v2/web/auto-build-image/master/blobs/uploads/1c334efa-e7df-43d1-a82b-e92fc7d67de8?_state=5vY-ETO868QRnHC5SwBmCUXvFcQDNzBSUj-8cye0aTp7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiMWMzMzRlZmEtZTdkZi00M2QxLWE4MmItZTkyZmM3ZDY3ZGU4IiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjMwOjA5LjEzODczNzk0MloifQ%3D%3D" http.request.useragent="docker/18.09.6 go/go1.10.8 git-commit/481bc77 kernel/4.9.0-9-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.6 \(linux\))" http.response.contenttype="application/json; charset=utf-8" http.response.duration=5m40.543384288s http.response.status=500 http.response.written=89 vars.name="web/auto-build-image/master" vars.uuid=1c334efa-e7df-43d1-a82b-e92fc7d67de8
2019/05/28 02:36:46 http: multiple response.WriteHeader calls
10.233.64.112 - - [28/May/2019:02:31:55 +0000] "PATCH /v2/web/auto-build-image/master/blobs/uploads/a949ebbb-cbd2-4b94-81e7-3b009ea95af1?_state=omQnXuq6SJvTunGR1PM8dBp7E4bYWu3nIs-bsJ3-lNN7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiYTk0OWViYmItY2JkMi00Yjk0LTgxZTctM2IwMDllYTk1YWYxIiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjMxOjQ5LjkzMjM2NzA1WiJ9 HTTP/1.1" 500 89 "" "docker/18.09.6 go/go1.10.8 git-commit/481bc77 kernel/4.9.0-9-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.6 \\(linux\\))"
time="2019-05-28T02:36:46.646680713Z" level=error msg="response completed with error" auth.user.name=fury err.code=unknown err.detail="client disconnected" err.message="unknown error" go.version=go1.11.2 http.request.host=registry.mydomain.com http.request.id=2300b4cc-7f3c-44d5-b370-c210b36fc50e http.request.method=PATCH http.request.remoteaddr=10.233.64.1 http.request.uri="/v2/web/auto-build-image/master/blobs/uploads/a949ebbb-cbd2-4b94-81e7-3b009ea95af1?_state=omQnXuq6SJvTunGR1PM8dBp7E4bYWu3nIs-bsJ3-lNN7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiYTk0OWViYmItY2JkMi00Yjk0LTgxZTctM2IwMDllYTk1YWYxIiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjMxOjQ5LjkzMjM2NzA1WiJ9" http.request.useragent="docker/18.09.6 go/go1.10.8 git-commit/481bc77 kernel/4.9.0-9-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.6 \(linux\))" http.response.contenttype="application/json; charset=utf-8" http.response.duration=4m51.363115069s http.response.status=500 http.response.written=89 vars.name="web/auto-build-image/master" vars.uuid=a949ebbb-cbd2-4b94-81e7-3b009ea95af1
nginx-ingress log (XXX is the actual IP of the node):
XXX - [XXX] - - [28/May/2019:02:22:19 +0000] "PUT /registry/docker/registry/v2/repositories/web/auto-build-image/master/_uploads/397a7c95-4c6d-446a-ba2f-0702e5855150/startedat HTTP/1.1" 200 0 "-" "aws-sdk-go/1.15.11 (go1.11.2; linux; amd64)" 1093 0.030 [default-gitlab-minio-svc-9000] 10.233.64.139:9000 0 0.028 200 21d1666fff24c3894bf07d2971164211
XXX - [XXX] - - [28/May/2019:02:22:21 +0000] "PUT /registry/docker/registry/v2/repositories/web/auto-build-image/master/_uploads/397a7c95-4c6d-446a-ba2f-0702e5855150/hashstates/sha256/0 HTTP/1.1" 200 0 "-" "aws-sdk-go/1.15.11 (go1.11.2; linux; amd64)" 1192 0.009 [default-gitlab-minio-svc-9000] 10.233.64.139:9000 0 0.008 200 6ae2a7087337387128b6a182e37ae273
XXX - [XXX] - - [28/May/2019:02:22:38 +0000] "POST /api/v4/jobs/request HTTP/1.1" 204 0 "-" "gitlab-runner 11.11.0 (11-11-stable; go1.8.7; linux/amd64)" 917 0.041 [default-gitlab-unicorn-8181] 10.233.64.114:8181 0 0.044 204 f22af74c035dfffce71006e65e200457
2019/05/28 02:23:48 [error] 2252#2252: *28463 upstream timed out (110: Connection timed out) while sending request to upstream, client: 10.233.64.1, server: registry.mydomain.com, request: "PATCH /v2/web/auto-build-image/master/blobs/uploads/397a7c95-4c6d-446a-ba2f-0702e5855150?_state=wL6Un-taQENi2-I9YMYXPyiXM6sqB8t-Yzdi5E1dUPJ7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiMzk3YTdjOTUtNGM2ZC00NDZhLWJhMmYtMDcwMmU1ODU1MTUwIiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjIyOjE4LjE1MTkzMzgwMloifQ%3D%3D HTTP/1.1", upstream: "http://10.233.64.110:5000/v2/web/auto-build-image/master/blobs/uploads/397a7c95-4c6d-446a-ba2f-0702e5855150?_state=wL6Un-taQENi2-I9YMYXPyiXM6sqB8t-Yzdi5E1dUPJ7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiMzk3YTdjOTUtNGM2ZC00NDZhLWJhMmYtMDcwMmU1ODU1MTUwIiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjIyOjE4LjE1MTkzMzgwMloifQ%3D%3D", host: "registry.mydomain.com"
10.233.64.1 - [10.233.64.1] - - [28/May/2019:02:23:48 +0000] "PATCH /v2/web/auto-build-image/master/blobs/uploads/397a7c95-4c6d-446a-ba2f-0702e5855150?_state=wL6Un-taQENi2-I9YMYXPyiXM6sqB8t-Yzdi5E1dUPJ7Ik5hbWUiOiJ3ZWIvYXV0by1idWlsZC1pbWFnZS9tYXN0ZXIiLCJVVUlEIjoiMzk3YTdjOTUtNGM2ZC00NDZhLWJhMmYtMDcwMmU1ODU1MTUwIiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDE5LTA1LTI4VDAyOjIyOjE4LjE1MTkzMzgwMloifQ%3D%3D HTTP/1.1" 504 160 "-" "docker/18.09.6 go/go1.10.8 git-commit/481bc77 kernel/4.9.0-9-amd64 os/linux arch/amd64 UpstreamClient(Docker-Client/18.09.6 \x5C(linux\x5C))" 24096768 86.710 [default-gitlab-registry-5000] 10.233.64.110:5000 0 86.707 504 219162a5049d01db9fa5677b9ae0cf68
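As a rough sanity check on "upload performance is slow", the 504 line above can be turned into a throughput figure. This assumes the two numbers after the user agent (24096768 and 86.710) are the request length in bytes and the request time in seconds, as in the default ingress-nginx access log format; if the log format was customized, the result does not apply. The helper name `throughput_mb_s` is made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: average throughput in MB/s (1 MB = 1,000,000 bytes).
# $1 = bytes received (assumed $request_length), $2 = seconds (assumed $request_time)
throughput_mb_s() {
  awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", bytes / secs / 1000000 }'
}

# Values copied from the 504 access log line above.
throughput_mb_s 24096768 86.710   # prints 0.28
```

Under that assumption the upload ran at about 0.28 MB/s, so the 132.9MB layer from the command-line attempt would need roughly eight minutes at the same rate.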
See my forum post for more info: https://forum.gitlab.com/t/runner-cannot-push-all-layers-to-registry-504-gateway-timeout-self-hosted-kubernetes-cloud-native-helm-chart/26720