gitlab-backup utility fails to back up some registry objects because crcmod is not installed
Summary
The `gitlab-backup` utility fails to back up some registry objects because crcmod is not enabled in the `task-runner` pod.
Steps to reproduce
The customer uses GCS as registry storage. When creating a backup using `backup-utility`, the backup throws the message below for 102 registry objects:
CommandException:
Downloading this composite object requires integrity checking with CRC32c,
but your crcmod installation isn't using the module's C extension, so the
hash computation will likely throttle download performance. For help
installing the extension, please see "gsutil help crcmod".
To download regardless of crcmod performance or to skip slow integrity
checks, see the "check_hashes" option in your boto config file.
NOTE: It is strongly recommended that you not disable integrity checks. Doing so
could allow data corruption to go undetected during uploading/downloading.
Copying gs://gitlab-registry-prod-772gc/docker/registry/v2/blobs/sha256/00/00983af1e2784562dbcd5738f26edad56024676d195439d8b98c8b10c35f7da5/data...
==> NOTE: You are downloading one or more large file(s), which would
run significantly faster if you enabled sliced object downloads. This
feature is enabled by default but requires that compiled crcmod be
installed (see "gsutil help crcmod").
In the end, the backup fails with `CommandException: 102 files/objects could not be copied/removed.`
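The error text points at the `check_hashes` option in the boto config. As a minimal sketch for reproducing the failure on a single object, the option can also be overridden per invocation with gsutil's top-level `-o` flag. The helper name `build_gsutil_cp` and the example paths are hypothetical, and nothing in this report confirms that `backup-utility` can pass such an override. The value `always` keeps integrity checking on and accepts the slow pure-Python hashing, so it is only a diagnostic aid, not a fix.

```python
import subprocess

# Values gsutil accepts for the boto option GSUtil:check_hashes.
ALLOWED_CHECK_HASHES = {"if_fast_else_fail", "if_fast_else_skip", "always", "never"}


def build_gsutil_cp(src, dst, check_hashes="always"):
    """Build a `gsutil cp` command that overrides check_hashes for one run.

    `always` keeps CRC32c verification enabled even without the compiled
    crcmod extension (slow, but corruption is still detected).
    """
    if check_hashes not in ALLOWED_CHECK_HASHES:
        raise ValueError(f"unsupported check_hashes value: {check_hashes!r}")
    return ["gsutil", "-o", f"GSUtil:check_hashes={check_hashes}", "cp", src, dst]


if __name__ == "__main__":
    # Hypothetical object path; substitute one of the failing registry blobs.
    cmd = build_gsutil_cp("gs://example-bucket/path/to/data", "/tmp/data")
    print(" ".join(cmd))
    # Uncomment to actually run it inside the pod:
    # subprocess.run(cmd, check=True)
```

If the copy succeeds with `always`, that would suggest the objects themselves are intact and only the missing compiled extension blocks the default `if_fast_else_fail` behavior.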
Configuration used
`values.yaml` provided by the customer:
USER-SUPPLIED VALUES:
' gitlab':
  task-runner:
    backups:
      cron:
        extraArgs: --skip registry
certmanager-issuer:
  email: gitlab.admin@REDACTED
gitlab:
  gitaly:
    persistence:
      enabled: true
      storageClass: ssd
  task-runner:
    backups:
      cron:
        enabled: true
        persistence:
          enabled: true
          size: 1680Gi
          storageClass: ssd
        schedule: 5 2 * * *
      objectStorage:
        backend: gcs
        config:
          gcpProject: REDACTED
          key: config
          secret: storage-config
    persistence:
      enabled: true
      size: 1250Gi
      storageClass: ssd
  unicorn:
    ingress:
      annotations:
        '"nginx.ingress.kubernetes.io/proxy-body-size"': 2048m
        '"nginx.ingress.kubernetes.io/proxy-connect-timeout"': 60
gitlab-runner:
  checkInterval: 10
  concurrent: 50
  install: true
  metrics:
    enabled: true
  rbac:
    create: true
  runners:
    builds:
      cpuLimit: "1"
      cpuRequests: 100m
      memoryLimit: 2048Mi
      memoryRequests: 512Mi
    cache:
      cacheShared: true
      cacheType: gcs
      gcsBucketname: gitlab-runner-cache-prod-772gc
      secretName: google-application-credentials
    helpers:
      cpuLimit: 200m
      cpuRequests: 25m
      image: gitlab/gitlab-runner-helper:x86_64-latest
      memoryLimit: 512Mi
      memoryRequests: 256Mi
    locked: false
    outputLimit: 8192
    privileged: false
    requestConcurrency: 25
    runUntagged: true
    services:
      cpuLimit: 200m
      cpuRequests: 50m
      memoryLimit: 512Mi
      memoryRequests: 128Mi
    tags: cloud
  unregisterRunners: true
global:
  appConfig:
    artifacts:
      bucket: gitlab-artifacts-prod-772gc
      connection:
        key: connection
        secret: gitlab-rails-storage
    backups:
      bucket: gitlab-backups-prod-772gc
      tmpBucket: gitlab-tmp-storage-prod-772gc
    enableUsagePing: true
    extra:
      googleAnalyticsId: REDACTED
    incomingEmail:
      address: gitlab@REDACTED
      enabled: false
      host: REDACTED
      password:
        secret: gitlab-smtp
      port: 143
      ssl: false
    ldap:
      servers:
        main:
          active_directory: true
          allow_username_or_email_login: false
          attributes:
            email: userPrincipalName
            first_name: givenName
            last_name: sn
            name: cn
            username: sAMAccountName
          base: REDACTED
          bind_dn: REDACTED
          host: REDACTED
          label: Gitlab AD
          lowercase_usernames: false
          method: plain
          password:
            secret: ldap-main-password
          port: 3268
          timeout: 20
          uid: sAMAccountName
          user_filter:
          verify_certificates: false
    lfs:
      bucket: gitlab-lfs-prod-772gc
      connection:
        key: connection
        secret: gitlab-rails-storage
    omniauth:
      allowBypassTwoFactor: []
      allowSingleSignOn:
      - azure_oauth2
      autoLinkLdapUser: true
      autoLinkSamlUser: false
      autoSignInWithProvider: null
      blockAutoCreatedUsers: false
      enabled: true
      externalProviders: []
      providers:
      - key: provider
        secret: azuread-secret
      syncProfileAttributes:
      - name
      - email
      syncProfileFromProvider:
      - azure_oauth2
    packages:
      bucket: gitlab-packages-prod-772gc
      connection:
        key: connection
        secret: gitlab-rails-storage
    pseudonymizer:
      bucket: gitlab-pseudonymizer-prod-772gc
      connection:
        key: connection
        secret: gitlab-rails-storage
    uploads:
      bucket: gitlab-uploads-prod-772gc
      connection:
        key: connection
        secret: gitlab-rails-storage
  edition: ee
  email:
    from: gitlab@REDACTED
    reply_to: gitlab@REDACTED
  gitaly:
    internal:
      names:
      - default
      - large
  grafana:
    enabled: true
  hosts:
    domain: REDACTED
    externalIP: REDACTED
    https: true
    registry:
      name: registry.gitlab.REDACTED
    ssh: null
  ingress:
    configureCertmanager: true
    enabled: true
    tls:
      enabled: true
  minio:
    enabled: false
  operator:
    enabled: false
  prometheus:
    persistence:
      enabled: true
      storageClass: ssd
  psql:
    database: gitlabhq-prod
    host: REDACTED
    password:
      key: password
      secret: gitlab-pg
    port: 5432
    username: gitlab_user
  rails:
    bootsnap:
      enabled: true
  redis:
    host: REDACTED
    password:
      enabled: false
  registry:
    bucket: gitlab-registry-prod-772gc
  smtp:
    address: smtp.REDACTED
    authentication: plain
    enabled: true
    openssl_verify_mode: none
    password:
      secret: gitlab-smtp
    port: 587
    starttls_auto: true
    user_name: postmaster@REDACTED
  unicorn:
    replicaCount: 4
postgresql:
  install: false
prometheus:
  install: true
redis:
  install: false
registry:
  ingress:
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: 0
      nginx.ingress.kubernetes.io/proxy-buffering: "off"
      nginx.ingress.kubernetes.io/proxy-connect-timeout: 180
      nginx.ingress.kubernetes.io/proxy-read-timeout: 900
      nginx.ingress.kubernetes.io/proxy-request-buffering: "off"
  storage:
    extraKey: gcs.json
    key: storage
    secret: gitlab-registry-storage
Current behavior
Backup of the GCS registry fails with the error shown above.
Expected behavior
Backup of the GCS registry should complete successfully.
Versions
- Chart: 3.3.2
- Platform:
  - Cloud: (GKE | AKS | EKS | ?)
  - Self-hosted: (OpenShift | Minikube | Rancher RKE | ?)
- Kubernetes:
  - Client: 1.16
  - Server: 1.14
- Helm: 3.1.2
Investigation
- Tried to install crcmod inside the `task-runner` pod by following the steps from `gsutil help crcmod` for various Python versions, but `gsutil version -l` still shows `compiled crcmod: False` after that, and the backup fails with the same error.
- If one tries to copy one of the problem objects directly via `gsutil cp` inside the pod, it fails in the same way.
- If one tries to copy it on another machine where crcmod is installed, it works fine.
- Therefore, it looks like we need a way to install `crcmod` inside the `task-runner` pod as a workaround and likely install it there by default in future versions.
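The two checks used in the investigation can be scripted, as in the rough sketch below. It assumes that `gsutil version -l` prints a line of the form `compiled crcmod: True|False` (as quoted above) and that crcmod exposes the private attribute `crcmod.crcmod._usingExtension`, which is not a documented API. Both function names are hypothetical. The import check only reflects the interpreter that runs the script, so it needs to be run with the same Python that gsutil uses inside the pod.

```python
import importlib
import re


def compiled_crcmod_from_version_output(text):
    """Parse `gsutil version -l` output.

    Returns True or False for the `compiled crcmod:` line, or None if the
    line is absent.
    """
    match = re.search(r"^compiled crcmod:\s*(True|False)\s*$", text, re.MULTILINE)
    if match is None:
        return None
    return match.group(1) == "True"


def crcmod_uses_c_extension():
    """Report whether crcmod is importable and backed by its C extension."""
    try:
        impl = importlib.import_module("crcmod.crcmod")
    except ImportError:
        return False  # crcmod is not installed for this interpreter
    # Assumption: `_usingExtension` is a private flag set by crcmod at import.
    return bool(getattr(impl, "_usingExtension", False))


if __name__ == "__main__":
    # Sample text containing only the line quoted in this report.
    sample = "compiled crcmod: False\n"
    print("gsutil reports compiled crcmod:", compiled_crcmod_from_version_output(sample))
    print("this interpreter has compiled crcmod:", crcmod_uses_c_extension())
```

Comparing the two results may help tell apart "crcmod was built for a different interpreter than the one gsutil runs" from "the C extension failed to compile at all".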
Edited by Alexandr Tanayno