
AKS with Azure Blob Storage loses the top level folder in blob containers during a restore

Summary

When GitLab runs on Azure Kubernetes Service (AKS) with Azure Blob Storage, performing a restore drops the top-level directory in each blob storage container. This results in a broken GitLab install.

Workaround

#4682 (comment 1382359254)

Steps to reproduce

  1. Set up the GitLab chart in an AKS cluster with Blob Storage.

  2. In a project, create an issue and upload some images to the description (this creates upload objects)

  3. Run a backup

  4. Take note of where the files live in the gitlab-uploads blob container (the @hashed folder)


  5. (Optional) Delete the contents of the gitlab-uploads blob container

  6. Run a restore using the backup tarball generated previously

  7. Take note of where the files live in the gitlab-uploads blob container (the @hashed folder is now missing, and the 4b folder, which was originally inside @hashed, appears at the top level instead)

Configuration used


certmanager-issuer:
  email: <sanitized>
gitlab:
  toolbox:
    backups:
      cron:
        persistence:
          enabled: true
      objectStorage:
        backend: azure
        config:
          key: connection
          secret: backup-azure-creds
    persistence:
      enabled: true
global:
  appConfig:
    backups:
      bucket: gitlab-backups
      tmpBucket: tmp
    object_store:
      connection:
        secret: gitlab-rails-storage
      enabled: true
  hosts:
    domain: <sanitized>
    externalIP: <sanitized>
  minio:
    enabled: false
postgresql:
  image:
    tag: 13.6.0

Current behavior

When using AKS with Blob Storage, restored objects in blob containers are missing the top-level directory, which breaks the restored GitLab environment.

Expected behavior

When using AKS with Blob Storage, restored objects in blob containers should retain the top level directory.

Versions

  • Chart: 6.11.2 / 15.11.2
  • Platform:
    • Cloud: AKS
  • Kubernetes: (kubectl version)
    • Client: 1.25
    • Server: 1.25
  • Helm: (helm version)
    • Client: 3.10.2
    • Server:

Relevant logs

This appears to be a problem with the underlying azcopy sync command, which does not seem to copy the specified parent directory itself, only its contents: https://gitlab.com/gitlab-org/build/CNG/-/blob/aed8365644d1f40893585433cc028a316988815a/gitlab-toolbox/scripts/lib/object_storage_backup.rb#L111-113

The paths sent to azcopy are generated by Dir.glob("#{extracted_tar_path}/*") here: https://gitlab.com/gitlab-org/build/CNG/-/blob/aed8365644d1f40893585433cc028a316988815a/gitlab-toolbox/scripts/lib/object_storage_backup.rb#L184-186

So in the reproduction steps above, this would have generated something like /srv/gitlab/tmp/uploads/@hashed, along with any other top level folders from /srv/gitlab/tmp/uploads/*.
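
To illustrate the glob step, here is a minimal Ruby sketch that recreates a tiny extracted uploads tree in a temporary directory and lists what the same glob pattern returns. The directory layout, the file name example.png and the helper name top_level_entries are made up for illustration; only the glob pattern comes from the linked script.

```ruby
require "tmpdir"
require "fileutils"

# Returns the top-level entries under extracted_tar_path, using the same
# glob pattern as the linked script. (Helper name is hypothetical.)
def top_level_entries(extracted_tar_path)
  Dir.glob("#{extracted_tar_path}/*").sort
end

Dir.mktmpdir do |tmp|
  # Assumed layout: uploads/@hashed/4b/example.png
  extracted_tar_path = File.join(tmp, "uploads")
  FileUtils.mkdir_p(File.join(extracted_tar_path, "@hashed", "4b"))
  FileUtils.touch(File.join(extracted_tar_path, "@hashed", "4b", "example.png"))

  # Each entry is a directory one level below uploads/, e.g. .../uploads/@hashed
  puts top_level_entries(extracted_tar_path).map { |path| File.basename(path) }
end
```

Each path returned this way is a top-level directory, so whatever tool receives it has to recreate that directory name at the destination for the layout to survive.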

If we run this locally with an uploads.tar.gz extracted from a backup tarball:

❯ azcopy sync "uploads/@hashed" https://<storage-account>.blob.core.windows.net/gitlab-uploads/\?<sas-token>

The result is that the @hashed folder is not uploaded into the blob container but its contents are.
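
The difference can be modelled without touching Azure. The sketch below is not azcopy's implementation; it only computes the blob names that would result if (a) the contents of the source directory land at the container root, as observed above, versus (b) the source directory name is kept as a prefix, which is the expected layout. The function names and the sample file are hypothetical.

```ruby
require "pathname"

# Blob names when only the *contents* of source_dir land at the container
# root (the behaviour observed with the sync command above).
def blob_names_contents_only(source_dir, files)
  base = Pathname.new(source_dir)
  files.map { |file| Pathname.new(file).relative_path_from(base).to_s }
end

# Blob names when the source directory itself is preserved as a prefix
# (the layout the restore is expected to produce).
def blob_names_with_parent(source_dir, files)
  parent = Pathname.new(source_dir).parent
  files.map { |file| Pathname.new(file).relative_path_from(parent).to_s }
end

files = ["uploads/@hashed/4b/example.png"]
puts blob_names_contents_only("uploads/@hashed", files) # 4b/example.png
puts blob_names_with_parent("uploads/@hashed", files)   # @hashed/4b/example.png
```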

As an alternative, the azcopy copy --recursive command might work better here.
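
A rough sketch of what that alternative invocation could look like from Ruby follows. It is untested against the chart and is not the toolbox script's code: the helper name azcopy_copy_command is hypothetical, and it assumes the container URL already carries the SAS token. As far as we can tell from the azcopy documentation, copying a directory with --recursive creates that directory under the destination, which is the behaviour wanted here.

```ruby
# Builds the argv for a recursive azcopy copy of one top-level directory.
# container_url is assumed to already include the SAS token.
# (Helper name is hypothetical; this is a sketch, not the toolbox script.)
def azcopy_copy_command(source_dir, container_url)
  ["azcopy", "copy", source_dir, container_url, "--recursive"]
end

cmd = azcopy_copy_command(
  "uploads/@hashed",
  "https://<storage-account>.blob.core.windows.net/gitlab-uploads/?<sas-token>"
)
puts cmd.join(" ")

# To actually run it (requires azcopy on PATH and a real SAS URL):
#   system(*cmd) or raise "azcopy copy failed"
```

Passing the command as an array to system avoids shell interpretation of the ? in the SAS URL, so no backslash escaping is needed.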

Edited by Peter Lu