AKS with Azure Blob Storage loses the top level folder in blob containers during a restore
Summary
When a restore is performed on GitLab running in AKS with Azure Blob Storage, the top-level directory in each blob storage container is lost. This results in a broken GitLab install.
Workaround
Steps to reproduce
- Set up the GitLab Chart within an AKS cluster with Blob Storage.
- In a project, create an issue and upload some images to the description (this creates upload objects).
- Run a backup.
- Take note of where the files live in the gitlab-uploads blob container (the @hashed folder).
- (Optional) Delete the contents of the gitlab-uploads blob container.
- Run a restore using the backup tarball generated previously.
- Take note of where the files live in the gitlab-uploads blob container (the @hashed folder is now missing and we see the 4b folder instead, which was originally inside @hashed).
Configuration used
certmanager-issuer:
  email: <sanitized>
gitlab:
  toolbox:
    backups:
      cron:
        persistence:
          enabled: true
      objectStorage:
        backend: azure
        config:
          key: connection
          secret: backup-azure-creds
    persistence:
      enabled: true
global:
  appConfig:
    backups:
      bucket: gitlab-backups
      tmpBucket: tmp
    object_store:
      connection:
        secret: gitlab-rails-storage
      enabled: true
  hosts:
    domain: <sanitized>
    externalIP: <sanitized>
  minio:
    enabled: false
postgresql:
  image:
    tag: 13.6.0
Current behavior
When using AKS with Blob Storage, restored objects in blob containers are missing the top-level directory, which breaks the restored GitLab environment.
Expected behavior
When using AKS with Blob Storage, restored objects in blob containers should retain the top level directory.
Versions
- Chart: 6.11.2 / 15.11.2
- Platform:
  - Cloud: AKS
- Kubernetes: (kubectl version)
  - Client: 1.25
  - Server: 1.25
- Helm: (helm version)
  - Client: 3.10.2
  - Server:
Relevant logs
This appears to be a problem with the underlying azcopy sync command, as this command does not seem to copy the parent directory specified: https://gitlab.com/gitlab-org/build/CNG/-/blob/aed8365644d1f40893585433cc028a316988815a/gitlab-toolbox/scripts/lib/object_storage_backup.rb#L111-113
The paths sent to azcopy are generated here in Dir.glob("#{extracted_tar_path}/*"): https://gitlab.com/gitlab-org/build/CNG/-/blob/aed8365644d1f40893585433cc028a316988815a/gitlab-toolbox/scripts/lib/object_storage_backup.rb#L184-186
So in the reproduction steps above, this would have generated something like /srv/gitlab/tmp/uploads/@hashed, along with any other top-level folders from /srv/gitlab/tmp/uploads/*.
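To illustrate the globbing step, here is a minimal Ruby sketch that builds a stand-in for an extracted uploads tarball in a temporary directory and lists what the same glob pattern returns. The directory layout and file name are illustrative assumptions, not taken from a real backup.

```ruby
require "tmpdir"
require "fileutils"

# Build a throwaway directory that mimics an extracted uploads tarball.
# The layout below is illustrative only.
top_level = Dir.mktmpdir do |extracted_tar_path|
  FileUtils.mkdir_p(File.join(extracted_tar_path, "@hashed", "4b"))
  FileUtils.touch(File.join(extracted_tar_path, "@hashed", "4b", "image.png"))

  # Same glob pattern as quoted above: matches top-level entries only.
  Dir.glob("#{extracted_tar_path}/*").map { |entry| File.basename(entry) }
end

puts top_level.inspect
```

Each entry returned by the glob is itself the top-level folder, so whether that folder name survives depends entirely on how the upload tool treats its source argument.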
If we run this locally with an uploads.tar.gz extracted from a backup tarball:
❯ azcopy sync "uploads/@hashed" https://<storage-account>.blob.core.windows.net/gitlab-uploads/\?<sas-token>
The result is that the @hashed folder is not uploaded into the blob container, but its contents are.
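The observed behavior can be modeled with plain path arithmetic. The sketch below does not call azcopy; `blob_names` is a hypothetical helper and the file path is illustrative. It only contrasts the blob names produced when a directory's contents are placed at the container root with the names produced when the directory itself is kept.

```ruby
# Model only: computes the blob names each approach would produce.
# It does not call azcopy or touch any storage account.
def blob_names(local_files, source_dir, keep_parent:)
  # Keeping the parent means paths are made relative to the directory
  # that contains source_dir, rather than to source_dir itself.
  base = keep_parent ? File.dirname(source_dir) : source_dir
  local_files.map { |file| file.delete_prefix("#{base}/") }
end

files = ["uploads/@hashed/4b/image.png"] # illustrative path

# Contents only: matches the behavior reported for the sync command.
puts blob_names(files, "uploads/@hashed", keep_parent: false).inspect
# Parent kept: the layout the restore is expected to produce.
puts blob_names(files, "uploads/@hashed", keep_parent: true).inspect
```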
As an alternative, the azcopy copy --recursive command might work better here.
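As a rough sketch of that alternative, the Ruby snippet below builds (but does not run) one `azcopy copy --recursive` command per top-level folder. `azcopy_copy_commands` is a hypothetical helper, not code from the toolbox script, and the URL is the same placeholder used in the command above. It assumes that `azcopy copy` with a directory source and `--recursive` uploads the directory itself rather than only its contents; this has not been verified here against a real storage account.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: builds (but does not run) one `azcopy copy --recursive`
# command per top-level entry of the extracted tarball.
def azcopy_copy_commands(extracted_tar_path, container_url)
  Dir.glob("#{extracted_tar_path}/*").sort.map do |entry|
    ["azcopy", "copy", entry, container_url, "--recursive"]
  end
end

# Placeholder URL; substitute a real storage account and SAS token to use it.
container_url = "https://<storage-account>.blob.core.windows.net/gitlab-uploads/?<sas-token>"

commands = Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "@hashed")) # illustrative layout
  azcopy_copy_commands(dir, container_url)
end

commands.each { |cmd| puts cmd.join(" ") }
```

Keeping each command as an argument array means it could later be passed to `system(*cmd)` without shell quoting, which matters for the `?` in the SAS URL.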