Geo: Document backups --s3tool awscli option for both manual and cron run jobs
Notice: this issue is more related to https://gitlab.com/gitlab-org/build/CNG/-/tree/master/gitlab-toolbox, but I don't have permission to create an issue there.
Summary
We configured GitLab to use an S3 bucket as CI job artifact storage. The backup-utility uses `s3cmd` to download objects from S3 buckets, which, in our case, is less reliable than `awscli`. Most of our backup jobs crashed with this error:

```
WARNING: Remote file ''. S3Error: 404 (NoSuchKey): The specified key does not exist.
ERROR: S3 error: 404 (NoSuchKey): The specified key does not exist.
```
This is probably related to how `s3cmd` works internally. On start, the tool first scans the whole bucket and builds a file list in memory; it only starts downloading objects after the full scan finishes: https://github.com/s3tools/s3cmd/blob/v2.2.0/s3cmd#L1341
Per the documentation, Sidekiq deletes expired artifacts every 7 minutes: https://docs.gitlab.com/ee/administration/job_artifacts.html#expiring-artifacts.
The problem, I suspect, is that unless the backup job finishes within 7 minutes, some files will be deleted from the remote bucket after they were scanned into the in-memory list but before `s3cmd` gets a chance to download them.
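The suspected race can be reproduced in miniature with plain shell, using local directories as a stand-in for the S3 bucket (this is a hypothetical simulation of the "scan first, download later" pattern, not an invocation of `s3cmd` itself):

```shell
# Local directories stand in for the S3 bucket and the backup target.
work=$(mktemp -d)
mkdir -p "$work/bucket" "$work/dest"
touch "$work/bucket/artifact-1" "$work/bucket/artifact-2"

# Step 1: build the complete object list up front, as s3cmd does.
files=$(ls "$work/bucket")

# Step 2: before the download loop reaches artifact-2, the expiry
# sweep deletes it from the "bucket".
rm "$work/bucket/artifact-2"

# Step 3: the loop still holds the stale entry; the fetch fails the
# same way s3cmd fails with 404 (NoSuchKey).
for f in $files; do
  cp "$work/bucket/$f" "$work/dest/" 2>/dev/null \
    || echo "WARNING: stale listing entry: $f"
done
```

Running this prints `WARNING: stale listing entry: artifact-2`; only `artifact-1` lands in the destination.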
It is also worth noting that the backup-utility uses `tar` to pack everything into a single tarball and uploads that. The packaging step uses a lot of memory. It may be worth looking into splitting the backup into multiple tarballs and/or applying compression.
Steps to reproduce
(Please provide the steps to reproduce the issue)
I haven't had a chance to confirm my theory, but after we switched from `s3cmd` to `awscli`, our backup job runs successfully without problems. I think `awscli` runs the scan and the downloads in parallel, so it is less likely to run into `s3cmd`'s problem.
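For reference, a manual download with `awscli` looks like the following (the bucket name and destination path are placeholders, and the commands need AWS credentials to actually run):

```shell
# Mirror the artifacts prefix to a local directory; aws s3 sync
# overlaps paginated listing with transfers instead of building a
# complete object list before the first download.
aws s3 sync s3://my-artifacts-bucket/artifacts ./artifacts

# Transfer concurrency can be tuned via the CLI's S3 settings if needed.
aws configure set default.s3.max_concurrent_requests 20
```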
Configuration used
(Please provide a sanitized version of the configuration used wrapped in a code block (```yaml))
Between the working and non-working configurations, the only difference is the choice of s3tool used by `backup-utility`:
Not working:

```yaml
- args:
    - /bin/bash
    - '-c'
    - cp /etc/gitlab/.s3cfg $HOME/.s3cfg && backup-utility
```
Working:

```yaml
- args:
    - /bin/bash
    - '-c'
    - cp /etc/gitlab/.s3cfg $HOME/.s3cfg && backup-utility --s3tool awscli
```
Current behavior
Without explicitly setting `--s3tool` to `awscli`, the backup job crashes with this error:

```
WARNING: Remote file ''. S3Error: 404 (NoSuchKey): The specified key does not exist.
ERROR: S3 error: 404 (NoSuchKey): The specified key does not exist.
```
Expected behavior
The backup job should not crash when some objects disappear from the remote bucket; it is normal for expired artifacts to be deleted from the S3 bucket.
Versions
- Chart: 5.9.1 (GitLab version: 14.9.1)
- Platform:
- Cloud: EKS
- Kubernetes: (`kubectl version`)
  - Client:
  - Server:
- Helm: (`helm version`)
  - Client:
  - Server:
Relevant logs
(Please provide any relevant log snippets you have collected, using code blocks (```) to format)