Disable `rake gitlab:cleanup:remote_upload_files` with bucket prefix
What does this MR do and why?
In GitLab 15.0, !91307 (merged) added official support for configuring an object storage bucket with a prefix. However, this Rake task doesn't take the bucket prefix into account and attempts to iterate through all files in the bucket. If the dry-run flag is disabled, the task also moves all of those files into the lost and found directory.
Unfortunately, Fog does not appear to provide an easy, cloud-agnostic way to list all files in a bucket with a prefix filter. In addition, at least for Azure Blob Storage, there isn't a standardized way to distinguish a directory from a regular file using Fog.
For these reasons, and to prevent data loss, this commit disables the Rake task when a bucket prefix is configured.
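As a rough illustration of the guard described above, here is a minimal sketch. It is not the code from this MR: the helper names `split_bucket_prefix` and `ensure_no_bucket_prefix!` are hypothetical, and it assumes a configured bucket such as `test1/uploads` is split at the first `/` into a bucket name and a prefix.

```ruby
# Hypothetical helper (not part of GitLab): split a configured bucket
# value such as "test1/uploads" into the bucket name and an optional prefix.
# Assumption: everything after the first "/" is treated as the prefix.
def split_bucket_prefix(configured_bucket)
  bucket, prefix = configured_bucket.split('/', 2)
  [bucket, prefix]
end

# Hypothetical guard: abort before touching any files when a prefix is set.
# The message mirrors the one shown in the validation steps of this MR.
def ensure_no_bucket_prefix!(configured_bucket)
  _bucket, prefix = split_bucket_prefix(configured_bucket)
  return if prefix.nil? || prefix.empty?

  raise "Uploads are configured with a bucket prefix '#{prefix}'.\n" \
        "Unfortunately, prefixes are not supported for this Rake task."
end
```

Calling `ensure_no_bucket_prefix!("uploads")` returns without error, while `ensure_no_bucket_prefix!("test1/uploads")` raises before any listing or moving of files would start.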
This Rake task should probably be dropped for a number of reasons:
- It's not used very much.
- It requires bucket permissions to list all files. Our documented permissions for object storage buckets don't grant these privileges.
- It requires walking through the entire bucket and running a database query for each batch of files. This is quite slow, and it doesn't scale well as more objects are added.
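
To make the scaling concern in the last point concrete, here is a small sketch of a batch-wise walk. It is an illustration only: `tracked_paths` is a hypothetical in-memory stand-in for the database query, and the batch size of 1000 is an assumption, not a value taken from the Rake task.

```ruby
require 'set'

# Hypothetical stand-in for the database lookup: returns the subset of
# `paths` that are known. In the real task this would be a SQL query.
def tracked_paths(paths, known)
  paths.select { |path| known.include?(path) }
end

# Walk every key in the bucket in batches, issuing one lookup per batch.
# Returns the untracked keys and the number of lookups performed.
def find_untracked(bucket_keys, known, batch_size: 1000)
  queries = 0
  untracked = []

  bucket_keys.each_slice(batch_size) do |batch|
    queries += 1
    untracked.concat(batch - tracked_paths(batch, known))
  end

  [untracked, queries]
end

keys = (1..2500).map { |i| "file#{i}" }
known = Set.new(keys.first(2400))
untracked, queries = find_untracked(keys, known, batch_size: 1000)
puts "#{untracked.size} untracked keys found with #{queries} lookups"
```

The number of lookups grows linearly with the number of objects in the bucket, and every object must be listed at least once, which is why the task needs list permissions on the whole bucket.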
Relates to https://gitlab.com/gitlab-org/gitlab/-/issues/415537
How to set up and validate locally
- Configure object storage with a bucket prefix. For example, in `gdk.yml`:

      object_store:
        connection:
          provider: AzureRM
          azure_storage_account_name: REDACTED-STORAGE
          azure_storage_access_key: REDACTED-KEY
        consolidated_form: true
        enabled: true
        objects:
          artifacts:
            bucket: test1/artifacts
          external_diffs:
            bucket: test1/external_diffs
          lfs:
            bucket: test1/lfs
          uploads:
            bucket: test1/uploads
          packages:
            bucket: test1/packages
          dependency_proxy:
            bucket: test1/dependency-proxy
          terraform_state:
            bucket: test1/terraform
          pages:
            bucket: test1/pages
          ci_secure_files:
            bucket: test1/ci_secure_files

- Run `gdk reconfigure`.
- Run `bin/rake gitlab:cleanup:remote_upload_files`.
You should see:
% bin/rake gitlab:cleanup:remote_upload_files
rake aborted!
Uploads are configured with a bucket prefix 'uploads'.
Unfortunately, prefixes are not supported for this Rake task.
/Users/stanhu/gdk-ee/gitlab/lib/gitlab/cleanup/remote_uploads.rb:24:in `run!'
/Users/stanhu/gdk-ee/gitlab/lib/tasks/gitlab/cleanup.rake:47:in `block (3 levels) in <main>'
Tasks: TOP => gitlab:cleanup:remote_upload_files
(See full trace by running task with --trace)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.