Completed pending direct uploads are not being deleted from the Redis hash.
This is a follow-up from https://gitlab.com/gitlab-org/gitlab/-/merge_requests/123615#note_1468205110.

There have been reports that on some self-managed instances, uploads that completed successfully are being deleted by the stale uploads cleanup worker. A sample from https://gitlab.com/gitlab-org/gitlab/-/merge_requests/123615#note_1472507632:

> We see that the upload seem to completed successfully in the logs: `Pending direct upload completed` message gets written at `2023-07-13T06:59:28.588Z`, however, it gets removed as stale anyway at `2023-07-13T10:00:06.902Z` anyway.
>
> ```
> gitlab-rails/application_json.log:{"severity":"INFO","time":"2023-07-13T06:59:28.588Z","correlation_id":"01H570RBB4RGWTHT78KC73C66Y","meta.caller_id":"POST /api/:version/jobs/:id/artifacts","meta.remote_ip":"<SANITIZED>","meta.feature_category":"build_artifacts","meta.user":"<SANITIZED>","meta.user_id":8,"meta.project":"<SANITIZED>","meta.root_namespace":"<SANITIZED>","meta.client_id":"user/8","meta.pipeline_id":22207,"meta.job_id":182119,"message":"Pending direct upload completed","redis_key":"artifacts:<PROJECT_PATH_SANITIZED>/@final/d6/c8/bdf63308e681fceef44a8087428209961e850f98f5a1dc991930f78c23c8"}
> gitlab-rails/application_json.log:{"severity":"INFO","time":"2023-07-13T10:00:06.902Z","meta.caller_id":"ObjectStorage::DeleteStaleDirectUploadsWorker","correlation_id":"5e5bd4875a4970fa3648b38cc7ffeee6","meta.root_caller_id":"Cronjob","meta.feature_category":"build_artifacts","meta.client_id":"ip/","message":"Pending direct upload deleted","redis_key":"artifacts:<PROJECT_PATH_SANITIZED>/@final/d6/c8/bdf63308e681fceef44a8087428209961e850f98f5a1dc991930f78c23c8"}
> ```
>
> The customer seems to have many `pending_entries` and every time the worker removes several artifacts:
> ```
> "extra.object_storage_delete_stale_direct_uploads_worker.total_pending_entries": 556,
> "extra.object_storage_delete_stale_direct_uploads_worker.total_deleted_stale_entries": 6,
> "extra.object_storage_delete_stale_direct_uploads_worker.execution_timeout": false,
> ```

There is a log entry for `Pending direct upload completed`, which is logged [here](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/object_storage/pending_direct_upload.rb#L31) when the upload is [completed](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/uploaders/object_storage.rb#L34). The suspected cause is therefore that the call to `redis.hdel(KEY, key)` is not actually removing the entry. Redis does not raise an error when a given field does not exist; it just returns 0. So for now we are assuming that `hdel` is not finding the given key to delete.

With that said, what we are trying to figure out is why the given key, or [file.path](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/uploaders/object_storage.rb#L34), does not seem to match the entry during `redis.hdel`.
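To illustrate the assumption about `redis.hdel`, here is a minimal sketch of how a mismatched field is silently ignored. It does not use the real GitLab code or a real Redis server: `FakeRedisHash`, `complete_upload`, the value of `KEY`, and the example paths are all hypothetical stand-ins, and the leading-slash difference is only one arbitrary way two strings could fail to match. The stand-in mirrors the documented `HDEL` behaviour of returning the number of fields removed, and 0 when the field does not exist. Checking that return value in the real completion path might be one way to confirm or rule out the assumption.

```ruby
require "logger"
require "stringio"

# Hypothetical in-memory stand-in for the Redis hash commands used here.
# Like real Redis, hdel returns the number of fields removed and returns 0,
# without raising, when the field is not present.
class FakeRedisHash
  def initialize
    @hashes = Hash.new { |hashes, key| hashes[key] = {} }
  end

  def hset(key, field, value)
    added = @hashes[key].key?(field) ? 0 : 1
    @hashes[key][field] = value
    added
  end

  def hdel(key, field)
    return 0 unless @hashes[key].key?(field)

    @hashes[key].delete(field)
    1
  end

  def hlen(key)
    @hashes[key].size
  end
end

# Hypothetical hash name, used only for this sketch.
KEY = "pending_direct_uploads"

# Hypothetical helper: removes the entry and reports whether anything was
# actually deleted, instead of assuming that hdel succeeded.
def complete_upload(redis, logger, field)
  removed = redis.hdel(KEY, field)

  if removed.zero?
    logger.warn("Pending direct upload not found in hash: #{field}")
  else
    logger.info("Pending direct upload completed: #{field}")
  end

  removed
end

redis = FakeRedisHash.new
log_output = StringIO.new
logger = Logger.new(log_output)

# Entry written when the upload is prepared (made-up path).
prepared_field = "artifacts:group/project/@final/d6/c8/bdf6"
redis.hset(KEY, prepared_field, Time.now.to_i)

# Completing with a field that differs in any way leaves the entry in place.
mismatch_result = complete_upload(redis, logger, "artifacts:/group/project/@final/d6/c8/bdf6")
remaining_after_mismatch = redis.hlen(KEY)

# Completing with the exact same field removes it.
match_result = complete_upload(redis, logger, prepared_field)
remaining_after_match = redis.hlen(KEY)
```

In this sketch the mismatched call returns 0 and the entry stays in the hash, which is the state that would later let a cleanup worker treat a completed upload as stale.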