Completed pending direct uploads are not being deleted from the Redis hash.
This is a follow-up from !123615 (comment 1468205110).
There have been reports that some self-managed instances are encountering problems with their uploads being deleted by the stale uploads cleanup worker.
As a sample from !123615 (comment 1472507632):
The logs show that the upload seems to have completed successfully: the "Pending direct upload completed" message is written at 2023-07-13T06:59:28.588Z. However, the same entry is removed as stale anyway at 2023-07-13T10:00:06.902Z.
gitlab-rails/application_json.log:{"severity":"INFO","time":"2023-07-13T06:59:28.588Z","correlation_id":"01H570RBB4RGWTHT78KC73C66Y","meta.caller_id":"POST /api/:version/jobs/:id/artifacts","meta.remote_ip":"<SANITIZED>","meta.feature_category":"build_artifacts","meta.user":"<SANITIZED>","meta.user_id":8,"meta.project":"<SANITIZED>","meta.root_namespace":"<SANITIZED>","meta.client_id":"user/8","meta.pipeline_id":22207,"meta.job_id":182119,"message":"Pending direct upload completed","redis_key":"artifacts:<PROJECT_PATH_SANITIZED>/@final/d6/c8/bdf63308e681fceef44a8087428209961e850f98f5a1dc991930f78c23c8"}
gitlab-rails/application_json.log:{"severity":"INFO","time":"2023-07-13T10:00:06.902Z","meta.caller_id":"ObjectStorage::DeleteStaleDirectUploadsWorker","correlation_id":"5e5bd4875a4970fa3648b38cc7ffeee6","meta.root_caller_id":"Cronjob","meta.feature_category":"build_artifacts","meta.client_id":"ip/","message":"Pending direct upload deleted","redis_key":"artifacts:<PROJECT_PATH_SANITIZED>/@final/d6/c8/bdf63308e681fceef44a8087428209961e850f98f5a1dc991930f78c23c8"}
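To check how many uploads are affected in this way, the two log messages can be correlated on redis_key. The following is a minimal sketch, not part of GitLab: the method name completed_then_deleted is hypothetical, and it assumes each line is a JSON log entry (optionally prefixed with a file name, as in the grep output above) that carries the time, message, and redis_key fields.

```ruby
require "json"
require "time"

COMPLETED_MESSAGE = "Pending direct upload completed"
DELETED_MESSAGE = "Pending direct upload deleted"

# Returns { redis_key => seconds between completion and stale deletion }
# for every key that was logged as completed and later logged as deleted.
def completed_then_deleted(lines)
  completed_at = {}
  affected = {}

  lines.each do |line|
    # Skip an optional "path/to/file.log:" prefix in front of the JSON body.
    start = line.index("{")
    next unless start

    entry = begin
      JSON.parse(line[start..-1])
    rescue JSON::ParserError
      next
    end

    key = entry["redis_key"]
    next unless key && entry["time"]

    time = Time.iso8601(entry["time"])
    case entry["message"]
    when COMPLETED_MESSAGE
      completed_at[key] = time
    when DELETED_MESSAGE
      affected[key] = time - completed_at[key] if completed_at.key?(key)
    end
  end

  affected
end
```

It could be run over a log file with something like completed_then_deleted(File.foreach("gitlab-rails/application_json.log")). Any key it returns was reported as completed and still cleaned up as stale afterwards.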
The customer seems to have many pending_entries, and every time the worker runs it removes several artifacts:
"extra.object_storage_delete_stale_direct_uploads_worker.total_pending_entries": 556,
"extra.object_storage_delete_stale_direct_uploads_worker.total_deleted_stale_entries": 6,
"extra.object_storage_delete_stale_direct_uploads_worker.execution_timeout": false,
Given that there is a log entry for "Pending direct upload completed", which is logged at the point where the upload is completed, the suspected cause is that the call to redis.hdel(KEY, key) is not actually removing the entry. Redis does not raise an error when a given field does not exist; it just returns 0. So for now we are assuming that it is not finding the given key to delete. With that said, what we are trying to figure out is why the given key or file.path does not seem to match the entry during redis.hdel.
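One way to confirm this assumption would be to check the return value of hdel and log when nothing was removed. The sketch below is illustrative only. FakeRedis is a hypothetical in-memory stand-in so the example runs without a server; it mimics the Redis behavior of returning the number of fields removed. The hash name in KEY, the complete_upload helper, and the leading-slash mismatch in the usage lines are all assumptions for demonstration, not the confirmed cause.

```ruby
# Hypothetical in-memory stand-in for a Redis client (hash commands only).
# Like Redis HDEL, hdel returns the number of fields actually removed.
class FakeRedis
  def initialize
    @hashes = Hash.new { |hash, name| hash[name] = {} }
  end

  def hset(name, field, value)
    @hashes[name][field] = value
  end

  def hdel(name, field)
    return 0 unless @hashes[name].key?(field)

    @hashes[name].delete(field)
    1
  end

  def hkeys(name)
    @hashes[name].keys
  end
end

KEY = "pending_direct_uploads" # assumed hash name, for illustration

# Removes the pending entry and warns when no field matched,
# instead of silently ignoring a 0 return value.
def complete_upload(redis, field)
  removed = redis.hdel(KEY, field)
  if removed.zero?
    warn "hdel removed nothing for #{field.inspect}; " \
         "#{redis.hkeys(KEY).size} pending entries remain"
  end
  removed
end

redis = FakeRedis.new
redis.hset(KEY, "artifacts:group/project/@final/aa/bb/cc", "1689231568")

# A field that differs by a single character (hypothetical mismatch) is not found.
complete_upload(redis, "artifacts:/group/project/@final/aa/bb/cc") # => 0
# The exact field is removed.
complete_upload(redis, "artifacts:group/project/@final/aa/bb/cc")  # => 1
```

Logging the attempted field alongside the remaining entries would make it possible to compare the two strings directly the next time the deletion silently fails.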