S3 cache with RoleARN returns 403 instead of 404 for non-existent cache

Summary

When RoleARN is configured for the S3 cache with AuthenticationType = "iam", GitLab Runner receives a 403 Forbidden error instead of the expected 404 Not Found when checking for a cache that does not exist yet.

This does not occur when RoleARN is removed from the configuration: the runner then correctly receives 404 for a non-existent cache. The AWS CLI with the same RoleARN also correctly returns 404, which suggests this is a runner-specific issue.

Steps to reproduce

  1. Configure GitLab Runner with S3 cache using RoleARN:

       [runners.cache]
         Type = "s3"
         Shared = true
         [runners.cache.s3]
           BucketName = "<bucket-name>"
           BucketLocation = "us-east-1"
           AuthenticationType = "iam"
           RoleARN = "arn:aws:iam::<account-id>:role/gitlab-runners-cache-sa"

  2. Run a pipeline where the cache does not exist yet
  3. Observe the runner returns 403 Forbidden for the non-existent cache

Expected behavior

When checking for non-existent cache with RoleARN configured:

  • Runner should receive 404 Not Found response (same as when RoleARN is not configured)
  • Runner should proceed normally with cache creation
  • Behavior should match AWS CLI when using the same RoleARN
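
For context on why the two status codes matter: the S3 HeadObject documentation states that a request for a missing key returns 404 only when the caller also has s3:ListBucket on the bucket, and 403 otherwise. Whether that is what happens in the runner's RoleARN code path is not established by this report. The sketch below only illustrates how a cache check could distinguish the outcomes; `CacheCheck` and `classify_head_object` are hypothetical names, not part of GitLab Runner.

```python
from enum import Enum


class CacheCheck(Enum):
    """Possible outcomes of a HeadObject-based cache check (illustrative only)."""
    HIT = "hit"
    MISS = "miss"
    AMBIGUOUS = "ambiguous"
    ERROR = "error"


def classify_head_object(status_code: int) -> CacheCheck:
    """Map a HeadObject HTTP status code to a cache-check outcome.

    Hypothetical helper: GitLab Runner's real logic may differ.
    """
    if status_code == 200:
        return CacheCheck.HIT
    if status_code == 404:
        # The key does not exist and the caller is allowed to learn that.
        return CacheCheck.MISS
    if status_code == 403:
        # Either a genuine permission denial, or a missing key seen by
        # credentials that lack s3:ListBucket on the bucket.
        return CacheCheck.AMBIGUOUS
    return CacheCheck.ERROR
```

Under this sketch, the "Without RoleARN" log in this report corresponds to `MISS` and the "With RoleARN" log to `AMBIGUOUS`.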

Actual behavior

With RoleARN configured:

  • Runner receives 403 Forbidden instead of 404 Not Found for non-existent cache
  • This breaks normal cache workflow

Without RoleARN configured:

  • Runner correctly receives 404 Not Found for non-existent cache ✓

Using AWS CLI with same RoleARN:

  • Correctly receives 404 Not Found for non-existent cache ✓
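
The AWS CLI comparison above can also be repeated from an SDK, which makes the returned status code explicit. The following is a minimal sketch, assuming boto3 is installed and the default credentials are allowed to assume the role; `head_status` and `client_for_role` are hypothetical helper names, and the session name is arbitrary.

```python
def head_status(s3_client, bucket: str, key: str) -> int:
    """Return the HTTP status code of a HeadObject call for bucket/key."""
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
        return 200
    except Exception as exc:
        # botocore's ClientError carries the parsed response on `.response`;
        # anything without that attribute is not an API error, so re-raise.
        response = getattr(exc, "response", None)
        if not isinstance(response, dict):
            raise
        return response["ResponseMetadata"]["HTTPStatusCode"]


def client_for_role(role_arn: str, region: str = "us-east-1"):
    """Build an S3 client from temporary credentials for role_arn."""
    import boto3  # imported lazily: needs boto3 and working AWS credentials

    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName="cache-403-check",  # arbitrary session name
    )["Credentials"]
    return boto3.client(
        "s3",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
```

Calling `head_status(client_for_role("<role-arn>"), "<bucket-name>", "<missing-key>")` with a key that does not exist would be expected to return 404 if the role's permissions behave as they do with the AWS CLI.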

Relevant logs and/or screenshots

  • Without RoleARN:

       Checking cache for test-cache-protected...
       WARNING: file does not exist
       Failed to extract cache

  • With RoleARN:

       WARNING: blob (key "project/19/test-cache-protected-protected") (code=Unknown): operation error S3: HeadObject, https response error StatusCode: 403, RequestID: <>, HostID: <> api error Forbidden: Forbidden
       Failed to extract cache

Environment

  • GitLab Runner deployed on EKS
  • Runner Version: 18.4.3
  • S3 cache with IAM authentication
  • Verified with AWS CLI that the RoleARN has correct permissions

Impact

This issue does not break anything, but the 403 causes confusion when the cache is checked for the first time. Removing RoleARN restores the expected output, but that workaround blocks use cases that require RoleARN:

  • Multipart uploads (which require RoleARN)

Related Issues

Related to #38484 (closed) (S3 Express with RoleARN)

Edited by Govind Kumar