AWS IAM: Ensure pre-signed URLs will last at least X minutes

Summary

When using AWS IAM authentication for S3 (Object Storage), the token used for authentication is temporary and has a limited lifetime (as short as 1 hour, for example): https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html

When GitLab uses the above type of authentication to generate pre-signed upload/download URLs with an expiration time of 1 day, the following caveat from https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html begins to apply:

If you created a presigned URL using a temporary token, then the URL expires when the token expires, even if the URL was created with a later expiration time.

This makes it possible for the URL provided to the requestor to be unusable on arrival, or when it is used or reused later (for example, through GitLab Pages' GitLab API cache, which keeps reusing the URLs for up to 10 minutes).
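As a rough illustration of the caveat above, the effective lifetime of a pre-signed URL is the earlier of the requested expiration and the token expiration. The sketch below is plain Ruby; `effective_url_expiry` and the timestamps are hypothetical and are not part of GitLab or Fog.

```ruby
# Hypothetical helper: models the AWS caveat that a pre-signed URL
# stops working when the temporary token that signed it expires.
def effective_url_expiry(signed_at:, requested_ttl:, token_expires_at:)
  requested_expiry = signed_at + requested_ttl
  [requested_expiry, token_expires_at].min
end

signed_at = Time.utc(2022, 1, 1, 12, 0, 0)
token_expires_at = signed_at + (20 * 60) # assumed: token has 20 minutes left

expiry = effective_url_expiry(
  signed_at: signed_at,
  requested_ttl: 24 * 60 * 60, # URL requested with a 1 day expiration
  token_expires_at: token_expires_at
)

# Seconds the URL actually stays usable, despite the 1 day request.
usable_seconds = expiry - signed_at
```

With these assumed values the URL is usable for 20 minutes rather than 1 day, because the token expiration wins.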

Fog, a library GitLab uses for generating pre-signed URLs, has a mechanism to ensure that the temporary IAM token in use is fresh, but its window is very narrow: it keeps using a temporary token until 15 seconds before its expiration: https://github.com/fog/fog-aws/blob/4c3c55b32a2e1e6b970caed468178fe39d3a0687/lib/fog/aws/credential_fetcher.rb#L126-L130

In the worst case, due to the above, a pre-signed URL may become unusable within 15 seconds of it being provided to any requestor.
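The worst case can be shown with a simplified model of a threshold-based refresh check. This is not Fog's actual code; `refresh_needed?` and the values below are assumptions made for illustration only.

```ruby
# Simplified model of a refresh check: the token is considered stale
# only once it is within `threshold` seconds of expiring.
def refresh_needed?(now:, token_expires_at:, threshold:)
  now > token_expires_at - threshold
end

now = Time.utc(2022, 1, 1, 12, 0, 0)
token_expires_at = now + 16 # assumed: token has 16 seconds left

# With a 15 second threshold the token is still treated as fresh,
# so a URL signed now would stop working 16 seconds later.
narrow = refresh_needed?(now: now, token_expires_at: token_expires_at, threshold: 15)

# With a 15 minute threshold the same token would be refreshed first.
wide = refresh_needed?(now: now, token_expires_at: token_expires_at, threshold: 15 * 60)
```

Here `narrow` is `false` and `wide` is `true`, which is the gap between the current behavior and the expected behavior described in this issue.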

When pre-signed URLs expire early because of IAM temporary tokens, downstream services produce odd errors, such as the one described in issue gitlab-pages#686 (comment 807801966).

Steps to reproduce

Example Project

What is the current bug behavior?

Pre-signed URLs are not guaranteed to stay valid until their requested expiration time, and can expire as little as 15 seconds after being issued.

What is the expected correct behavior?

Pre-signed URLs should be guaranteed to stay valid for a defined minimum period that is considerably longer than 15 seconds.

For example, at least 15 minutes, to comfortably accommodate GitLab Pages' default 10-minute URL cache/reuse.

Relevant logs and/or screenshots

See issue gitlab-pages#686 (comment 807801966) or customer ticket https://gitlab.zendesk.com/agent/tickets/256736

Output of checks

This was observed on instances running GitLab 14.4 and 14.5.

Possible fixes

One idea is to change the time threshold in the Fog library upstream: https://github.com/fog/fog-aws/blob/master/lib/fog/aws/credential_fetcher.rb#L126-L130

Another is to track the expiration time ourselves and reload the token (if possible) in everything that is actively using it.
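A minimal sketch of the second idea, assuming the token source can return a fresher token when asked. `FreshCredentials`, its `Credentials` struct, and the fetcher block are hypothetical names and do not correspond to an existing GitLab or Fog API.

```ruby
# Hypothetical wrapper that tracks token expiration itself and reloads
# the token when less than `min_remaining` seconds are left.
class FreshCredentials
  Credentials = Struct.new(:token, :expires_at)

  # `clock` is injectable so the behavior can be checked without waiting.
  def initialize(min_remaining:, clock: -> { Time.now }, &fetcher)
    @min_remaining = min_remaining
    @clock = clock
    @fetcher = fetcher
    @lock = Mutex.new
    @current = nil
  end

  # All users call this, so they share one token and one reload.
  def get
    @lock.synchronize do
      @current = @fetcher.call if stale?
      @current
    end
  end

  private

  def stale?
    @current.nil? || @clock.call > @current.expires_at - @min_remaining
  end
end

now = Time.utc(2022, 1, 1, 12, 0, 0)
fetches = 0

provider = FreshCredentials.new(min_remaining: 15 * 60, clock: -> { now }) do
  fetches += 1
  # Assumed: every fetch yields a token valid for 1 hour.
  FreshCredentials::Credentials.new("token-#{fetches}", now + (60 * 60))
end

first = provider.get
now += 50 * 60 # 10 minutes of token lifetime left, below the 15 minute minimum
second = provider.get
```

In this sketch the second call reloads the token because only 10 minutes remain. Whether the real token source can hand out a newer token that early is exactly the "if possible" part and would need to be verified.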