Use AWS SDK via Go Cloud for S3 cache uploads for shell executor
What does this MR do?
This merge request adds support for using the AWS SDK, via the Go Cloud API, in the cache archiver and downloader. It applies to the shell executor only.
This feature is disabled by default, behind the FF_USE_GO_CLOUD_S3_CACHE_UPLOADS feature flag.
Why was this MR needed?
The current mechanism, uploading through pre-signed URLs, is limited to 5 GB per upload and is slower than the parallel, multipart upload mechanism implemented in the AWS SDK.
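To illustrate why multipart uploads lend themselves to parallelism, here is a minimal sketch of how an archive could be divided into independently uploadable byte ranges. It is not the AWS SDK's code: the `part` type, the `splitParts` helper, the 6 GiB archive size, and the 64 MiB part size are all assumptions made for illustration.

```go
package main

import "fmt"

// part describes one byte range of the archive. Each range could be uploaded
// independently of the others, and therefore in parallel.
// This type is hypothetical and not part of any SDK.
type part struct {
	Number int   // 1-based part number
	Offset int64 // starting byte offset within the archive
	Size   int64 // number of bytes in this part
}

// splitParts divides an archive of totalSize bytes into parts of at most
// partSize bytes. The last part carries the remainder.
func splitParts(totalSize, partSize int64) []part {
	if totalSize <= 0 || partSize <= 0 {
		return nil
	}
	var parts []part
	n := 1
	for offset := int64(0); offset < totalSize; offset += partSize {
		size := partSize
		if remaining := totalSize - offset; remaining < size {
			size = remaining
		}
		parts = append(parts, part{Number: n, Offset: offset, Size: size})
		n++
	}
	return parts
}

func main() {
	const gib = int64(1) << 30
	const mib = int64(1) << 20

	// Hypothetical archive larger than the 5 GB single-upload limit.
	parts := splitParts(6*gib, 64*mib)
	last := parts[len(parts)-1]
	fmt.Printf("%d parts; last part #%d starts at byte %d and holds %d bytes\n",
		len(parts), last.Number, last.Offset, last.Size)
}
```

Because no part depends on another, a pool of workers can send several ranges at once, which is roughly what a multipart uploader does on the caller's behalf.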
When IAM instance profile credentials are used, the temporary credentials are retrieved by the AWS SDK inside the cache-archiver.
When static credentials are used, these credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) are passed along to Go Cloud via environment variables. To avoid leaking these variables to the user, the shell executor injects them into the environment only for the cache archiver/retrieval stages.
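The stage-scoped injection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the runner's actual implementation: the stage names in `cacheStages`, the `envForStage` and `commandForStage` helpers, and the placeholder credentials are all hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// cacheStages lists the stages that need S3 credentials.
// The stage names are hypothetical placeholders.
var cacheStages = map[string]bool{
	"restore_cache": true,
	"archive_cache": true,
}

// envForStage returns the environment for a stage's process. Static
// credentials are appended only for cache stages, so processes started for
// other stages (such as user scripts) never receive them.
func envForStage(stage string, base []string, accessKey, secretKey string) []string {
	env := append([]string{}, base...) // copy so the caller's slice is untouched
	if cacheStages[stage] && accessKey != "" && secretKey != "" {
		env = append(env,
			"AWS_ACCESS_KEY_ID="+accessKey,
			"AWS_SECRET_ACCESS_KEY="+secretKey,
		)
	}
	return env
}

// commandForStage builds (but does not run) a shell command whose environment
// is scoped to the given stage.
func commandForStage(stage, script string, base []string, accessKey, secretKey string) *exec.Cmd {
	cmd := exec.Command("sh", "-c", script)
	cmd.Env = envForStage(stage, base, accessKey, secretKey)
	return cmd
}

func main() {
	base := []string{"PATH=" + os.Getenv("PATH")}
	for _, stage := range []string{"build_script", "archive_cache"} {
		cmd := commandForStage(stage, "true", base, "example-key", "example-secret")
		fmt.Printf("%s: %d environment variables\n", stage, len(cmd.Env))
	}
}
```

Setting `cmd.Env` explicitly, rather than exporting the variables in the parent process, keeps the credentials out of every process that is not started for a cache stage.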
This merge request was created from the work in !2639 (closed).
What's the best way to test this MR?
- Enable FF_USE_GO_CLOUD_S3_CACHE_UPLOADS: "true" in a CI job.
- Configure the runner to use S3. Unlike Minio, BucketLocation is required.
- Kick off a CI job that uses this:
image: alpine:latest

variables:
  FF_USE_GO_CLOUD_S3_CACHE_UPLOADS: "true"

test:
  script:
    - date
    - mkdir -p gitlab-chart/gitlab
    - echo "hello2" > gitlab-chart/gitlab/test.txt
  cache:
    paths:
      - gitlab-chart
What are the relevant issue numbers?
Relates to #26921