Use AWS SDK via Go Cloud for S3 cache uploads for shell executor

Stan Hu requested to merge sh-gocloud-shell-executor into main

What does this MR do?

This merge request adds support for using the AWS SDK, via the Go Cloud API, in the cache archiver and downloader. The change applies only to the shell executor.

This feature is disabled by default under the FF_USE_GO_CLOUD_S3_CACHE_UPLOADS feature flag.

Why was this MR needed?

The current mechanism, which uploads through pre-signed URLs, is limited to 5 GB per upload and is slower than the parallel, multipart upload mechanism implemented in the AWS SDK.
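As a rough illustration of why multipart uploads lift the size limit, the sketch below splits an archive into byte ranges that could each be uploaded as a separate part, possibly in parallel. It is not code from this MR or from the AWS SDK: `planParts` and the `part` type are hypothetical names, and the 64 MiB part size is an arbitrary choice. The only outside facts assumed are the S3 multipart limits (at most 10,000 parts, and at least 5 MiB for every part except the last).

```go
package main

import "fmt"

const (
	minPartSize = 5 * 1024 * 1024 // S3 minimum for every part except the last
	maxParts    = 10000           // S3 maximum number of parts per upload
)

// part describes one byte range of the archive (hypothetical type).
type part struct {
	Number int   // 1-based part number
	Offset int64 // starting byte offset within the archive
	Size   int64 // number of bytes in this part
}

// planParts splits size bytes into consecutive parts of at most partSize bytes.
func planParts(size, partSize int64) ([]part, error) {
	if size <= 0 {
		return nil, fmt.Errorf("size must be positive, got %d", size)
	}
	if partSize < minPartSize {
		return nil, fmt.Errorf("part size %d is below the %d byte minimum", partSize, minPartSize)
	}
	count := (size + partSize - 1) / partSize // ceiling division
	if count > maxParts {
		return nil, fmt.Errorf("%d parts exceeds the limit of %d", count, maxParts)
	}
	parts := make([]part, 0, int(count))
	for offset := int64(0); offset < size; offset += partSize {
		n := partSize
		if size-offset < n {
			n = size - offset // the last part may be shorter
		}
		parts = append(parts, part{Number: len(parts) + 1, Offset: offset, Size: n})
	}
	return parts, nil
}

func main() {
	// A 6 GiB archive is too large for a single pre-signed URL upload,
	// but it can be split into 64 MiB parts.
	parts, err := planParts(6<<30, 64<<20)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d parts, last part is %d bytes\n", len(parts), parts[len(parts)-1].Size)
}
```

Because each part is an independent byte range, an uploader can send several at once and retry a failed part without restarting the whole archive.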

When IAM instance profile credentials are used, the temporary credentials are retrieved by the AWS SDK inside the cache-archiver.

When static credentials are used, these credentials (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY) are passed to Go Cloud via environment variables. To avoid leaking them to the user's job, the shell executor injects these variables into the environment only for the cache archiver and cache retrieval stages.
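The scoping described above can be sketched as follows. This is a minimal illustration, not the runner's actual implementation: the stage names and the `envForStage` helper are hypothetical, and only the two variable names come from this description.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stage labels used only for this illustration.
const (
	stageRestoreCache = "restore_cache"
	stageUserScript   = "user_script"
	stageArchiveCache = "archive_cache"
)

// isCacheStage reports whether a stage runs the cache archiver or downloader.
func isCacheStage(stage string) bool {
	return stage == stageRestoreCache || stage == stageArchiveCache
}

// envForStage returns a copy of base. The static credentials are appended
// only for cache stages, so user scripts never see them.
func envForStage(stage string, base []string, creds map[string]string) []string {
	env := append([]string(nil), base...)
	if !isCacheStage(stage) {
		return env
	}
	for _, key := range []string{"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"} {
		if value, ok := creds[key]; ok {
			env = append(env, key+"="+value)
		}
	}
	return env
}

// hasAWSCredentials reports whether env contains any AWS_ variable.
func hasAWSCredentials(env []string) bool {
	for _, entry := range env {
		if strings.HasPrefix(entry, "AWS_") {
			return true
		}
	}
	return false
}

func main() {
	base := []string{"CI=true"}
	creds := map[string]string{
		"AWS_ACCESS_KEY_ID":     "example-key-id", // placeholder values
		"AWS_SECRET_ACCESS_KEY": "example-secret",
	}
	for _, stage := range []string{stageRestoreCache, stageUserScript, stageArchiveCache} {
		env := envForStage(stage, base, creds)
		fmt.Printf("%s: credentials present = %t\n", stage, hasAWSCredentials(env))
	}
}
```

Building a fresh environment slice per stage, rather than mutating a shared one, keeps the credentials from lingering into later stages.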

This merge request was created from the work in !2639 (closed).

What's the best way to test this MR?

  1. Enable FF_USE_GO_CLOUD_S3_CACHE_UPLOADS: "true" in a CI job.
  2. Configure the runner to use S3 for its cache. Unlike with MinIO, BucketLocation is required.
  3. Kick off a CI job that uses this configuration:
image: alpine:latest
variables:
  FF_USE_GO_CLOUD_S3_CACHE_UPLOADS: "true"

test:
  script:
    - date
    - mkdir -p gitlab-chart/gitlab
    - echo "hello2" > gitlab-chart/gitlab/test.txt
  cache:
    paths:
      - gitlab-chart

What are the relevant issue numbers?

Relates to #26921

Edited by Stan Hu