Use GoCloud URLs for Azure downloads

What does this MR do?

Previously, the runner manager generated pre-signed URLs for Azure downloads, while GoCloud handled uploads. This works well for Azure Managed Identities on a virtual machine, but in Kubernetes it means that users have to ensure that the service account used by the runner manager is configured to support Azure Workload Identities. We can simplify this by making cache-extractor use GoCloud URLs in the same way cache-archiver does.

To support this, this merge request does a number of things:

  • Drops the use of legacy pre-signed URLs for Azure downloads. All Azure downloads now go through GoCloud, though the Azure SAS token is still passed along if the account name and key are supplied.

  • Drops the unused pre-signed URLs for Azure uploads. Azure uploads have always been handled via GoCloud because the simple HTTP pre-signed URL approach is limited to 5 MB uploads.

  • Adds support to cache-extractor for using a GoCloud URL for Azure.

  • Updates the GetGoCloudURL interface to take a parameter indicating whether the request is an upload or a download, and to return an error along with the environment variables needed (e.g. AZURE_SAS_TOKEN). The upload/download parameter is needed for S3 since the UploadRoleARN config setting is only used for uploads.
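
The interface change in the last bullet can be pictured with a minimal Go sketch. Everything here is an assumption made for illustration: the `GoCloudURL` struct, the `Adapter` interface, the `azureAdapter` type, and the exact method signature are hypothetical and are not taken from the runner's source. The sketch only shows the shape described above: the caller says whether it is uploading, and gets back a URL, the environment variables to set, and an error.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// GoCloudURL bundles a Go Cloud URL with the environment variables the
// helper process needs in order to open it. Hypothetical type.
type GoCloudURL struct {
	URL         *url.URL
	Environment map[string]string
}

// Adapter is a hypothetical, trimmed-down cache adapter interface.
type Adapter interface {
	// GetGoCloudURL returns the URL for objectName. upload is true for
	// uploads (cache-archiver) and false for downloads (cache-extractor).
	GetGoCloudURL(objectName string, upload bool) (GoCloudURL, error)
}

// azureAdapter is an illustrative stand-in, not the runner's real adapter.
type azureAdapter struct {
	container string
	sasToken  string // empty when no account key is configured
}

// GetGoCloudURL builds an azblob:// URL. This Azure sketch ignores the
// upload flag; an S3 adapter could use it to decide whether an upload
// role applies.
func (a azureAdapter) GetGoCloudURL(objectName string, upload bool) (GoCloudURL, error) {
	if a.container == "" {
		return GoCloudURL{}, errors.New("container name is required")
	}
	u := &url.URL{Scheme: "azblob", Host: a.container, Path: "/" + objectName}

	// Only expose the SAS token when one is available.
	env := map[string]string{}
	if a.sasToken != "" {
		env["AZURE_SAS_TOKEN"] = a.sasToken
	}
	return GoCloudURL{URL: u, Environment: env}, nil
}

func main() {
	var a Adapter = azureAdapter{container: "test1", sasToken: "example-token"}
	res, err := a.GetGoCloudURL("project/3/default-protected", false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(res.URL.String())
}
```

Returning the environment variables alongside the URL keeps credentials out of the URL itself, which is one plausible reason for the design described in the bullet.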

What's the best way to test this MR?

  1. Check out this branch and run make runner-and-helper-bin-host.

  2. Set up an Azure cache with an AccountName and AccountKey:

  [runners.cache]
    Type = "azure"
    MaxUploadedArchiveSize = 0
  [runners.cache.azure]
    AccountName = "YOUR-ACCOUNT-NAME"
    AccountKey = "SOME-ACCOUNT-KEY"
    ContainerName = "test1"
    StorageDomain = "blob.core.windows.net"
  3. Define a CI job that uses the cache:
cache-test:
  script:
    - echo "hello world" > test.txt
  cache:
    paths:
      - test.txt
  4. Run this job and observe that the Downloading cache message uses azblob://, such as:
Downloading cache from azblob://test1/runner/BNu9my98D/project/3/default-protected  ETag="0x8DD13C45C58037A"
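
To make the log line above easier to read, here is a small Go sketch that splits such a URL into its container and object key using only the standard library. The helper name `splitBlobURL` is hypothetical, and treating the host as the container and the path as the object key is an assumption based on the URL shape shown in the log, not a description of how the runner parses it.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitBlobURL splits an azblob:// URL into container and object key.
// Hypothetical helper for illustration only.
func splitBlobURL(raw string) (container, key string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	if u.Scheme != "azblob" {
		return "", "", fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	// Host carries the container name; the path (minus the leading
	// slash) carries the object key.
	return u.Host, strings.TrimPrefix(u.Path, "/"), nil
}

func main() {
	container, key, err := splitBlobURL("azblob://test1/runner/BNu9my98D/project/3/default-protected")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(container)
	fmt.Println(key)
}
```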

Repeat the tests with other configurations. For example:

  1. Set up an Azure cache with a managed identity:
  [runners.cache]
    Type = "azure"
    MaxUploadedArchiveSize = 0
  [runners.cache.azure]
    AccountName = "YOUR-ACCOUNT-NAME"
    ContainerName = "test1"
    StorageDomain = "blob.core.windows.net"
  2. Set up an S3 cache with UploadRoleARN:
  [runners.cache]
    Type = "s3"
    MaxUploadedArchiveSize = 0
  [runners.cache.s3]
    UploadRoleARN = "arn:aws:iam::<some role>s3-upload-test-role"
    BucketName = "YOUR-BUCKET"
    BucketLocation = "YOUR-LOCATION"
    ServerAddress = "s3.amazonaws.com"

What are the relevant issue numbers?

Relates to #38330 (closed)

Edited by Stan Hu
