Support Google CDN for Web UI artifacts downloads

Stan Hu requested to merge sh-enable-google-cdn-web-ui-downloads into master

What does this MR do and why?

This is a follow-up merge request to !98010 (merged), which enabled job artifacts to be served by Google Cloud CDN for specific API endpoints. This MR extends that functionality to artifact downloads from the Web UI via the Download artifact button.

This functionality is behind the use_cdn_with_job_artifacts_ui_downloads feature flag. It should make artifact downloads faster for users and help reduce network egress bandwidth costs.

Since UI downloads can have response-content-disposition and response-content-type query parameters, we also need to modify the Cloud CDN implementation to allow passing of query parameters to Google Cloud Storage. Note that Cloud CDN requires that the Expires, KeyName, and Signature query parameters appear at the end of the URL, or the request will fail with a 403 error.
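
The ordering requirement above can be sketched as follows. This is a minimal illustration, not the code in this MR: `sign_cdn_url` and its arguments are hypothetical names, and it assumes the documented Cloud CDN signed URL scheme (HMAC-SHA1 over the URL up to and including KeyName, using the base64url-decoded signing key). Caller-supplied parameters such as response-content-type are placed first so that Expires, KeyName, and Signature stay at the end.

```ruby
require 'openssl'
require 'base64'
require 'uri'

# Hypothetical helper (not GitLab's implementation): builds a Cloud CDN
# signed URL while keeping Expires, KeyName, and Signature last.
def sign_cdn_url(base_url, key_name:, key:, expires_at:, params: {})
  uri = URI.parse(base_url)

  # Existing and caller-supplied query parameters go first ...
  query = URI.decode_www_form(uri.query.to_s) + params.to_a
  # ... and the signing parameters are appended after them, in this order.
  query << ['Expires', expires_at.to_i.to_s] << ['KeyName', key_name]
  uri.query = URI.encode_www_form(query)

  # The signature covers the full URL up to and including KeyName.
  unsigned = uri.to_s
  digest = OpenSSL::HMAC.digest('SHA1', Base64.urlsafe_decode64(key), unsigned)
  "#{unsigned}&Signature=#{Base64.urlsafe_encode64(digest)}"
end

# Usage with placeholder values (the key below is a dummy, not a real key).
dummy_key = Base64.urlsafe_encode64('0123456789abcdef')
puts sign_cdn_url(
  'https://stanhu-cdn.example.org/artifacts/test.txt',
  key_name: 'stanhu-key',
  key: dummy_key,
  expires_at: Time.now + 600,
  params: { 'response-content-type' => 'text/plain' }
)
```

Appending the signing parameters after everything else means any extra parameter added later cannot accidentally land behind Signature and trigger the 403.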

Relates to #378473 (closed)

How to set up and validate locally

As described in https://gitlab.com/gitlab-org/container-registry/-/issues/535#note_792288038:

Setting up Google CDN

  1. Created a GCS test bucket.
  2. Followed https://cloud.google.com/cdn/docs/setting-up-cdn-with-bucket to create an HTTPS load balancer with a static IP. I let Google create the HTTPS certs and assigned the domain stanhu-cdn.example.org.
  3. Registered the load balancer IP with that domain.
  4. Continued with https://cloud.google.com/cdn/docs/using-signed-urls to register a signing key and grant permissions on the bucket.

Testing this merge request

  1. Create a Google Cloud VM and install the latest GitLab nightly build.
  2. Per https://docs.gitlab.com/ee/administration/object_storage.html#google-example-with-adc-consolidated-form, I had to stop the VM and grant it the "Allow full access to all Cloud APIs" access scope.
  3. Tweaked the default service account permissions by limiting access with Service Account Token Creator and giving it access to read/write storage buckets.
  4. Enabled IAM Service Account Credentials API in https://console.cloud.google.com/apis/library/iamcredentials.googleapis.com. (This wasn't documented; I ran into error messages before I enabled it).
  5. Download the latest nightly build and apply this patch. In my Omnibus config, I have:
external_url 'https://gitlab.example.com'
gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
    'provider' => 'Google',
    'google_project' => 'stan-redacted',
    'google_application_default' => true
}
gitlab_rails['object_store']['proxy_download'] = false

bucket = 'stanhu-test'
gitlab_rails['object_store']['objects']['artifacts']['bucket'] = "#{bucket}/artifacts"

gitlab_rails['object_store']['objects']['artifacts']['cdn'] = {
  'provider' => 'Google',
  'url' => 'https://stanhu-cdn.example.org',
  'key_name' => 'stanhu-key',
  'key' => '<REDACTED KEY>'
}

gitlab_rails['object_store']['objects']['external_diffs']['bucket'] = "#{bucket}/external_diffs"
gitlab_rails['object_store']['objects']['lfs']['bucket'] = "#{bucket}/lfs"
gitlab_rails['object_store']['objects']['uploads']['bucket'] = "#{bucket}/uploads"
gitlab_rails['object_store']['objects']['packages']['bucket'] = "#{bucket}/packages"
gitlab_rails['object_store']['objects']['dependency_proxy']['bucket'] = "#{bucket}/dependency_proxy"
gitlab_rails['object_store']['objects']['terraform_state']['bucket'] = "#{bucket}/terraform_state"
gitlab_rails['object_store']['objects']['ci_secure_files']['bucket'] = "#{bucket}/ci_secure_files"
  6. Enable the feature flag: Feature.enable(:use_cdn_with_job_artifacts_ui_downloads).

  7. Run a CI job that generates an artifact. Simple example:

image: ruby:latest

stages:
  - test

test:
  stage: test
  script:
    - echo "hello" > test.txt
  artifacts:
    paths:
      - test.txt
  8. Open the browser's developer tools (Inspect), go to the job page, and click Download on the artifact.

  9. If the CDN is used, you should see the 302 redirect go to the CDN domain instead of storage.googleapis.com.

  10. Check that /var/log/gitlab/gitlab-rails/production_json.log has meta.artifact_used_cdn:
  "meta.caller_id": "Projects::ArtifactsController#download",
  "meta.remote_ip": "192.184.151.106",
  "meta.feature_category": "build_artifacts",
  "meta.user": "root",
  "meta.project": "root/simple-ci",
  "meta.root_namespace": "root",
  "meta.client_id": "user/1",
  "meta.artifact_used_cdn": true,
  11. If you use a client (e.g. curl) within Google Cloud (or localhost), you'll notice meta.artifact_used_cdn is false because a CDN is not needed.
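
The log check in the last steps can be scripted. The sketch below is illustrative only: `artifact_cdn_usage` and `DOWNLOAD_CALLER` are hypothetical names, and it assumes each log line is a single JSON object with the flat meta.* keys shown in the excerpt above.

```ruby
require 'json'

# Caller ID taken from the log excerpt above.
DOWNLOAD_CALLER = 'Projects::ArtifactsController#download'

# Hypothetical helper: collects the meta.artifact_used_cdn value of every
# artifact download entry found in an enumerable of JSON log lines.
def artifact_cdn_usage(lines)
  results = []
  lines.each do |line|
    begin
      entry = JSON.parse(line)
    rescue JSON::ParserError
      next # skip lines that are not valid JSON
    end
    next unless entry.is_a?(Hash) && entry['meta.caller_id'] == DOWNLOAD_CALLER

    # May be true, false, or nil if the key is absent.
    results << entry['meta.artifact_used_cdn']
  end
  results
end

# Example with two in-memory lines instead of the real log file.
sample = [
  '{"meta.caller_id":"Projects::ArtifactsController#download","meta.artifact_used_cdn":true}',
  '{"meta.caller_id":"Projects::JobsController#show"}'
]
p artifact_cdn_usage(sample)
```

To run it against a real instance, pass `File.foreach('/var/log/gitlab/gitlab-rails/production_json.log')` as `lines`. A download made from inside Google Cloud or from localhost should show up as false.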

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu
