Avoid copy operation during Terraform modules registry uploads

🔥 Problem

Similar to Avoid copying objects from one bucket to anothe... (#285597 - closed), the package registry can receive uploads of large files. Most package registry uploads (most formats) use a Workhorse direct upload. In this mode, the file is put on object storage in a temporary location, and when the backend confirms the upload, the file is moved to its final location (using a copy operation).

The problem is that a GitLab instance can be connected to different object storage providers, and the duration of that copy operation varies with the provider and with the file size.

We need to avoid this copy operation altogether. To do that, when the package file is uploaded to object storage, we should instruct Workhorse to treat the file's upload location as the final one. That way the file never needs to be moved: its initial location is its final location.
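The difference between the two flows can be illustrated with a toy in-memory store that logs each operation, much like the MinIO trace used in the validation steps. This is only a sketch: `TracingStore`, both upload helpers, and the key names are hypothetical and are not part of GitLab or Workhorse.

```ruby
# Toy in-memory object store that records every operation it performs.
# All class, method, and key names here are hypothetical.
class TracingStore
  attr_reader :objects, :trace

  def initialize
    @objects = {}
    @trace = []
  end

  def put_object(key, body)
    @trace << :PutObject
    @objects[key] = body
  end

  def copy_object(from, to)
    @trace << :CopyObject
    @objects[to] = @objects.fetch(from)
  end

  def delete_object(key)
    @trace << :DeleteObject
    @objects.delete(key)
  end
end

# Existing flow: upload to a temporary key, then move to the final key.
def upload_with_copy(store, final_key, body)
  tmp_key = "tmp/uploads/#{final_key}"
  store.put_object(tmp_key, body)
  store.copy_object(tmp_key, final_key)
  store.delete_object(tmp_key)
end

# Proposed flow: the upload location is already the final location.
def upload_to_final_path(store, final_key, body)
  store.put_object(final_key, body)
end

legacy = TracingStore.new
upload_with_copy(legacy, "packages/module.tgz", "data")

direct = TracingStore.new
upload_to_final_path(direct, "packages/module.tgz", "data")

p legacy.trace # [:PutObject, :CopyObject, :DeleteObject]
p direct.trace # [:PutObject]
```

Both stores end up holding the same single object; only the number of storage operations differs.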

Implementation

In this MR, we implement this for the Terraform Registry. We previously implemented it for the Generic Package Repository in Avoid copy operation in object store during Gen... (!147454 - merged).

What does this MR do?

  • Passing two new keyword arguments to the ::Packages::PackageFileUploader.workhorse_authorize method:
    • use_final_store_path: true
    • final_store_path_root_id: <project_or_group_id>
  • Gating the changes behind a feature flag for gradual rollout.
  • Covering the changes with specs
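A minimal sketch of how the two keyword arguments might be gated behind the flag follows. It is not the MR's actual diff: `FakeUploader`, `final_store_path_kwargs`, `authorize_upload`, and the `flag_enabled` parameter are hypothetical stand-ins so the example runs outside GitLab, and the way the flag is checked is an assumption.

```ruby
# Hypothetical stand-in for ::Packages::PackageFileUploader so the sketch
# runs outside GitLab; it only echoes the keyword arguments it receives.
module FakeUploader
  def self.workhorse_authorize(**kwargs)
    kwargs
  end
end

# Returns the extra keyword arguments only when the feature flag is on.
# `flag_enabled` stands in for a flag check on
# :skip_copy_operation_in_terraform_module_upload (assumed, not shown in the MR text).
def final_store_path_kwargs(flag_enabled:, root_id:)
  return {} unless flag_enabled

  { use_final_store_path: true, final_store_path_root_id: root_id }
end

# Forwards the gated keyword arguments to the uploader.
def authorize_upload(uploader, flag_enabled:, root_id:)
  extra = final_store_path_kwargs(flag_enabled: flag_enabled, root_id: root_id)
  uploader.workhorse_authorize(**extra)
end

flag_off = authorize_upload(FakeUploader, flag_enabled: false, root_id: 42)
flag_on  = authorize_upload(FakeUploader, flag_enabled: true, root_id: 42)

p flag_off
p flag_on
```

Keeping the gated arguments in one small helper means that, with the flag disabled, the call is identical to the existing one, which is what makes a gradual rollout safe.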

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

  1. Make sure Object Storage is enabled in your GDK.

  2. Open the trace tool in the MinIO web interface and click the Start button to trace the operations performed on the Object Storage.

  3. From a terminal, publish this dummy module module.tgz to the Terraform Registry:

    curl --location --header "PRIVATE-TOKEN: <PAT>" --upload-file module.tgz "http://gdk.test:3000/api/v4/projects/<project_id>/packages/terraform/modules/aws/test/0.0.1/file"
  4. In the MinIO trace window, you should see something similar to this screenshot:

    Screenshot 2023-10-10 at 21.34.58.png

    📓 Notice the 3 operations: PutObject, CopyObject & DeleteObject.

  5. In the Rails console, enable the skip_copy_operation_in_terraform_module_upload feature flag:

    ::Feature.enable(:skip_copy_operation_in_terraform_module_upload)
  6. Upload another dummy Terraform module while watching the trace window. You should see something similar to this screenshot:

    Screenshot 2023-10-10 at 21.35.57.png

    Now only one operation is done: PutObject 🚀

  7. You can test downloading the published package. Things should work normally.
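For repeated runs of steps 3 and 6, the curl upload can also be scripted. The sketch below builds the equivalent PUT request with Ruby's standard `net/http` (curl's `--upload-file` sends a PUT), without sending it. The helper name, its parameter names, the project ID `1`, and the version `0.0.2` are illustrative assumptions; the path layout and header are taken from the curl command above.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) the upload request from step 3.
# Helper and parameter names are hypothetical.
def build_module_upload(base_url:, project_id:, module_name:, module_system:, version:, token:, body:)
  path = "/api/v4/projects/#{project_id}/packages/terraform/modules/" \
         "#{module_name}/#{module_system}/#{version}/file"
  uri = URI.join(base_url, path)

  request = Net::HTTP::Put.new(uri)
  request["PRIVATE-TOKEN"] = token
  request.body = body
  [uri, request]
end

uri, request = build_module_upload(
  base_url: "http://gdk.test:3000",
  project_id: 1,              # placeholder for <project_id>
  module_name: "aws",
  module_system: "test",
  version: "0.0.2",           # illustrative version for a second upload
  token: "<PAT>",             # placeholder, as in the curl command
  body: "placeholder bytes"   # in practice: File.binread("module.tgz")
)

puts "#{request.method} #{uri}"
# To actually send it against a running GDK:
# Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```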

Related to #439811 (closed)

Edited by Moaz Khalifa
