Workhorse should generate artifacts metadata directly on Object Storage

Description

Right now Workhorse can upload artifacts directly into object storage (gitlab-ce~3612448) on the user's behalf. However, metadata.gz is still generated on the filesystem and shared with Unicorn through it.

If we don't change this, Workhorse and Unicorn must keep sharing a filesystem.

We should extend the artifacts authorization API so that it also returns a presigned URL for uploading metadata directly into object storage (gitlab-ce~3612448).
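To make the idea concrete, here is a minimal sketch of how Workhorse could decode an extended authorization response. The field names (`TempPath`, `RemoteObject`, `RemoteMetadata`, `StoreURL`) and the JSON shape are assumptions made for illustration only; this issue does not define the actual payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RemoteObject is a hypothetical container for presigned upload details.
type RemoteObject struct {
	// StoreURL is assumed to be a presigned PUT URL.
	StoreURL string `json:"StoreURL"`
}

// AuthorizeResponse is a hypothetical shape for the artifacts
// authorization response. RemoteMetadata is the proposed addition.
type AuthorizeResponse struct {
	TempPath       string       `json:"TempPath"`
	RemoteObject   RemoteObject `json:"RemoteObject"`
	RemoteMetadata RemoteObject `json:"RemoteMetadata"`
}

// parseAuthorize decodes the JSON body returned by the authorization call.
func parseAuthorize(body []byte) (*AuthorizeResponse, error) {
	var resp AuthorizeResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode authorize response: %w", err)
	}
	return &resp, nil
}

func main() {
	// Example payload with placeholder URLs.
	body := []byte(`{
		"RemoteObject":   {"StoreURL": "https://storage.example.com/artifact?sig=a"},
		"RemoteMetadata": {"StoreURL": "https://storage.example.com/metadata?sig=b"}
	}`)
	resp, err := parseAuthorize(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("artifact upload URL:", resp.RemoteObject.StoreURL)
	fmt.Println("metadata upload URL:", resp.RemoteMetadata.StoreURL)
}
```

With a second URL in the response, Workhorse would not need to write metadata.gz to a path that Unicorn can read.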

Proposal

Workhorse already has all the required parts in place. We must provide the presigned URL API from Rails and use it on the Workhorse side.

There are some complications, though. In Workhorse we avoid keeping a local copy of the artifact and stream it directly to object storage. However, only GCS allows streaming uploads; for S3-compatible implementations we had to rely on Multipart Uploads.

Multipart is complex and introduces shared state between Rails, which handles the session, and Workhorse, which uploads every single part using presigned URLs.

We use gitlab-zip-metadata to build metadata.gz, and the file length is not known up front. We should consider generating this file on the local (not shared) filesystem and then uploading it with a single presigned URL.

Links / references

  • https://gitlab.com/gitlab-org/gitlab-ee/issues/4183
  • gitlab-ee#4184
  • relevant workhorse v8.3.0 codebase

/cc @ayufan @andrewn

Edited Aug 29, 2025 by 🤖 GitLab Bot 🤖