Remove unnecessary hash calculations from artifact uploads

While investigating artifact upload performance bottlenecks, we found that up to 30-35% of CPU time during artifact uploads is spent calculating hashes. Workhorse currently calculates four different hashes for each artifact, but Rails only uses SHA256, for duplicate detection.

The other 3 hashes are not exposed publicly and their purpose is unclear. Removing unnecessary hash calculations could provide a modest performance improvement.

Next Steps:

  1. Identify which hashes are actually needed and used
  2. Remove unnecessary hash calculations from the artifact upload process
  3. Consider making hash calculations concurrent if multiple hashes are required

Related: Follows investigation in #527217 (closed)

Proposal

The hash functions that Workhorse uses for a given upload are specified in workhorse_authorize via the UploadHashFunctions option. If none are specified (the default), all four hashes are generated. Currently the only exception is FIPS mode, which does not use MD5 hashes.

For the first iteration we can specify just sha256 for job artifact uploads (behind a feature flag). In later iterations we can investigate the same optimisation for other types of upload, and eventually have uploader classes specify only the hash functions required for that file type.

Edited Jan 15, 2026 by Tiger Watson