Maven virtual registry: Investigate the approach of storing files and digests as separate cache entries

Summary

Currently, the Maven virtual registry stores each file's digests (MD5, SHA1) as columns on that file's cache entry record. An alternative approach would be to store the file and each of its digests as separate cache entries.

Background

As discussed in !207411 (comment 2847382178), this approach could:

  • Simplify the implementation by removing the need for background jobs
  • Remove the need for digest columns in the cache entry model
  • Keep cache entries organized consistently (each entry represents one file)

Proposed Changes

Instead of creating 1 cache entry per file with digest columns, create 3 cache entries:

  1. The main file (e.g., package-1.0.jar)
  2. The MD5 digest file (e.g., package-1.0.jar.md5)
  3. The SHA1 digest file (e.g., package-1.0.jar.sha1)
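
As a rough illustration of the split, the sketch below derives the three entries for one file. The `CacheEntry` type, the `entries_for` helper, and the field names are hypothetical and not taken from the GitLab codebase. The issue does not say whether digest files would be computed locally or fetched from upstream; the sketch computes them locally only to stay self-contained.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    """Hypothetical cache entry: one record per file, no digest columns."""
    relative_path: str
    content: bytes


def entries_for(relative_path: str, content: bytes) -> list[CacheEntry]:
    """Return the main file entry plus one entry per digest file."""
    md5_hex = hashlib.md5(content).hexdigest()
    sha1_hex = hashlib.sha1(content).hexdigest()
    return [
        CacheEntry(relative_path, content),
        CacheEntry(f"{relative_path}.md5", md5_hex.encode("ascii")),
        CacheEntry(f"{relative_path}.sha1", sha1_hex.encode("ascii")),
    ]


# Each of the three entries has its own path and its own (small) content.
for entry in entries_for("package-1.0.jar", b"example bytes"):
    print(entry.relative_path, len(entry.content))
```

Every entry has the same shape, so the digest files need no dedicated columns.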

Benefits

  • Simplified logic: No special handling needed for digest-only requests
  • Consistent data model: All cache entries follow the same pattern
  • No background jobs needed: Digest requests can be handled directly
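
To make the "no special handling" point concrete, here is a minimal sketch contrasting the two lookups. Both dictionaries and both `serve_*` functions are hypothetical stand-ins for the real models and request handling, and the digest values are placeholders.

```python
from typing import Optional

# Current model (hypothetical shape): digests live in columns on the file's entry.
entries_with_columns = {
    "package-1.0.jar": {
        "content": b"jar bytes",
        "md5": "md5-hex-placeholder",
        "sha1": "sha1-hex-placeholder",
    },
}


def serve_with_columns(path: str) -> Optional[bytes]:
    """Digest requests need special handling: strip the suffix, read a column."""
    for suffix in ("md5", "sha1"):
        if path.endswith("." + suffix):
            entry = entries_with_columns.get(path[: -(len(suffix) + 1)])
            return entry[suffix].encode("ascii") if entry else None
    entry = entries_with_columns.get(path)
    return entry["content"] if entry else None


# Proposed model: every requested path, digest or not, is its own entry.
separate_entries = {
    "package-1.0.jar": b"jar bytes",
    "package-1.0.jar.md5": b"md5-hex-placeholder",
    "package-1.0.jar.sha1": b"sha1-hex-placeholder",
}


def serve_separate(path: str) -> Optional[bytes]:
    """One code path for all requests: look the entry up by its path."""
    return separate_entries.get(path)
```

Both functions return the same bytes for the same request; the second simply has no digest-specific branch.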

Considerations

  • Higher object storage usage: 3 cache entries (and stored objects) per file instead of 1, although each digest file is only a short hex string
  • Migration complexity: Need to handle existing cache entries
  • Potential race conditions: Multiple requests creating digest entries simultaneously
  • Performance impact: Need to evaluate the trade-offs
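
For the race condition above, one common mitigation (an assumption here, not something the issue prescribes) is a uniqueness constraint on the entry's path, so that a second concurrent writer becomes a no-op instead of creating a duplicate. The sketch uses an in-memory SQLite database as a stand-in for the real database; the table, columns, and `create_if_absent` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache_entries ("
    " relative_path TEXT NOT NULL UNIQUE,"
    " content BLOB NOT NULL)"
)


def create_if_absent(conn: sqlite3.Connection, relative_path: str, content: bytes) -> bool:
    """Insert the entry unless one exists; return True if this call created it."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO cache_entries (relative_path, content)"
            " VALUES (?, ?)",
            (relative_path, content),
        )
    return cursor.rowcount == 1


# Two requests for the same digest file: only the first one creates the entry.
first = create_if_absent(conn, "package-1.0.jar.sha1", b"sha1-hex-placeholder")
second = create_if_absent(conn, "package-1.0.jar.sha1", b"sha1-hex-placeholder")
print(first, second)
```

The losing request can then read the existing entry rather than failing, which keeps the handling identical for the main file and its digest files.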
