Investigate: How to add support for multiple upstreams
Investigation: Multiple Upstream Support for Maven Virtual Registry
Background
The current Maven virtual registry MVC (minimal viable change) supports only a single upstream repository. This investigation explores how to support multiple upstreams, covering performance, user experience, and possible feature refinements.
Investigation Areas
Performance Considerations
- Evaluate different strategies for checking artifact existence across multiple upstreams:
  - Sequential vs parallel requests
  - Potential for request timeouts and failure handling
  - Caching strategies for upstream availability and artifact existence
- Impact on memory usage and request latency
- Connection pooling strategies for multiple upstreams
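To make the sequential vs parallel trade-off concrete, here is a minimal sketch of both strategies. It is an illustration under assumptions, not the registry's implementation: `find_sequential`, `find_parallel`, and the `check` callable (standing in for something like an HTTP HEAD request against an upstream) are hypothetical names, and the upstream list is assumed to be already sorted by priority.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Sequence

# Hypothetical signature: (upstream, artifact_path) -> True if the artifact exists.
Checker = Callable[[str, str], bool]


def find_sequential(upstreams: Sequence[str], path: str, check: Checker) -> Optional[str]:
    """Try upstreams in priority order and stop at the first hit."""
    for upstream in upstreams:
        try:
            if check(upstream, path):
                return upstream
        except Exception:
            # A failing upstream counts as a miss so later ones are still tried.
            continue
    return None


def find_parallel(upstreams: Sequence[str], path: str, check: Checker,
                  timeout_s: float = 2.0) -> Optional[str]:
    """Query every upstream at once, then return the highest-priority hit.

    All checks share one overall deadline of `timeout_s` seconds.
    """
    if not upstreams:
        return None
    pool = ThreadPoolExecutor(max_workers=len(upstreams))
    try:
        futures = [pool.submit(check, upstream, path) for upstream in upstreams]
        deadline = time.monotonic() + timeout_s
        # Walk results in priority order so a faster, lower-priority
        # upstream never wins over a higher-priority one that also has the file.
        for upstream, future in zip(upstreams, futures):
            remaining = max(0.0, deadline - time.monotonic())
            try:
                if future.result(timeout=remaining):
                    return upstream
            except Exception:
                # Covers both a timed-out wait and an error raised by the check.
                continue
        return None
    finally:
        # Do not block on slow upstreams; already-running checks finish in the background.
        pool.shutdown(wait=False, cancel_futures=True)
```

The sequential version issues at most one request at a time but its worst-case latency grows with the number of upstreams. The parallel version bounds latency by the shared deadline at the cost of one thread and one outbound request per upstream, which is the memory and resource trade-off this section asks to evaluate.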
Implementation Approaches for Multi-upstream Resolution
User Interface and Configuration
- Design API endpoints for managing multiple upstreams
- Consider the following configuration options:
  - Priority/order of upstreams
  - Individual timeout settings
  - Health check configurations
  - Authentication settings per upstream
- Evaluate UX patterns for managing multiple upstreams in the UI (STRETCH)
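The configuration options listed above could be modelled roughly as follows. This is a sketch only: `UpstreamConfig`, its field names, the default values, and the "lower priority value is consulted first" convention are all assumptions made for illustration, not an agreed API shape.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class UpstreamConfig:
    """Hypothetical per-upstream settings; every field name is an assumption."""
    name: str
    url: str
    priority: int                       # assumed: lower value is consulted first
    timeout_s: float = 5.0              # individual timeout setting
    health_check_interval_s: int = 60   # how often availability is re-checked
    username: Optional[str] = None
    password: Optional[str] = field(default=None, repr=False)  # kept out of repr/logs

    def __post_init__(self) -> None:
        if not self.url.startswith(("http://", "https://")):
            raise ValueError(f"unsupported upstream URL: {self.url}")
        if self.timeout_s <= 0:
            raise ValueError("timeout_s must be positive")
        if (self.username is None) != (self.password is None):
            raise ValueError("username and password must be set together")


def resolution_order(upstreams: List[UpstreamConfig]) -> List[UpstreamConfig]:
    """Return upstreams in the order they would be consulted."""
    priorities = [u.priority for u in upstreams]
    if len(set(priorities)) != len(priorities):
        raise ValueError("upstream priorities must be unique")
    return sorted(upstreams, key=lambda u: u.priority)
```

Requiring unique priorities keeps the resolution order deterministic. Whether ties should instead be allowed and broken by some other rule is one of the design choices this investigation would need to settle.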
Shared Upstreams Implementation
Following the pattern established by JFrog Artifactory and Sonatype Nexus, implement a shared upstream repositories feature, in which a single upstream definition can be referenced by more than one virtual registry.
Implementation Considerations
- Design database schema for shared upstream repositories
- Determine permission model for shared upstream access
- Define the scope of sharing (group-level vs instance-level)
- Plan migration path for existing single-upstream configurations
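One possible shape for the schema is a join table between registries and upstreams, so that an upstream row can be referenced by several registries and each registry keeps its own ordering. The sketch below uses SQLite through Python's `sqlite3` module only so that it runs standalone. The table names, column names, and the group-level scoping via `group_id` are all assumptions, and the real schema would be a PostgreSQL migration.

```python
import sqlite3
from typing import List

# Hypothetical schema: all table and column names are illustrative.
SCHEMA = """
CREATE TABLE upstreams (
    id       INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL,          -- assumed group-level scope of sharing
    url      TEXT    NOT NULL,
    UNIQUE (group_id, url)
);
CREATE TABLE registries (
    id       INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL,
    name     TEXT    NOT NULL
);
CREATE TABLE registry_upstreams (
    registry_id INTEGER NOT NULL REFERENCES registries(id),
    upstream_id INTEGER NOT NULL REFERENCES upstreams(id),
    position    INTEGER NOT NULL,       -- per-registry priority, 1 is first
    PRIMARY KEY (registry_id, upstream_id),
    UNIQUE (registry_id, position)
);
"""


def ordered_upstream_urls(conn: sqlite3.Connection, registry_id: int) -> List[str]:
    """Return the upstream URLs of one registry in resolution order."""
    rows = conn.execute(
        "SELECT u.url FROM registry_upstreams ru "
        "JOIN upstreams u ON u.id = ru.upstream_id "
        "WHERE ru.registry_id = ? ORDER BY ru.position",
        (registry_id,),
    ).fetchall()
    return [row[0] for row in rows]


def build_demo() -> sqlite3.Connection:
    """Create an in-memory database where two registries share one upstream."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO upstreams (id, group_id, url) VALUES (?, ?, ?)",
        [(1, 10, "https://repo1.maven.org/maven2"),
         (2, 10, "https://mirror.example.com/maven2")],
    )
    conn.executemany(
        "INSERT INTO registries (id, group_id, name) VALUES (?, ?, ?)",
        [(1, 10, "team-a"), (2, 10, "team-b")],
    )
    conn.executemany(
        "INSERT INTO registry_upstreams (registry_id, upstream_id, position) VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 2), (2, 2, 1)],  # upstream 2 is shared by both registries
    )
    return conn
```

Under this layout, an existing single-upstream configuration could plausibly be migrated by inserting one `registry_upstreams` row at position 1 for each registry, though the actual migration path still needs to be designed.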
Key Implementation Areas
- Database schema changes to support shared repository references
- API endpoints for managing shared upstreams
- UI components for shared upstream management (STRETCH)
- Permission system integration
- Caching strategy for shared upstream metadata
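For the caching item, one option is to key cached lookups by upstream rather than by registry, so that every registry referencing a shared upstream reuses the same entries. The following is a minimal in-memory sketch under that assumption. `ExistenceCache` is a hypothetical name, and a real deployment would more likely sit on a shared store than on a per-process dictionary.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExistenceCache:
    """TTL cache of artifact-existence results, keyed by (upstream_id, path)."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Tuple[int, str], Tuple[bool, float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, upstream_id: int, path: str) -> Optional[bool]:
        """Return the cached answer, or None if it is unknown or expired."""
        key = (upstream_id, path)
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            self.hits += 1
            return entry[0]
        # Drop the expired entry (if any) and report a miss.
        self._entries.pop(key, None)
        self.misses += 1
        return None

    def put(self, upstream_id: int, path: str, exists: bool) -> None:
        """Store a positive or negative result until the TTL elapses."""
        self._entries[(upstream_id, path)] = (exists, self._clock() + self._ttl_s)
```

Because the key contains no registry identifier, a result stored while serving one registry is a hit for any other registry that references the same upstream. Negative results are cached too, which saves repeated misses but delays the visibility of newly published artifacts by up to the TTL.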
Technical Considerations
Performance Metrics to Consider
- Response time for artifact resolution
- Memory usage during parallel requests
- Cache hit rates
- Error rates and timeout frequency
- Impact on GitLab instance resources
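As a rough illustration of how the request-level metrics above could be derived from raw samples, the sketch below records one sample per artifact resolution and computes rates and a nearest-rank latency percentile. `ResolutionMetrics` and its fields are hypothetical, and in practice these figures would come from the existing monitoring stack rather than an in-process object.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResolutionMetrics:
    """Hypothetical in-process collector for artifact-resolution samples."""
    latencies_ms: List[float] = field(default_factory=list)
    cache_hits: int = 0
    errors: int = 0
    timeouts: int = 0

    def record(self, latency_ms: float, cache_hit: bool = False,
               error: bool = False, timeout: bool = False) -> None:
        """Record the outcome of one resolution request."""
        self.latencies_ms.append(latency_ms)
        self.cache_hits += int(cache_hit)
        self.errors += int(error)
        self.timeouts += int(timeout)

    @property
    def total(self) -> int:
        return len(self.latencies_ms)

    def rate(self, count: int) -> float:
        """Fraction of all recorded requests; 0.0 when nothing was recorded."""
        return count / self.total if self.total else 0.0

    def percentile(self, pct: float) -> float:
        """Nearest-rank percentile of the recorded latencies, pct in (0, 100]."""
        if not self.latencies_ms:
            raise ValueError("no samples recorded")
        ordered = sorted(self.latencies_ms)
        rank = max(1, math.ceil(pct * len(ordered) / 100))
        return ordered[rank - 1]
```

For example, `metrics.rate(metrics.cache_hits)` gives the cache hit rate and `metrics.percentile(95)` gives the p95 response time, which makes it possible to compare sequential and parallel resolution on the same workload.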
Technical Questions to Resolve
- What is the optimal balance between parallel requests and system resources?
- How should we handle timeouts and failures in a multi-upstream environment?
- Should shared upstreams be scoped to group level or instance level?
- How can we ensure efficient caching with shared upstreams?
- What's the most efficient way to handle permissions for shared upstreams?
- How should we handle updates to shared upstream configurations?
- What monitoring and observability features are needed?
Edited by Tim Rizzi