Investigate limits for the Package Registry
Topic to Evaluate
The GitLab Package Registry allows you to publish and install packages in a variety of formats to your GitLab instance. This issue proposes investigating the possibility and scope of adding limits to the GitLab Package Registry.
This investigation is part of the broader epic &1422 (closed), which aims to add limits for the Package stage.
Questions
(This includes all package manager formats)
What limits are currently in place for the GitLab Package Registry?
File Size
Package Type | Max Size |
---|---|
NPM | None |
Maven | None |
Conan | 50 MB |
NuGet | 50 MB |
PyPI | 50 MB |
For Conan, NuGet, and PyPI, these limits are controlled here and can be modified per package type in the authorize_workhorse! call (hardcoded).
Maven could easily be modified by adjusting it to use authorize_workhorse!, or by passing the limit in as a param in its file upload endpoint (hardcoded).
NPM is the only package manager that does not use Workhorse uploads (it uses CarrierWave and accepts the file directly), so there is currently no limit.
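As a rough illustration of the table above, the current limits could be expressed as a single lookup. This is a minimal sketch, not the existing implementation: the constant and method names are hypothetical, `nil` is used here to mean "no limit enforced", and 50 MB is assumed to mean 50 * 1024 * 1024 bytes.

```ruby
# Hypothetical constants mirroring the table above; nil means "no limit".
MEGABYTE = 1024 * 1024

MAX_FILE_SIZE_BYTES = {
  npm: nil,
  maven: nil,
  conan: 50 * MEGABYTE,
  nuget: 50 * MEGABYTE,
  pypi: 50 * MEGABYTE
}.freeze

# Returns true when a file of the given size may be uploaded
# for the given package type. Raises KeyError for unknown types.
def file_size_allowed?(package_type, size_in_bytes)
  limit = MAX_FILE_SIZE_BYTES.fetch(package_type)
  limit.nil? || size_in_bytes <= limit
end
```

With a table like this, adding the proposed 50 MB limits for Maven and NPM would be a matter of replacing the two `nil` entries.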
Current File Name and Type exclusion
Package Type | File exclusions |
---|---|
NPM | None. I believe we can restrict to only .tgz files. File names are validated with the standard package name regex. |
Maven | None. We should be able to restrict to .xml, .pom, and .jar files. File names are validated with a Maven-specific regex. |
Conan | Yes. Only the specific file names and types defined in ConanFileMetadatum are allowed. This has proven to be an issue because some files that should be allowed were not known during implementation, which prevents some use cases. There is a list of files in the Conan source code here, but it is unclear which are allowed upload files and which are other files, so some investigation will be needed before updating. |
NuGet | Yes. Only .nupkg files are allowed. File names are validated with the standard package name regex. |
PyPI | None. We should be able to restrict to .tar.gz and .whl files. File names are validated with the standard package name regex. |
Current registry storage limits
There is currently no limit (that I can tell) to the amount of storage that can be used for files in the package registry at the project, group, or instance levels.
Are there sensible, non-controversial limits that can be easily added?
Are there specific file types that we should exclude?
Yes, for each package manager, we should only allow the specific filetypes that package manager uses.
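A per-format allowlist along these lines might look like the following. This is only a sketch: the names are hypothetical, the extension lists are the ones suggested in the table above, Conan is left out because its allowed files are defined by name in ConanFileMetadatum, and case-insensitive matching is an assumption.

```ruby
# Hypothetical allowlist taken from the suggestions in this issue.
ALLOWED_EXTENSIONS = {
  npm: %w[.tgz],
  maven: %w[.xml .pom .jar],
  nuget: %w[.nupkg],
  pypi: %w[.tar.gz .whl]
}.freeze

# Returns true when the file name ends with one of the extensions
# allowed for the package type. Raises KeyError for unknown types.
def file_type_allowed?(package_type, file_name)
  normalized = file_name.downcase
  ALLOWED_EXTENSIONS.fetch(package_type).any? do |extension|
    normalized.end_with?(extension)
  end
end
```

The check uses `end_with?` rather than `File.extname` because `File.extname` would report `.gz` for a `.tar.gz` file.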
Can we consider adding storage limits in the future? Can they be enforced?
We can add limits likely at the project, group, or instance level. We can limit the total size, or total number of packages.
Total number
This would be the easiest route. We could add settings at any of the above levels that admin users could configure. Enforcing the limit would be as easy as checking how many packages exist within a given project, group, or instance when someone tries to push a new package, and rejecting the push if the limit has been met. Each level would be its own issue, each likely with a weight of 3 and needing both frontend and backend work.
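The count check described above could be sketched roughly as follows. Everything here is hypothetical: `PackageScope` stands in for a project, group, or instance, and `max_packages` stands in for the admin-configured setting, with `nil` meaning no limit is configured.

```ruby
# Raised when a push would go past the configured number of packages.
class PackageLimitReachedError < StandardError; end

# Hypothetical stand-in for a project, group, or instance.
PackageScope = Struct.new(:name, :package_count, :max_packages)

# Raises PackageLimitReachedError if the scope already holds as many
# packages as its limit allows; does nothing when no limit is set.
def ensure_package_count_within_limit!(scope)
  limit = scope.max_packages
  return if limit.nil?

  if scope.package_count >= limit
    raise PackageLimitReachedError,
          "#{scope.name} has reached its limit of #{limit} packages"
  end
end
```

A push endpoint would call this before accepting the upload and translate the error into a rejection response.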
Total size
As with the total number, we could add settings at the above levels for admin users to configure. Enforcing this is a bit more difficult. I'm not sure how we could quickly calculate the amount of space taken up by packages. Perhaps it could be saved in the database (in a project or group supplemental statistics table) and updated any time someone pushes or deletes a package. This is doable, but likely a larger issue. Each level would be its own issue; they would all be somewhat large (weight >5) and would need to be broken down into a few steps to be implemented.
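To make the "stored and updated on every push or delete" idea concrete, here is a minimal in-memory sketch. The class and method names are hypothetical, and a real version would persist the value in a statistics table and update it atomically. This sketch also rejects an upload that would take the total over the limit, which is one possible reading of "reject when the limit is met".

```ruby
# Raised when an upload would push stored bytes over the limit.
class StorageLimitExceededError < StandardError; end

# Hypothetical in-memory stand-in for a row in a statistics table.
class PackageStorageStatistics
  attr_reader :packages_size

  def initialize(limit_in_bytes: nil)
    @limit_in_bytes = limit_in_bytes # nil means no limit configured
    @packages_size = 0
  end

  # Called when a package file is pushed.
  def record_upload!(size_in_bytes)
    new_total = @packages_size + size_in_bytes
    if @limit_in_bytes && new_total > @limit_in_bytes
      raise StorageLimitExceededError,
            "upload would exceed the limit of #{@limit_in_bytes} bytes"
    end
    @packages_size = new_total
  end

  # Called when a package file is deleted; never drops below zero.
  def record_deletion!(size_in_bytes)
    @packages_size = [@packages_size - size_in_bytes, 0].max
  end
end
```

Keeping a running total avoids summing every package file on each push, at the cost of having to hook every upload and delete path for each package format.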
How could we implement the above?
See the proposed issues and the storage limits comments.
Proposed Issues
I've added a rough weight estimate to each:
- Weight 1: Restrict NPM packages to the .tgz file type.
- Weight 1: Restrict Maven packages to .xml, .pom, and .jar files.
- Weight 1: Restrict PyPI packages to .tar.gz and .whl files.
- Weight 2: Consider tighter naming restrictions for Maven, NPM, NuGet, and PyPI based on any restrictions within their own package manager rules.
- Weight 1: Limit Maven package files to 50 MB.
- Weight 2: Limit NPM package files to 50 MB.
Tasks
- Create issue for implementation or update existing implementation issue description with implementation proposal
- Set weight on implementation issue
- If weight is greater than 5, break issue into smaller issues
Risks and Implementation Considerations
Setting a limit, specifically on size, is a large undertaking. With the proposed idea of storing a constantly updated value in the database, the work might look like this (rough weights included):
For Project level:
- Weight 1: Database migration to add storage columns to a project_statistics table.
- Weight 2-3: Background migration to determine storage for existing projects and update the table.
- Weight 1-2 (x5): Update package manager code to update the stats every time a package is uploaded or deleted.
- Weight 1: Database migration to add limit settings to the project table.
- Weight 1: Frontend migration for limit settings.
- Weight 1: Frontend migration for displaying current storage usage or amount remaining.
- Weight 2: Add logic to reject package uploads when a package is uploaded after the limit is met.
The breakdown would be similar for group or instance level.
Resulting Issues:
Size limits
- Maven file size limit
- Conan file size limit
- NPM file size limit
- PyPI file size limit
- NuGet file size limit
Note: I've set these up so that Maven is implemented first and adds the database migration, so it has a weight of 2 and the others each have a weight of 1. Any of them can be worked on first instead, with the weight and database MR swapped with Maven.
File type restriction
File name restriction
Storage limits
Note: the storage limits issue is large and should be broken down into pieces that fit within a given milestone.