Add a setting for allowing/disallowing duplicate NuGet package upload
What does this MR do and why?
Context
When using the GitLab Package Registry to publish NuGet packages, a package whose name and version match an already published package can be uploaded again. This may be useful for snapshots, but you may want your releases to be immutable.
This MR introduces a new setting that enables the user to define, at the group level, whether duplicate NuGet packages are allowed or not.
How NuGet upload works:
A NuGet package is a compressed file with the extension .nupkg or .snupkg (for symbols). Its metadata is stored in a .nuspec file embedded in the compressed package. To retrieve the package name and version, the .nupkg file needs to be unzipped and the relevant data extracted from the .nuspec file.
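To make the extraction step concrete, here is a minimal Python sketch (the MR itself is Ruby, so this is an illustration, not the implementation). It builds a tiny in-memory archive shaped like a .nupkg and reads the id and version elements from its top-level .nuspec file. The helper names `build_sample_nupkg` and `read_nuspec_metadata` and the sample .nuspec content are assumptions made for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Sample metadata document, made up for this illustration.
SAMPLE_NUSPEC = """<?xml version="1.0"?>
<package xmlns="http://schemas.microsoft.com/packaging/2010/07/nuspec.xsd">
  <metadata>
    <id>{package_id}</id>
    <version>{version}</version>
  </metadata>
</package>
"""


def build_sample_nupkg(package_id: str, version: str) -> bytes:
    """Build a tiny in-memory archive shaped like a .nupkg (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(
            f"{package_id}.nuspec",
            SAMPLE_NUSPEC.format(package_id=package_id, version=version),
        )
        archive.writestr("lib/placeholder.txt", "not a real assembly")
    return buffer.getvalue()


def _local_name(tag: str) -> str:
    """Drop an XML namespace prefix such as '{uri}metadata'."""
    return tag.rsplit("}", 1)[-1]


def read_nuspec_metadata(nupkg_bytes: bytes) -> tuple:
    """Return (name, version) read from the top-level .nuspec file."""
    with zipfile.ZipFile(io.BytesIO(nupkg_bytes)) as archive:
        # Only consider entries at the top level of the archive.
        nuspec_name = next(
            name for name in archive.namelist()
            if name.endswith(".nuspec") and "/" not in name
        )
        root = ET.fromstring(archive.read(nuspec_name))
    metadata = next(child for child in root if _local_name(child.tag) == "metadata")
    fields = {_local_name(el.tag): (el.text or "").strip() for el in metadata}
    return fields["id"], fields["version"]


print(read_nuspec_metadata(build_sample_nupkg("Package", "1.0.0")))
# prints ('Package', '1.0.0')
```

This version opens the whole archive at once, which is the simple case; it does not attempt the chunked streaming used for large packages.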
This unzipping is handled by a background worker to keep publishing fast. As a result, users receive an acknowledgment that the package was created even though it is still being processed. Any errors that occur during background processing are visible on the package registry UI page, allowing users to identify and rectify them.
Now, we aim to introduce a feature that allows users to prevent the publishing of duplicate packages. This feature should operate synchronously, meaning the client (NuGet, dotnet, Visual Studio) should receive a 409 status code (Conflict) if an attempt is made to publish a duplicate version. To achieve this, we need to handle the file unzipping synchronously, rather than using the background worker, as we cannot determine the package name and version until they are extracted from the .nuspec file. These extracted values are then used to check for duplicate packages. The package is considered a duplicate if its name & version match the name & version of a published package in the same project.
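The duplicate rule in the paragraph above can be sketched in a few lines of Python. This is a toy in-memory model, not GitLab code: the class name `InMemoryRegistry`, its methods, and the 201 success status are assumptions for illustration; only the 409 status and the "same name and version in the same project" rule come from the text.

```python
HTTP_CREATED = 201   # assumption: success status used by this sketch
HTTP_CONFLICT = 409  # status the text says clients should receive for duplicates


class InMemoryRegistry:
    """Toy registry that tracks published (project, name, version) triples."""

    def __init__(self, duplicates_allowed: bool = True):
        self.duplicates_allowed = duplicates_allowed
        self._published = set()

    def is_duplicate(self, project_id: int, name: str, version: str) -> bool:
        # A package is a duplicate only within the same project.
        return (project_id, name, version) in self._published

    def publish(self, project_id: int, name: str, version: str) -> int:
        if not self.duplicates_allowed and self.is_duplicate(project_id, name, version):
            return HTTP_CONFLICT
        self._published.add((project_id, name, version))
        return HTTP_CREATED


registry = InMemoryRegistry(duplicates_allowed=False)
print(registry.publish(1, "Package", "1.0.0"))  # first upload
print(registry.publish(1, "Package", "1.0.0"))  # same project, same name and version
print(registry.publish(2, "Package", "1.0.0"))  # different project
# prints 201, then 409, then 201
```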
To summarize:
1. We need to extract the .nuspec file from the package file synchronously in order to get the package name & version. To do that efficiently, especially for large packages, we handle the zip archive in streaming mode: instead of downloading the whole .nupkg file from the object store, we issue a streaming request and fetch the file in chunks. Each fetched chunk is unzipped, and once the needed .nuspec file is found, we extract it and stop streaming. The .nuspec file is located at the top level of the archive, so it should be fetched within the first two chunks (tested with different-sized packages). If we reach five downloaded chunks without finding the .nuspec file, we stop streaming and respond with an error: nuspec file not found.
2. Step 1. is executed only when the user disallows duplicate package uploads. If the setting is true (allowing duplicates), the entire publishing process is performed in the background worker, as before.
3. If duplicate package upload is disallowed and the .nuspec file has been extracted as described in step 1., we do not extract it again during the subsequent publishing steps.
4. This new setting does not affect symbol packages; they are handled as before. Symbols are attached to existing matching .nupkg packages. If no matching package exists, the symbols are not published.
5. The --skip-duplicate option should work out of the box, as we now respond with a 409 status code (Conflict) in the case of duplication. The client (NuGet CLI, dotnet CLI) can then proceed with the next package in the push, if any, ignoring those that failed to be published due to duplication.
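The chunked lookup described in the summary can be approximated with the Python sketch below. It is a simplified stand-in, not the service added by this MR: it buffers the chunks received so far and rescans them rather than decompressing incrementally, and it assumes every local file header carries the compressed size (no data descriptors). The names `extract_nuspec`, `chunked`, `build_sample_nupkg`, `NuspecNotFound` and the 32-byte chunk size are all hypothetical.

```python
import io
import os
import struct
import zipfile
import zlib
from itertools import islice

LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30-byte zip local file header
LOCAL_SIGNATURE = b"PK\x03\x04"
MAX_CHUNKS = 5  # limit taken from the description above


class NuspecNotFound(Exception):
    pass


def _find_nuspec(buffer):
    """Return the .nuspec content, or None if more data is needed."""
    offset = 0
    while offset + LOCAL_HEADER.size <= len(buffer):
        (signature, _, _, method, _, _, _, compressed_size, _,
         name_len, extra_len) = LOCAL_HEADER.unpack_from(buffer, offset)
        if signature != LOCAL_SIGNATURE:
            # No more file entries (for example, the central directory started).
            raise NuspecNotFound("nuspec file not found")
        name_start = offset + LOCAL_HEADER.size
        data_start = name_start + name_len + extra_len
        data_end = data_start + compressed_size
        if data_end > len(buffer):
            return None  # this entry is not fully downloaded yet
        name = bytes(buffer[name_start:name_start + name_len]).decode("utf-8")
        if name.endswith(".nuspec") and "/" not in name:
            data = bytes(buffer[data_start:data_end])
            if method == 8:   # deflate
                return zlib.decompress(data, -15)
            if method == 0:   # stored
                return data
            raise ValueError(f"unsupported compression method {method}")
        offset = data_end  # skip to the next entry
    return None


def extract_nuspec(chunks, max_chunks=MAX_CHUNKS) -> bytes:
    """Consume at most max_chunks chunks and stop as soon as the .nuspec is found."""
    buffer = bytearray()
    for chunk in islice(chunks, max_chunks):
        buffer += chunk
        nuspec = _find_nuspec(buffer)
        if nuspec is not None:
            return nuspec
    raise NuspecNotFound("nuspec file not found")


def chunked(data: bytes, size: int):
    """Yield data in fixed-size pieces, standing in for a streaming download."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def build_sample_nupkg(with_nuspec: bool = True) -> bytes:
    """Build a small in-memory archive for the demo (not a real package)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        if with_nuspec:
            archive.writestr("Package.nuspec", "<package/>")
        archive.writestr("lib/data.bin", os.urandom(4096))
    return buffer.getvalue()


print(extract_nuspec(chunked(build_sample_nupkg(), 32)))
```

Because the generator is consumed lazily, the chunks after the one that completes the .nuspec entry are never requested, which mirrors the "stop streaming" behavior in step 1.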
Implementation Details
- Add two new columns, nuget_duplicates_allowed & nuget_duplicate_exception_regex, to the namespace_package_settings table. The default is the current behavior, which allows duplicates.
- Make them updatable through GraphQL. They are not added to the UI yet; this should be done in a separate MR for the next milestone.
- Introduce a new service, Packages::Nuget::FindOrCreatePackageService, which checks for duplication (if needed) and then calls ExtractionWorker to create the package and the package file.
- Introduce a new service, Packages::Nuget::ExtractRemoteMetadataFileService, which is responsible for the zip streaming request of the package file.
- Ensure we don't unzip the package file twice if we already checked for duplication.
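As a rough picture of how these pieces fit together, the Python sketch below models the decision flow: skip the synchronous work when duplicates are allowed, otherwise extract the metadata first, reject duplicates, and hand the already extracted metadata to the worker so the file is not unzipped twice. It is not the Ruby service from this MR; the function name, its callable parameters, and the 201 success status are assumptions for illustration.

```python
HTTP_CREATED = 201   # assumption: success status used by this sketch
HTTP_CONFLICT = 409  # status returned for duplicates, per the description


def find_or_create_package(package_file, *, duplicates_allowed,
                           extract_metadata, is_duplicate, enqueue_extraction):
    """Return an HTTP status for a push, under the assumptions stated above."""
    if duplicates_allowed:
        # Unchanged path: the background worker unzips the file itself.
        enqueue_extraction(package_file, metadata=None)
        return HTTP_CREATED

    # Duplicates disallowed: read name and version synchronously.
    metadata = extract_metadata(package_file)
    if is_duplicate(metadata):
        return HTTP_CONFLICT

    # Pass the metadata along so the worker does not unzip the file again.
    enqueue_extraction(package_file, metadata=metadata)
    return HTTP_CREATED


# Tiny demo with stand-in collaborators.
published = {("Package", "1.0.0")}
queued = []
status = find_or_create_package(
    {"name": "Package", "version": "1.0.0"},
    duplicates_allowed=False,
    extract_metadata=lambda f: (f["name"], f["version"]),
    is_duplicate=lambda metadata: metadata in published,
    enqueue_extraction=lambda f, metadata: queued.append(metadata),
)
print(status, queued)
# prints 409 []
```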
How to set up and validate locally
- Ensure you have the NuGet CLI installed (see the NuGet docs for links to installation pages).
- Ensure the object store is enabled in your GDK.
- In a new directory, run nuget spec. A file named Package.nuspec should be created.
- Run nuget pack. A file named Package.1.0.0.nupkg should be created.
- Add a GitLab project as your NuGet source:
  nuget source Add -Name localhost -Source "http://gdk.test:3000/api/v4/projects/<project_id>/packages/nuget/index.json" -UserName <gitlab_username> -Password <personal_access_token>
- Push the package to your project:
  nuget push Package.1.0.0.nupkg -Source localhost
- After the package is successfully published, clear the local NuGet cache:
  nuget locals all -clear
- Update the namespace package setting nuget_duplicates_allowed using the query below in graphql-explorer:
mutation {
  updateNamespacePackageSettings(input: {
    namespacePath: "<your-namespace-full-path>",
    nugetDuplicatesAllowed: false
  }) {
    packageSettings {
      nugetDuplicatesAllowed
    }
  }
}
- Try to publish the same package again. You should see a 409 response from the server:
Pushing Package.1.0.0.nupkg to 'http://gdk.test:3000/api/v4/projects/<project_id>/packages/nuget'...
PUT http://gdk.test:3000/api/v4/projects/<project_id>/packages/nuget/
Conflict http://gdk.test:3000/api/v4/projects/<project_id>/packages/nuget/ 6367ms
To skip already published packages, use the option -SkipDuplicate
Response status code does not indicate success: 409 (Conflict).
- Update nuget_duplicates_allowed to true and try to publish the same package again. It should be published successfully.
Test the exception regex:
- Update the package settings as below. With the regex ".*-be.*", duplicates are allowed only for packages whose name or version matches the regex.
mutation {
  updateNamespacePackageSettings(input: {
    namespacePath: "<your-namespace-full-path>",
    nugetDuplicatesAllowed: false,
    nugetDuplicateExceptionRegex: ".*-be.*"
  }) {
    packageSettings {
      nugetDuplicatesAllowed
      nugetDuplicateExceptionRegex
    }
  }
}
- Edit the version field in the Package.nuspec file created earlier by nuget spec and set it to 2.0.0-beta, for example. Then run nuget pack and publish the generated .nupkg file.
- Publish the same package again. It should be published successfully because version 2.0.0-beta matches the regex .*-be.*.
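The exception rule exercised by these steps can be sketched in Python as follows. This is an approximation, not the GitLab implementation: the function name `duplicate_upload_allowed` is hypothetical, Python's `re` module stands in for whatever regex engine the server uses, and matching the whole name or version with `fullmatch` is an assumption (the example pattern .*-be.* behaves the same either way).

```python
import re


def duplicate_upload_allowed(name: str, version: str, *,
                             duplicates_allowed: bool,
                             exception_regex: str = "") -> bool:
    """Decide whether re-uploading an existing name and version is accepted."""
    if duplicates_allowed:
        return True
    if not exception_regex:
        return False
    pattern = re.compile(exception_regex)
    # The exception applies when either the name or the version matches.
    return any(pattern.fullmatch(value) for value in (name, version))


print(duplicate_upload_allowed("Package", "2.0.0-beta",
                               duplicates_allowed=False, exception_regex=".*-be.*"))
print(duplicate_upload_allowed("Package", "1.0.0",
                               duplicates_allowed=False, exception_regex=".*-be.*"))
# prints True, then False
```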
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #293748 (closed)