Add support for NPM package metadata
🍉 Context
The NPM package registry implements a set of API endpoints expected by `$ npm` (or `$ yarn`). Among those endpoints is what we call the metadata endpoint.
Clients query the Package Registry to answer this question: given a package name, which versions are available? The package registry answers with a JSON structure (similar to the `package.json`, but covering all versions) containing the required information.
Within each version structure, clients expect a few fields. Here are some examples:
- `dist`: describes where the `*.tgz` archive file for this version is located.
- `dependencies` (and similar fields): describes this version's dependencies on other packages.
In #330929 (closed), it was noted that the metadata endpoint didn't return all the necessary fields. Among those missing is the important `bin` field, which is used to insert executables in the current `$PATH`.
❗ Solution
Fortunately, `$ npm` is kind enough to send the metadata structure along with the package file when a new version of a package is uploaded.
From there, we can save that metadata in the database alongside the given version.
Then, on the metadata endpoint, we can simply load the metadata object and read the fields.
Because the NPM package registry is one of the most used registries on gitlab.com, we can't simply copy the whole metadata structure into the metadata endpoint response. The metadata endpoint returns all the data for all the versions of a given package. There are no pagination options, which means that if a package has 1K versions, the endpoint needs to return all the data about those 1K versions in a single response.
In addition, while the metadata of an NPM package has some clearly defined fields, users can add an arbitrary number of custom fields to it. An example is the `ng-update` field that Angular packages have in their `package.json`.
In the spirit of iteration, the metadata endpoint will for now return the abbreviated form, which means that only a strict set of fields is read from each version and returned.
We already opened a follow-up issue to support the full metadata form.
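The abbreviated form described above can be sketched as a simple allow-list filter. This is a minimal illustration, not the MR's actual code: the constant `ABBREVIATED_FIELDS`, the method `abbreviated_version`, the exact field list, and the `bin` entry in the sample data are all assumptions made for the example.

```ruby
require 'json'

# Hypothetical allow-list, loosely modelled on npm's abbreviated metadata.
# The exact set of fields used by the MR may differ.
ABBREVIATED_FIELDS = %w[
  name version dist dependencies devDependencies
  optionalDependencies peerDependencies bundleDependencies
  bin directories engines
].freeze

# Keep only the allow-listed top-level keys of one version's metadata.
def abbreviated_version(package_json)
  package_json.slice(*ABBREVIATED_FIELDS)
end

# Sample metadata with one custom field (ng-update) that must not be returned.
full = JSON.parse(<<~JSON)
  {
    "name": "@many/npm_metadata",
    "version": "1.3.16",
    "bin": { "hello": "./bin/hello.js" },
    "engines": { "node": ">=14.0.0" },
    "ng-update": { "packageGroup": ["@many/npm_metadata"] }
  }
JSON

puts JSON.pretty_generate(abbreviated_version(full))
```

With a filter like this, custom fields such as `ng-update` never reach the response, so the payload per version stays bounded no matter what users put in their `package.json`.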
🤔 What does this MR do and why?
- Add a `packages_npm_metadata` table with:
  - `package_id`: the package id that this metadata belongs to.
  - `package_json`: a `jsonb` field that will store the metadata structure.
- Update `app/services/packages/npm/create_package_service.rb` so that the metadata is persisted in the new table.
- Update `app/presenters/packages/npm/package_presenter.rb` so that the metadata is loaded and read from the new table.
  - The read here is limited to the allowed fields (those from the abbreviated form).
  - This change is compatible with existing packages (which will not have anything in the metadata table): for these, this change will not impact how the metadata endpoint returns them.
- Update the related documentation.
Because the NPM package registry is one of the most used registries on gitlab.com and the metadata endpoint is a central piece of the logic behind `$ npm install`, we will use a feature flag as a safety net.
Rollout issue: #344827 (closed)
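The presenter behaviour described above (feature flag as a gate, and unchanged output for packages without a metadata row) can be sketched in plain Ruby. This is an illustration under assumptions, not the actual presenter: `PackageVersion`, `PRESENTED_METADATA_FIELDS`, and `present_version` are hypothetical names, and the flag is passed in as a boolean rather than checked through the real feature flag API.

```ruby
# Hypothetical stand-in for a package record plus its optional metadata row.
PackageVersion = Struct.new(:name, :version, :package_json, keyword_init: true)

# Illustrative subset of the allowed (abbreviated) fields.
PRESENTED_METADATA_FIELDS = %w[bin directories engines].freeze

# Build the entry for one version. Extra fields are added only when the flag
# is enabled AND the version has stored metadata; otherwise the entry is the
# same as before this change.
def present_version(package, flag_enabled:)
  entry = { 'name' => package.name, 'version' => package.version }
  return entry unless flag_enabled && package.package_json

  entry.merge(package.package_json.slice(*PRESENTED_METADATA_FIELDS))
end

# A version uploaded before this change: no metadata row.
legacy = PackageVersion.new(name: '@many/npm_metadata', version: '1.0.0', package_json: nil)

# A version uploaded with metadata support, including one custom field.
recent = PackageVersion.new(
  name: '@many/npm_metadata',
  version: '1.3.16',
  package_json: { 'engines' => { 'node' => '>=14.0.0' }, 'ng-update' => {} }
)

p present_version(legacy, flag_enabled: true)
p present_version(recent, flag_enabled: false)
p present_version(recent, flag_enabled: true)
```

Only the last call includes `engines`. The first two return the same shape as before the change, which is what makes the flag a safe rollback path.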
🖼 Screenshots or screen recordings
Setup:
- I used a package that defines the metadata fields `bin` and `engines`.
- I uploaded the first versions of an NPM package using `master`. Then, I uploaded a few versions with this MR branch and the feature flag enabled.
  - The first versions simulate existing packages with no metadata.
  - The most recent versions simulate package uploads with the metadata support.
With the feature flag disabled
Let's see the output of the metadata endpoint:
- We can see that the metadata of all versions is similar (same fields).
- For the most recent versions, the `bin` and `engines` fields are not returned.
With the feature flag enabled
- For the most recent versions, the `bin` and `engines` fields are properly returned ✅
⚙ How to set up and validate locally
Requirements:
- A working local GitLab instance
- A project (any visibility)
- A personal access token with the `api` scope
- npm
- Use `$ npm init` to initialize an npm package (any name)
- Update the `package.json` to include some extra fields.
  - Example
    ```json
    {
      "name": "@many/npm_metadata",
      "version": "1.3.16",
      "description": "Package created by gl pru",
      "main": "index.js",
      "scripts": {
        "preinstall": "echo \"PREINSTALL script!\"",
        "install": "echo \"INSTALL script!\"",
        "postinstall": "echo \"POSTINSTALL script!\""
      },
      "keywords": [],
      "author": "GitLab Package Registry Utility",
      "license": "ISC",
      "publishConfig": {
        "registry": "http://gdk.test:8000/api/v4/projects/166/packages/npm/"
      },
      "engines": {
        "node": "^12.14.1 || >=14.0.0",
        "npm": "^6.11.0 || ^7.5.6 || >=8.0.0",
        "yarn": ">= 1.13.0"
      }
    }
    ```
- Set up the credentials for the npm package registry following https://docs.gitlab.com/ee/user/packages/npm_registry/#project-level-npm-endpoint
- Push a few versions with `$ npm publish`
  - To bump the version, just update the `version` field in the `package.json` file
- Enable the feature flag: `Feature.enable(:packages_npm_abbreviated_metadata)`
- Push a few more versions
Now check the metadata endpoint and how its output changes depending on the feature flag state.
- Go to `<gitlab_base_url>/api/v4/projects/<project_id>/packages/npm/<package_name_including_scope>`
- Enable/disable the feature flag and see its effect on the output of the metadata endpoint
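To compare the endpoint output between flag states without reading the raw JSON, a small script along these lines may help. It is a sketch under assumptions: the helper names (`metadata_uri`, `fetch_metadata`, `field_presence`) are made up for this example, the `PRIVATE-TOKEN` header is assumed to be accepted by this endpoint as it is by the GitLab API in general, and the package name is used verbatim without any URL encoding.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the metadata endpoint URL from the pattern given in the steps above.
def metadata_uri(base_url, project_id, package_name)
  URI("#{base_url}/api/v4/projects/#{project_id}/packages/npm/#{package_name}")
end

# Fetch and parse the endpoint response (not called in this sketch).
# Assumption: a personal access token sent as PRIVATE-TOKEN is accepted here.
def fetch_metadata(uri, token)
  request = Net::HTTP::Get.new(uri)
  request['PRIVATE-TOKEN'] = token
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  JSON.parse(response.body)
end

# For each version in the response, report whether the given fields are present.
def field_presence(metadata, fields = %w[bin engines])
  metadata.fetch('versions', {}).transform_values do |version|
    fields.to_h { |field| [field, version.key?(field)] }
  end
end

# Offline sample shaped like a metadata response, so the sketch runs without a server.
sample = {
  'name' => '@many/npm_metadata',
  'versions' => {
    '1.0.0' => { 'name' => '@many/npm_metadata', 'version' => '1.0.0' },
    '1.3.16' => { 'name' => '@many/npm_metadata', 'version' => '1.3.16',
                  'engines' => { 'node' => '>=14.0.0' } }
  }
}

field_presence(sample).each { |version, present| puts "#{version}: #{present}" }
```

Against a local instance, the same summary would come from `field_presence(fetch_metadata(metadata_uri(base_url, project_id, package_name), token))`, run once with the flag disabled and once with it enabled.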
☑ MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
💾 Database review
Enabling the feature flag will add a preload of all metadata. This will basically load a set of rows from `packages_npm_metadata` given a set of package ids.
- Metadata preload: !73639 (comment 722469784)
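The idea behind the preload can be illustrated in memory: load every needed metadata row in one pass for the whole set of package ids, then look rows up by `package_id` while building the response, instead of querying once per version. The sketch below is plain Ruby with hypothetical names (`MetadataRow`, `preload_metadata`); it does not reproduce the actual query or the ActiveRecord preload used in the MR.

```ruby
# Hypothetical stand-in for one row of packages_npm_metadata.
MetadataRow = Struct.new(:package_id, :package_json, keyword_init: true)

# One pass over the "table" for the whole set of ids, indexed by package_id.
# In the application this corresponds to a single query for all package ids
# rather than one query per package.
def preload_metadata(table, package_ids)
  table
    .select { |row| package_ids.include?(row.package_id) }
    .to_h { |row| [row.package_id, row.package_json] }
end

table = [
  MetadataRow.new(package_id: 10, package_json: { 'bin' => { 'hello' => './bin/hello.js' } }),
  MetadataRow.new(package_id: 11, package_json: { 'engines' => { 'node' => '>=14.0.0' } }),
  MetadataRow.new(package_id: 99, package_json: { 'engines' => { 'node' => '>=12.0.0' } })
]

# Package 12 has no row: it behaves like a package uploaded before this change.
preloaded = preload_metadata(table, [10, 11, 12])
p preloaded.keys
```

Packages without a row are simply absent from the lookup, which matches how existing packages without metadata are handled.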
⤴ Migration up
$ rails db:migrate
== 20211028132247 CreatePackagesNpmMetadata: migrating ========================
-- transaction_open?()
-> 0.0000s
-- create_table(:packages_npm_metadata, {:id=>false})
-> 0.0273s
== 20211028132247 CreatePackagesNpmMetadata: migrated (0.0503s) ===============
⤵ Migration down
$ rails db:rollback
== 20211028132247 CreatePackagesNpmMetadata: reverting ========================
-- transaction_open?()
-> 0.0000s
-- drop_table(:packages_npm_metadata)
-> 0.0103s
== 20211028132247 CreatePackagesNpmMetadata: reverted (0.0441s) ===============