Add Cargo sparse index endpoint
What does this MR do and why?
Adds GET /api/v4/projects/:id/packages/cargo/{prefix}/{name} — the Cargo registry sparse index. Returns newline-delimited JSON, one line per published version of a crate, most recently published first. This is what the Cargo CLI polls during cargo install and after cargo publish to resolve dependencies.
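For reviewers less familiar with the framing, here is a minimal sketch of what a newline-delimited JSON body looks like for two versions. The field names (name, vers, deps, cksum, features, yanked) come from the Cargo registry index spec; the render_index helper and the sample values are hypothetical and are not the code in this MR.

```ruby
require 'json'

# Hypothetical helper: serialize one JSON object per line, in the given order.
def render_index(entries)
  entries.map { |entry| JSON.generate(entry) }.join("\n")
end

# Sample data only; checksums are placeholders, not real SHA256 values.
entries = [
  { name: 'my-crate', vers: '2.0.0', deps: [], cksum: 'b' * 64, features: {}, yanked: false },
  { name: 'my-crate', vers: '1.0.0', deps: [], cksum: 'a' * 64, features: {}, yanked: false }
]

body = render_index(entries)
puts body
```

Each line parses on its own as a JSON object, which is why the body has to be passed through as plain text rather than re-encoded as a single JSON string.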
This is MR 2 of the Cargo MVC plan laid out in this issue comment. MR 1 (the download endpoint, !236631) merged earlier this week. The remaining work is the upload authorize and upload publish endpoints, plus GA hardening.
What's in this MR
- Four explicit routes for the four prefix shapes from the Cargo registry index spec (1/{name}, 2/{name}, 3/{first}/{name}, {first-two}/{next-two}/{name}), declared inside the existing :id/packages/cargo namespace.
- Packages::Cargo::MetadataFinder: collects installable Packages::Cargo::Metadatum rows for a project + normalized name, ordered by package_id DESC (most recently published first), capped at 500 versions. No pagination; the Cargo CLI doesn't support it.
- New :read_cargo_package granular permission (mirrors :read_ruby_gem) plus its entry in the packages_and_registry/package assignable bundle so granular tokens scoped to read_package can use the new endpoint.
- Everything stays behind the existing package_registry_cargo_support WIP feature flag (default off).
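The four prefix shapes can be sketched as a small mapping from crate name to index path. This is an illustration of the layout in the Cargo registry index spec, not the routing code in this MR; index_path is a hypothetical helper, and plain lowercasing stands in for whatever name normalization the finder actually applies.

```ruby
# Hypothetical helper: map a crate name to its sparse-index path.
# Assumption: lowercasing stands in for the MR's name normalization.
def index_path(name)
  n = name.downcase
  case n.length
  when 0 then raise ArgumentError, 'crate name must not be empty'
  when 1 then "1/#{n}"                       # 1/{name}
  when 2 then "2/#{n}"                       # 2/{name}
  when 3 then "3/#{n[0]}/#{n}"               # 3/{first}/{name}
  else "#{n[0, 2]}/#{n[2, 2]}/#{n}"          # {first-two}/{next-two}/{name}
  end
end

%w[a ab abc my-crate download].each do |crate|
  puts "#{crate} -> #{index_path(crate)}"
end
```

For my-crate this gives my/-c/my-crate, the same path used in the local validation steps.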
Design notes worth flagging
- Route-ordering subtlety. The 4+ char route :prefix_1/:prefix_2/:package_name collides path-shape-wise with the existing :package_name/:package_version/download. I pinned each prefix segment to exactly two normalized-name characters (/[a-z0-9-]{2}/) so the index route can't shadow the download route: a download URL like my-crate/1.0.0/download won't match the index route since my-crate is 8 chars, while a real index URL like do/wn/download (sparse index for a crate literally named "download") matches the index route first as intended.
- Response framing. The class sets default_format :json, which would JSON-encode any returned string. I added env['api.format'] = :binary alongside content_type 'text/plain' to pass the NDJSON body through verbatim. Same pattern the Debian distribution endpoint uses.
- Ordering choice. package_id DESC rather than semver. The Cargo CLI doesn't care about order and package_id DESC ≈ reverse publish order, which keeps the most recently published versions when the 500 cap applies (matching the limit_recent convention used by Conan/NuGet/Helm). Publishing 1.5.0 after 2.0.0 yields [1.5.0, 2.0.0]. This is publish order, not semantic-version order.
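The route-ordering argument can be checked in isolation with a small sketch. It assumes the per-segment constraint is applied to the whole segment (hence the \A and \z anchors) and models only the path shape; index_route? is a hypothetical helper, not the Grape route declaration from this MR, and it does not check that the prefixes agree with the crate name.

```ruby
# Assumed anchored form of the per-segment constraint quoted above.
PREFIX_SEGMENT = /\A[a-z0-9-]{2}\z/

# Hypothetical matcher: true when a three-segment path has the shape
# of the 4+ char index route (:prefix_1/:prefix_2/:package_name).
def index_route?(path)
  segments = path.split('/')
  return false unless segments.length == 3

  prefix_1, prefix_2, _name = segments
  PREFIX_SEGMENT.match?(prefix_1) && PREFIX_SEGMENT.match?(prefix_2)
end

puts index_route?('do/wn/download')          # index URL for a crate named "download"
puts index_route?('my-crate/1.0.0/download') # download URL, first segment is 8 chars
```

The first call prints true and the second prints false, so under these assumptions a download URL falls through to the download route.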
References
- Feature issue: #33060
- MVC plan: see the comment thread on the issue (proposed by @trizzi, reviewed by @kiran-4444 and @10io)
- Prior MR (download endpoint): !236631 (merged)
- Feature flag rollout issue: #525330
- Cargo registry index spec: https://doc.rust-lang.org/cargo/reference/registry-index.html
Database
The sparse index finder returns one crate's versions within a project, ordered by publish order (package_id DESC — most recently published first) and capped at 500. Note this is publish order, not semantic-version order: publishing 1.5.0 after 2.0.0 yields [1.5.0, 2.0.0]. Ordering is descending so that when the 500 cap applies, the most recently published versions are kept.
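To make the difference between publish order and version order concrete, here is a minimal in-memory sketch. The Row struct, the package_id values, and CAP are hypothetical stand-ins for the real rows and the 500 limit, and Gem::Version is used only as a convenient version comparator (it is RubyGems ordering, not a strict semver implementation).

```ruby
require 'rubygems'

# Hypothetical rows: package_id grows with publish time.
Row = Struct.new(:package_id, :version)

rows = [
  Row.new(101, '1.0.0'),
  Row.new(102, '2.0.0'),
  Row.new(103, '1.5.0') # published after 2.0.0
]

CAP = 500 # stand-in for the LIMIT applied by the finder

# Mirrors ORDER BY package_id DESC LIMIT 500.
publish_order = rows.sort_by { |r| -r.package_id }.first(CAP).map(&:version)

# What a version-aware ordering would return instead.
version_order = rows.map(&:version).sort_by { |v| Gem::Version.new(v) }.reverse

puts publish_order.inspect
puts version_order.inspect
```

With these rows the publish order starts with 1.5.0 followed by 2.0.0, while the version-aware order starts with 2.0.0.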
Query:
SELECT packages_cargo_metadata.*
FROM packages_cargo_metadata
INNER JOIN packages_packages
ON packages_packages.id = packages_cargo_metadata.package_id
WHERE packages_cargo_metadata.project_id = $1
AND packages_cargo_metadata.normalized_name = $2
AND packages_packages.package_type = 15 -- cargo
AND packages_packages.status IN (0, 1, 5) -- default, hidden, deprecated
ORDER BY packages_cargo_metadata.package_id DESC
LIMIT 500;
Plan (seeded with 600 versions for the target crate plus ~50k rows of noise across 5k crates): https://postgres.ai/console/gitlab/gitlab-production-main/sessions/52138/commands/153585
Plan
Limit (cost=2447.33..2447.33 rows=1 width=57) (actual time=2.020..2.071 rows=500 loops=1)
Buffers: shared hit=3025
-> Sort (cost=2446.74..2446.74 rows=1 width=57) (actual time=2.018..2.038 rows=500 loops=1)
Sort Key: packages_cargo_metadata.package_id DESC
Sort Method: quicksort Memory: 104kB
Buffers: shared hit=3025
-> Nested Loop (cost=0.98..2446.73 rows=1 width=57) (actual time=0.143..1.770 rows=600 loops=1)
Buffers: shared hit=3022
-> Index Scan using index_cargo_metadata_on_project_normalized_name_version on packages_cargo_metadata (cost=0.41..294.98 rows=600 width=57) (actual time=0.118..0.248 rows=600 loops=1)
Index Cond: ((project_id = $1) AND (normalized_name = 'my-crate'::text))
Buffers: shared hit=22
-> Index Scan using packages_packages_pkey on packages_packages (cost=0.56..3.59 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=600)
Index Cond: (id = packages_cargo_metadata.package_id)
Filter: ((package_type = 15) AND (status = ANY ('{0,1,5}'::integer[])))
Rows Removed by Filter: 0
Buffers: shared hit=3000
Execution Time: 2.226 ms
The ORDER BY adds a sort node, but it sorts only the rows matching (project_id, normalized_name), i.e. one crate's versions, not the whole table. With ~50k rows seeded, the index still narrows to the 600 matching rows before the sort (quicksort, Memory: 104kB), and the LIMIT then trims the result to 500. The sort input is therefore bounded by versions-per-crate and does not grow with table size. Execution is ~2 ms with no disk reads (all buffers are shared hits).
How to set up and validate locally
- Enable the feature flag for a test project:
    Feature.enable(:package_registry_cargo_support, Project.find(<id>))
- Seed a couple of versions of a crate via console or factories (there's no upload endpoint yet in MR 2):
    project = Project.find(<id>)
    pkg1 = create(:cargo_package, name: 'my-crate', version: '1.0.0', project: project)
    pkg2 = create(:cargo_package, name: 'my-crate', version: '2.0.0', project: project)
    create(:cargo_metadatum, package: pkg1)
    create(:cargo_metadatum, package: pkg2)
- Hit the sparse index for the 4+ char prefix shape:
    curl -H 'Authorization: Bearer <PAT>' \
      http://gdk.test:3000/api/v4/projects/<id>/packages/cargo/my/-c/my-crate
  Expected: Content-Type: text/plain, body is two NDJSON lines (most recently published first), each parsing to an object matching the cargo_package_index_content schema.
- Smoke-check the other prefix shapes by seeding crates named a, ab, abc and hitting 1/a, 2/ab, 3/a/abc.
- Verify 404 for an unknown crate name; verify FF off → 404; verify private project + token without project access → 404.
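If you prefer to check the response body programmatically rather than by eye, something along these lines may help. It is a sketch under assumptions: index_versions is a hypothetical helper, the sample body is hand-written rather than captured from the endpoint, and only the vers field from the Cargo index spec is inspected (it does not validate against the cargo_package_index_content schema).

```ruby
require 'json'

# Hypothetical checker: parse an NDJSON body and return the versions
# in response order. Raises if a line is not a JSON object or lacks "vers".
def index_versions(body)
  body.each_line(chomp: true).reject(&:empty?).map do |line|
    entry = JSON.parse(line)
    raise ArgumentError, "expected a JSON object, got #{entry.class}" unless entry.is_a?(Hash)

    entry.fetch('vers')
  end
end

# Hand-written sample standing in for the curl output above.
sample = <<~NDJSON
  {"name":"my-crate","vers":"2.0.0","yanked":false}
  {"name":"my-crate","vers":"1.0.0","yanked":false}
NDJSON

versions = index_versions(sample)
puts versions.inspect
```

Pasting the real curl output into sample should yield the seeded versions with the most recently published one first.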
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist.