Return 404 for legacy repository archive URL
What does this MR do and why?
Contributes to #595699
Problem
npm's hosted-git-info library generates tarball URLs in the
legacy format /{namespace}/{project}/repository/archive.tar.gz.
This URL does not match any Rails route and falls through to the
catch-all *unmatched_route handler, which redirects
unauthenticated users to /users/sign_in (302 → 200 HTML).
npm then tries to parse the HTML sign-in page as a tarball,
failing with TAR_BAD_ARCHIVE.
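To make the two URL shapes concrete, here is a minimal Ruby sketch that builds both paths from the same inputs. The helper names `legacy_archive_path` and `current_archive_path` are hypothetical and exist only for illustration; they are not part of GitLab or hosted-git-info. The `?ref=` query parameter and the `/-/archive/` layout are taken from the curl commands in the validation steps of this description, and refs containing special characters would need URL encoding, which this sketch skips.

```ruby
# Hypothetical helpers for illustration only; not GitLab or hosted-git-info code.

# Legacy format described in this MR: no Rails route matches this path.
def legacy_archive_path(namespace, project, ref)
  "/#{namespace}/#{project}/repository/archive.tar.gz?ref=#{ref}"
end

# Current format, as used in the validation steps of this MR.
def current_archive_path(namespace, project, ref)
  "/#{namespace}/#{project}/-/archive/#{ref}/#{project}-#{ref}.tar.gz"
end

puts legacy_archive_path("root", "my-project", "main")
puts current_archive_path("root", "my-project", "main")
```

The first line printed is the path npm requests; the second is the path GitLab actually serves archives from.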
Solution
Add a dedicated route for the legacy repository/archive URL
that maps to a new Projects::LegacyArchiveController#not_found
action, which always returns 404. This prevents the confusing
redirect-to-sign-in behavior and gives automated tools a proper
HTTP error code. The controller inherits from
::ApplicationController (not Projects::ApplicationController)
to avoid the before_action :project chain that triggers
find_routable! and the redirect.
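The effect of the dedicated route can be pictured with a toy first-match route table in plain Ruby. This is not the Rails code in this MR: the regular expressions, the `dispatch` helper, and the handler lambdas are all assumptions made for illustration. It models only an unauthenticated request and matches on the path without the query string.

```ruby
# Toy model of first-match routing; every name here is illustrative.

# Stand-in for the catch-all handler: unauthenticated users are redirected.
CATCH_ALL = [%r{\A/.*\z}, ->(_path) { [302, { "Location" => "/users/sign_in" }] }]

# Stand-in for the dedicated legacy route: always answers 404.
LEGACY_ARCHIVE = [%r{\A/.+/repository/archive\.tar\.gz\z}, ->(_path) { [404, {}] }]

# Return the response of the first route whose pattern matches the path.
def dispatch(routes, path)
  _pattern, handler = routes.find { |pattern, _handler| pattern.match?(path) }
  handler.call(path)
end

routes_before = [CATCH_ALL]
routes_after  = [LEGACY_ARCHIVE, CATCH_ALL]

path = "/root/my-project/repository/archive.tar.gz"
p dispatch(routes_before, path).first # status without the dedicated route
p dispatch(routes_after, path).first  # status with the dedicated route
```

Because the dedicated entry is checked before the catch-all, the legacy path gets a 404 while all other unmatched paths still reach the existing handler.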
References
- #595699
- npm upstream issues: https://github.com/npm/cli/issues/3229, https://github.com/npm/pacote/issues/476
- npm's hosted-git-info tarball URL template for GitLab
Screenshots or screen recordings
| Before | After |
|---|---|
| curl -sI .../repository/archive.tar.gz → 302 redirect to /users/sign_in → 200 HTML | curl -sI .../repository/archive.tar.gz → 404 Not Found |
How to set up and validate locally
- Create or use any private project (e.g. root/my-project).
- Before this MR:
  curl -sI "http://gdk.test:3000/root/my-project/repository/archive.tar.gz?ref=main" # HTTP/1.1 302 Found, redirects to /users/sign_in
- Enable the feature flag:
  Feature.enable(:legacy_archive_not_found)
- After this MR:
  curl -sI "http://gdk.test:3000/root/my-project/repository/archive.tar.gz?ref=main" # HTTP/1.1 404 Not Found
- Verify the correct /-/archive/ URL still works for public projects:
  curl -sI "http://gdk.test:3000/root/my-project/-/archive/main/my-project-main.tar.gz" # HTTP/1.1 200 OK (for public projects)
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.