Add /api/v4/token_exchange endpoint for modular services

Initial implementation of the token-exchange surface for modular services. Open follow-ups are listed under "Open questions" below.

POST /api/v4/token_exchange issues a short-lived RS256 JWT that modular services (currently Artifact Registry) verify via OIDC discovery + JWKS against the issuing instance. The endpoint reuses CloudConnector::Keys for signing; everything else is purpose-built. Self-managed to self-managed (SM<>SM) setups use the same code path.

Scope: this endpoint serves the 3rd-party client direct-access path (Docker, npm, Maven, etc.). The in-Rails resolver path -- where Rails fetches AR data on a user's behalf via a server-minted (principal, role, resource) scoped JWT -- is a separate server-side flow (handbook MR 19696).

Source: ee/lib/api/authn/token_exchange.rb (API::Authn::TokenExchange). Tokens are minted by Authn::TokenExchange::TokenIssuer, which has no CloudConnector::* coupling beyond reading the signing key via CachingKeyLoader.

Decoded sample

{
  "jti": "5d250d2f-0e6c-4f7d-987b-222973bfb6af",
  "iss": "http://gdk.test:3443",
  "aud": ["gitlab-artifact-registry"],
  "sub": "1",
  "iat": 1779870540,
  "nbf": 1779870540,
  "exp": 1779870840,
  "gitlab_realm": "saas",
  "gitlab_organization_id": 1
}
  • iss -- Doorkeeper-derived instance URL. Verifier fetches the OIDC discovery document from {iss}/.well-known/openid-configuration and follows its jwks_uri to the JWKS.
  • sub -- numeric local_id per ADR-019.
  • nbf == iat -- no sender-side leeway. The verifier must apply its own clock-skew tolerance (e.g. jwt.WithLeeway in golang-jwt v5, jwt.WithAcceptableSkew in lestrrat-go/jwx); ≥ 5s recommended.
  • exp = iat + ttl -- 5-min default, 12-hour cap (matches the issue spec and AWS CodeArtifact / Google Artifact Registry precedent so long Maven/Gradle builds don't expire mid-flight).
  • gitlab_realm -- saas / self-managed, via ::CloudConnector.gitlab_realm.
  • gitlab_organization_id -- read from current_user.organization_id.
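
The claim layout above can be sketched in plain Ruby. This is an illustration only, not the TokenIssuer code from this MR: build_claims and its keyword arguments are hypothetical names, and RS256 signing is left out entirely.

```ruby
require "securerandom"

DEFAULT_TTL_SECONDS = 300 # 5-minute default described above

# Hypothetical helper: assembles a payload shaped like the decoded sample.
# Signing with the instance key is deliberately not shown.
def build_claims(issuer:, audience:, user_id:, organization_id:, realm:,
                 ttl: DEFAULT_TTL_SECONDS, now: Time.now.to_i)
  {
    "jti" => SecureRandom.uuid,
    "iss" => issuer,
    "aud" => [audience],
    "sub" => user_id.to_s, # numeric id serialized as a string
    "iat" => now,
    "nbf" => now,          # nbf == iat: no sender-side leeway
    "exp" => now + ttl,
    "gitlab_realm" => realm,
    "gitlab_organization_id" => organization_id
  }
end

claims = build_claims(
  issuer: "http://gdk.test:3443",
  audience: "gitlab-artifact-registry",
  user_id: 1,
  organization_id: 1,
  realm: "saas",
  now: 1779870540
)
puts claims["exp"] - claims["iat"] # prints 300
```

With now fixed to the sample's iat, the resulting exp matches the sample's 1779870840.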

We deliberately omit gitlab_instance_uid (iss already identifies the instance via JWKS; a self-asserted UUID adds no trust) and scopes (not needed by AR). The JWT header carries alg RS256 plus the kid of a key in the same JWKS the verifier fetches.
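
On the verifier side, the header and time-claim rules amount to a handful of checks that run alongside signature verification. The sketch below assumes the header and payload have already been decoded into hashes. acceptable_token?, EXPECTED_AUDIENCE, and the leeway keyword are hypothetical names, and signature verification against the issuer's JWKS is not shown, although a real verifier must do it.

```ruby
EXPECTED_AUDIENCE = "gitlab-artifact-registry"

# Hypothetical pre-checks on an already-decoded header and payload.
# Signature verification against the issuer's JWKS is NOT performed here.
def acceptable_token?(header, claims, now: Time.now.to_i, leeway: 5)
  return false unless header["alg"] == "RS256"   # pin the algorithm
  return false if header["kid"].to_s.empty?      # kid selects the JWKS key
  return false unless Array(claims["aud"]).include?(EXPECTED_AUDIENCE)
  return false unless claims["nbf"].is_a?(Integer) && claims["exp"].is_a?(Integer)
  return false if claims["nbf"] > now + leeway   # not yet valid, beyond skew
  return false if claims["exp"] <= now - leeway  # expired, beyond skew

  true
end

header = { "alg" => "RS256", "kid" => "example-kid" }
claims = { "aud" => ["gitlab-artifact-registry"],
           "nbf" => 1779870540, "exp" => 1779870840 }

# Verifier clock runs 3 seconds behind the issuer.
puts acceptable_token?(header, claims, now: 1779870537, leeway: 5) # prints true
puts acceptable_token?(header, claims, now: 1779870537, leeway: 0) # prints false
```

The second call shows why a zero-leeway verifier would reject a freshly minted token when its clock lags the issuer by a few seconds.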

Authorization

Two-layer split:

  1. Entitlement (Rails / this endpoint) -- can?(current_user, :access_artifact_registry_service). The Ability is owned by the AR team (work-item #599081); a TODO marker lives in the post block until it lands.
  2. Per-repo permission (AR) -- "can this user push/pull this specific repo?" AR's concern, via GLAZ + relationships API.

The companion Cloud Connector (CC) catalog MR adds the artifact_registry backend and the access_artifact_registry unit primitive (UP): gitlab-org/cloud-connector/gitlab-cloud-connector!241 (merged).

Expected entitlement wiring (AR team's responsibility)
# ee/app/policies/ee/global_policy.rb (or similar)
condition(:can_access_artifact_registry) do
  next false unless @user

  up = ::Gitlab::CloudConnector::DataModel::UnitPrimitive.find_by_name(:access_artifact_registry)
  next false unless up

  ::GitlabSubscriptions::AddOnPurchase.exists_for_unit_primitive?(up.name, @user)
end

rule { can_access_artifact_registry }.enable :access_artifact_registry_service

The CC catalog carries the UP -> add-ons mapping; the Ability resolves entitlement via the catalog and the user's active AddOnPurchase.

Accepted token types

Per the AR auth interface agreement R1:

| token type | header / pattern | sub claim | how |
|---|---|---|---|
| Legacy PAT | PRIVATE-TOKEN: <pat> | user id | works by default |
| Granular PAT (FGT) | PRIVATE-TOKEN: <pat> | user id | skip_granular_token_authorization: :modular_service_token_exchange (skip-reason added to skip_reasons.rb), so users don't have to opt into a new per-endpoint permission |
| Project/group access token | PRIVATE-TOKEN: <pat> | bot user id | same auth path as legacy PAT (user has user_type=:project_bot/:group_bot) |
| OAuth bearer | Authorization: Bearer <token> | user id | via Doorkeeper |
| CI job token | JOB-TOKEN: <token> or job_token=<token> param | build's user id | route_setting :authentication, job_token_allowed: true + skip_job_token_policies: true (no project-scoped policy from ci_job_token_policies.json fits) |

Deploy tokens are deliberately NOT accepted (return 401). DeployToken isn't a User, so our User-shaped payload can't apply cleanly. Follow-up: #601332.
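
As a reading aid for the accepted credential shapes, the classifier below maps request headers and params to a credential kind. It is a toy model, not the authentication code in this MR: credential_kind, accepted?, the symbols they return, and the precedence order are all assumptions made for illustration.

```ruby
# Hypothetical classifier mirroring the accepted token types listed above.
# Header names are lower-cased before lookup; precedence is an assumption.
def credential_kind(headers, params = {})
  h = headers.transform_keys { |k| k.to_s.downcase }

  return :deploy_token if h.key?("deploy-token")   # not accepted (401)
  return :job_token if h.key?("job-token") || params.key?("job_token")
  return :oauth_bearer if h["authorization"].to_s.start_with?("Bearer ")
  return :private_token if h.key?("private-token") # PATs and access tokens

  :none
end

# Only user-backed credentials are accepted by the endpoint.
def accepted?(kind)
  %i[private_token oauth_bearer job_token].include?(kind)
end

puts credential_kind({ "JOB-TOKEN" => "example" })             # prints job_token
puts accepted?(credential_kind({ "Deploy-Token" => "example" })) # prints false
```

Legacy PATs, granular PATs, and project/group access tokens all arrive through the same PRIVATE-TOKEN header, so the sketch cannot tell them apart; the endpoint distinguishes them after the token is looked up.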

Feature flag

Gated behind gate_token_exchange_endpoint (gitlab_com_derisk, default off).

Open questions

  • Client-passed expires_in -- default 300s, cap 12h. Long cap supports Maven/Gradle builds that publish near the end of a multi-hour run; pending appsec review.
  • TTL default of 5 minutes -- awaiting David Fernandez (Maven DRI) input.
  • Forward-compat with GATE IAM -- sub is a GitLab-internal numeric id. AR's verifier shouldn't bake in the integer-id assumption -- once GATE IAM is authoritative, a portable claim (e.g. gitlab_user_uuid) may show up alongside.
  • Non-human principals -- service accounts, project/group access token bot users, and CI job bot users surface as Users with numeric ids. AR sees the bot's id, not "user X on behalf of project Y build Z." Additional claims may be needed if AR's per-repo authz wants that context.
  • Audit + analytics -- Gitlab::InternalEvents.track_event for usage analytics deferred (TODO comment in the endpoint).
  • Deploy-token support -- see follow-up #601332.
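
The expires_in bounds in the first bullet could be enforced along these lines. This is a sketch under stated assumptions: resolve_expires_in and InvalidExpiresIn are hypothetical names, the 1-second lower bound is a guess (the MR only states the default and the cap), and the real endpoint would translate the error into a 400 response.

```ruby
DEFAULT_EXPIRES_IN = 300          # seconds (5 minutes)
MAX_EXPIRES_IN     = 12 * 60 * 60 # seconds (12 hours)

# Hypothetical error type; the endpoint would map it to HTTP 400.
class InvalidExpiresIn < StandardError; end

# Returns the TTL to use, or raises when the client value is out of range.
# The lower bound of 1 second is an assumption, not taken from the MR.
def resolve_expires_in(requested)
  return DEFAULT_EXPIRES_IN if requested.nil?

  ttl = Integer(requested, exception: false) # nil for non-numeric input
  unless ttl && (1..MAX_EXPIRES_IN).cover?(ttl)
    raise InvalidExpiresIn, "expires_in must be between 1 and #{MAX_EXPIRES_IN}"
  end

  ttl
end

puts resolve_expires_in(nil)     # prints 300
puts resolve_expires_in("43200") # prints 43200
```

Rejecting out-of-range values, rather than silently clamping them, matches the "expires_in out of range (400)" error path listed under the end-to-end tests.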

Tested end-to-end

Verified locally against GDK for all five R1 token types (legacy PAT, granular PAT, OAuth bearer, project access token, CI job token via header or query param) plus error paths: unknown audience (400), unauthenticated (401), feature flag disabled (404), expires_in out of range (400), deploy token (401).

Reproduce locally

1. Enable the FF:

# bin/rails runner
Feature.enable(:gate_token_exchange_endpoint)

2. Mint tokens (bin/rails runner, one script):

u = User.first
org = ::Organizations::Organization.first
project = Project.find(1)

# Legacy PAT
pat = "lpat-#{SecureRandom.hex(8)}"
PersonalAccessToken.create!(user: u, name: "e2e-legacy", scopes: ["api"],
  expires_at: 7.days.from_now,
  token_digest: Gitlab::CryptoHelper.sha256(pat))
puts "LEGACY_PAT: #{pat}"

# Granular PAT
gpat = "gpat-#{SecureRandom.hex(8)}"
PersonalAccessToken.create!(user: u, name: "e2e-granular", scopes: ["granular"],
  granular: true, expires_at: 7.days.from_now,
  token_digest: Gitlab::CryptoHelper.sha256(gpat))
puts "GRANULAR_PAT: #{gpat}"

# OAuth bearer (hashed storage -- grab plaintext)
app = ::Authn::OauthApplication.create!(name: "e2e-oauth",
  redirect_uri: "http://localhost", scopes: "api",
  confidential: false, organization: org)
t = Doorkeeper::AccessToken.create!(application: app,
  resource_owner_id: u.id, scopes: "api", organization: org)
puts "OAUTH_TOKEN: #{t.plaintext_token}"

# Project access token (project bot)
result = ::ResourceAccessTokens::CreateService.new(u, project,
  { name: "e2e-bot", scopes: ["api"], expires_at: 7.days.from_now }).execute
puts "PROJECT_BOT_PAT: #{result[:access_token].token}"

# CI job token (running build)
pipeline = Ci::Pipeline.create!(project: project, ref: project.default_branch,
  sha: project.repository.commit.sha, source: :web, user: u)
stage = Ci::Stage.find_or_create_by!(project: project, name: "build", pipeline: pipeline)
build = Ci::Build.create!(name: "e2e-job", project: project,
  pipeline: pipeline, ref: project.default_branch, user: u,
  status: :running, ci_stage: stage, scheduling_type: :stage)
build.send(:ensure_token!); build.reload
puts "JOB_TOKEN: #{build.token}"

# Deploy token (will be rejected)
dt = DeployToken.create!(name: "e2e-dt", username: "dt-#{Time.now.to_i}",
  read_registry: true, deploy_token_type: :project_type, project_id: project.id)
puts "DEPLOY_TOKEN: #{dt.token}"

3. Hit the endpoint:

URL="http://gdk.test:3443/api/v4/token_exchange"

curl -X POST "$URL" -H "PRIVATE-TOKEN: <LEGACY_PAT>"        -d "audience=gitlab-artifact-registry"  # 201
curl -X POST "$URL" -H "PRIVATE-TOKEN: <GRANULAR_PAT>"      -d "audience=gitlab-artifact-registry"  # 201
curl -X POST "$URL" -H "Authorization: Bearer <OAUTH>"      -d "audience=gitlab-artifact-registry"  # 201
curl -X POST "$URL" -H "PRIVATE-TOKEN: <PROJECT_BOT_PAT>"   -d "audience=gitlab-artifact-registry"  # 201 (sub = bot user id)
curl -X POST "$URL" -H "JOB-TOKEN: <JOB_TOKEN>"             -d "audience=gitlab-artifact-registry"  # 201
curl -X POST "$URL"                                          -d "audience=gitlab-artifact-registry" -d "job_token=<JOB_TOKEN>"  # 201
curl -X POST "$URL" -H "Deploy-Token: <DEPLOY_TOKEN>"        -d "audience=gitlab-artifact-registry"  # 401

4. Decode the JWT:

TOKEN='<copy from curl response>'
python3 -c "
import json, base64
seg = '$TOKEN'.split('.')[1]
seg += '=' * (-len(seg) % 4)
print(json.dumps(json.loads(base64.urlsafe_b64decode(seg)), indent=2))"

Gotchas:

  • OAuth tokens are stored hashed -- use t.plaintext_token, not t.token.
  • Feature.enable(:flag, actor) from bin/rails runner takes ~1-2s to propagate to rails-web; if the flag still appears disabled, sleep 2 or enable it globally.
  • build.token is nil until build.send(:ensure_token!) runs.
Edited by Aleksei Lipniagov
