Add CI job to sync AI principles from SSOT

What does this MR do and why?

Wire the gitlab-ai-principles-distiller gem (added in !235272, now merged) into a weekly scheduled pipeline that keeps the distilled agent principles in .ai/principles/distilled/*.md in sync with their docs.gitlab.com sources. When the sync script detects drift, it creates a branch, commits the regenerated principles, and opens an MR labelled ai-agent, documentation, and type::maintenance for human review. Auto-MR settings (branch prefix, title template, labels, remove_source_branch) live in .ai/principles/manifest.yml under the auto_mr: block.

This delivers issue https://gitlab.com/gitlab-org/gitlab/-/issues/597600. The gem and AI Catalog flow it depends on are tracked in https://gitlab.com/gitlab-org/gitlab/-/issues/599663.

Surface

  • .gitlab/ci/sync-principles.gitlab-ci.yml — new file. Adds ai-principles-sync in the ai-gateway stage with needs: [], timeout: 60m, allow_failure: true, and an artifacts block that uploads .ai/principles/distilled/ for 7 days on every run (when: always). Runs on ruby:${RUBY_VERSION}-alpine via the dependency proxy, with BUNDLE_PATH: vendor so gem deps are vendored under the gem dir. The before_script does more than a bare bundle install:

    • installs build-base git via apk,
    • logs the calling identity ($GITLAB_USER_LOGIN) and $CI_PIPELINE_SOURCE for audit-trail visibility,
    • sets the service account git author (Agent Principles Distiller),
    • does a shallow git fetch of $CI_DEFAULT_BRANCH so Gitlab::PrinciplesDistiller::Sync's git merge-base HEAD master (the distillation_base_sha) can resolve on the default depth-20 clone,
    • pre-installs the exact Bundler version pinned in Gemfile.lock to avoid the alpine image's older Bundler self-upgrading mid-install (which prints a confusing "Cannot write a changed lockfile while frozen" stderr line).

    The script then runs, from gems/gitlab-ai-principles-distiller/:

    • bundle exec bin/gitlab-ai-principles-distiller-provision-flow --workspace "$CI_PROJECT_DIR" (mirrors any prompt/tool changes from .ai/principles/distillation_prompt.md into the AI Catalog flow), then
    • bundle exec bin/gitlab-ai-principles-distiller-sync --workspace "$CI_PROJECT_DIR" --push (detects drift, triggers Duo Workflows, writes back, and opens the auto-MR).

    The job authenticates via the AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN project CI variable, exposed as both GITLAB_TOKEN (Workflow API + GraphQL) and GITLAB_API_TOKEN (auto-MR REST).

  • In .gitlab/ci/rules.gitlab-ci.yml, .ai-principles-sync:rules:weekly reuses &if-default-branch-schedule-weekly, is pinned to gitlab-org/gitlab, and is gated on the AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN variable being present. The job triggers only on the weekly default-branch schedule.

  • .ai/principles/manifest.yml — adds a new authentication principle under the Security group, covering authentication, authorization, and composite identity for Duo agents. File filters target controller, service, and auth library paths in both FOSS and EE (app/controllers/**/*.rb, app/services/**/*.rb, lib/gitlab/auth/**/*.rb, lib/api/helpers/**/*.rb, and their ee/ counterparts). Sources: doc/development/authentication.md and doc/development/ai_features/composite_identity.md. This will be picked up by the first run of the sync job to produce a new .ai/principles/distilled/authentication.md.
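The Bundler pre-install step in the before_script described above depends on reading the version pinned in Gemfile.lock. The following is a minimal sketch of that lookup, not the job's actual script: the helper name bundled_with_version and the sample lockfile contents (including version 2.5.11) are assumptions made for illustration.

```ruby
# Extract the Bundler version recorded in the BUNDLED WITH section of a
# Gemfile.lock. Returns nil when the lockfile has no such section.
# NOTE: hypothetical helper, not part of the gem or the CI job.
def bundled_with_version(lockfile_text)
  lockfile_text[/^BUNDLED WITH\n\s+(\S+)/, 1]
end

# Illustrative lockfile excerpt; the version number is a made-up sample.
sample_lock = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:

  BUNDLED WITH
     2.5.11
LOCK

version = bundled_with_version(sample_lock)
# Print the install command a before_script could run before `bundle install`.
puts "gem install bundler -v #{version}" if version
```

Installing that exact version up front means the later bundle install has no reason to switch Bundler versions while the lockfile is frozen.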

Git push authentication

The auto-MR git push authenticates the service account PAT via an http.<host>.extraHeader injected through GIT_CONFIG_* env vars, so the token never lands in argv, the remote URL, or git's reflog. Two subtleties (each previously caused a failed push, caught during end-to-end validation):

  • Header scope must be the host (http.https://gitlab.com.extraHeader), not the repo URL. git matches http.<url>.* by URL prefix on whole path segments, so a key scoped to .../gitlab does not match the request to .../gitlab.git and would be dropped (push falls back to anonymous → 403).
  • Auth scheme must be HTTP Basic (Authorization: Basic base64("oauth2:<token>")), not Bearer: GitLab's smart-HTTP git endpoint authenticates PATs via Basic auth with a non-empty username. The REST calls (find/create/update MR) correctly use PRIVATE-TOKEN.
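Both subtleties can be sketched in a few lines of Ruby. This is an illustration under stated assumptions, not the gem's implementation: git_auth_env and url_scope_matches? are hypothetical names, the token is a placeholder, and the matcher is a simplified model of git's URL matching (scheme, host, and whole path segments only).

```ruby
require "uri"

# Build the environment for `git push` so the token travels only in an
# HTTP header, never in argv or the remote URL.
# NOTE: hypothetical helper; host_url and token are illustrative parameters.
def git_auth_env(host_url, token)
  # Basic auth with a non-empty username; "m0" is strict Base64 (no newlines).
  credentials = ["oauth2:#{token}"].pack("m0")
  {
    "GIT_CONFIG_COUNT"   => "1",
    "GIT_CONFIG_KEY_0"   => "http.#{host_url}.extraHeader",
    "GIT_CONFIG_VALUE_0" => "Authorization: Basic #{credentials}"
  }
end

# Simplified model of whether an http.<url>.* key applies to a request:
# same scheme and host, and the key's path is a prefix of the request path
# on whole path segments.
def url_scope_matches?(scope_url, request_url)
  scope = URI.parse(scope_url)
  request = URI.parse(request_url)
  return false unless scope.scheme == request.scheme && scope.host == request.host

  scope_path = scope.path.to_s.chomp("/")
  scope_path.empty? ||
    request.path == scope_path ||
    request.path.start_with?("#{scope_path}/")
end

request = "https://gitlab.com/gitlab-org/gitlab.git/info/refs"
puts url_scope_matches?("https://gitlab.com", request)                   # true
puts url_scope_matches?("https://gitlab.com/gitlab-org/gitlab", request) # false
```

A caller could pass the hash as the first argument to Kernel#system, for example system(git_auth_env("https://gitlab.com", token), "git", "push", "origin", branch), so the header reaches git through the environment only.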

Required CI variables

Variable | Purpose
AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN | Classic PAT with api scope, used as both GITLAB_TOKEN (Workflow API + GraphQL) and GITLAB_API_TOKEN (auto-MR REST). Fine-grained PATs are not supported because they do not cover GraphQL, AI Catalog mutations, or the Duo Workflow create endpoint. Belongs to the dedicated service account (see Authentication). Set as protected and masked.
AGENT_PRINCIPLES_CATALOG_ITEM_CONSUMER_ID | Numeric ID of the ItemConsumer that binds the catalog Flow to gitlab-org/gitlab. Provisioned once via bin/gitlab-ai-principles-distiller-provision-flow and printed at the end of that script's output.

AGENT_PRINCIPLES_CATALOG_PROJECT (the catalog project path) is set to gitlab-org/gitlab inline in the YAML rather than as a project CI variable, so the binding is reviewable in the YAML diff.

The catalog Flow has been provisioned in production (Flow ID gid://gitlab/Ai::Catalog::Item/1009160, Consumer ID 7368818).
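A preflight check along the following lines could make a misconfigured run fail fast with a readable message. It is a sketch only: ci_variable_problems is a hypothetical helper that is not part of the gem, and the assumption that the job would validate its variables this way is mine.

```ruby
# Variables named in this section; the first two are project CI variables,
# the third is set inline in the job YAML.
REQUIRED_CI_VARIABLES = %w[
  AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN
  AGENT_PRINCIPLES_CATALOG_ITEM_CONSUMER_ID
  AGENT_PRINCIPLES_CATALOG_PROJECT
].freeze

# Return a list of human-readable problems: one entry per unset or blank
# variable, plus one if the consumer ID is present but not numeric.
# NOTE: hypothetical helper for illustration; accepts ENV or a plain Hash.
def ci_variable_problems(env)
  problems = REQUIRED_CI_VARIABLES
    .select { |name| env[name].to_s.strip.empty? }
    .map { |name| "#{name} is not set" }

  consumer_id = env["AGENT_PRINCIPLES_CATALOG_ITEM_CONSUMER_ID"].to_s
  unless consumer_id.empty? || consumer_id.match?(/\A\d+\z/)
    problems << "AGENT_PRINCIPLES_CATALOG_ITEM_CONSUMER_ID must be numeric"
  end
  problems
end

problems = ci_variable_problems(ENV)
warn problems.join("\n") unless problems.empty?
```

The check never prints variable values, so the masked token stays out of the job log.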

Authentication

The Duo Agent Platform Workflow API requires the calling identity to have a Duo Agent Platform seat. Project access tokens (such as PROJECT_TOKEN_FOR_CI_SCRIPTS_API_USAGE) are bound to bot users that do not hold seats, so they cannot drive this job — confirmed by an earlier CI run that returned 400 Bad request - ["forbidden to access duo workflow"].

That requirement is now met by a dedicated service account, service-modelops-agent-principles-distiller, provisioned via access request https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/43931 (confidential). The account holds Developer on gitlab-org/gitlab and a Duo Agent Platform seat. Its api-scope PAT is stored in AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN (protected + masked, expires 2027-06-05). End-to-end validation against this account is green: the job distilled all 7 principles via the Workflow API, pushed a branch, and opened an MR.

allow_failure: true is intentional: a failed scheduled run (e.g. revoked PAT, transient Duo Workflow error) should surface in the schedule UI without blocking other scheduled work.

End-to-end testing

The production rules (schedule source + weekly type + default branch) make the job impossible to run directly from an MR pipeline. End-to-end validation was performed by temporarily exposing the job to this MR's pipeline (a merge_request_event-gated manual rule, since removed) and playing it. The validating run distilled all 7 principles via the Workflow API on the service-account seat, pushed the branch, and created an auto-MR — confirming push auth, REST MR creation, and the full flow.

This MR is based on current master, so the merged-result pipeline (refs/merge-requests/235014/merge) contains the gem files, which landed in !235272.

Pre-merge cleanup

  • Remove the temporary merge_request_event rule branch in .gitlab/ci/rules.gitlab-ci.yml.
  • Flip AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN to protected: true in Settings → CI/CD → Variables. With the temp rule gone, the only consumer is the scheduled pipeline on master (a protected branch).
  • Set an expiry on the service-account PAT (2027-06-05).
  • Add an identity-logging line to the before_script so each job run prints the calling identity. Landed via echo "Identity:${GITLAB_USER_LOGIN:-<unset>} (job triggered from $CI_PIPELINE_SOURCE)".
  • Confirm the gem MR (!235272) has merged and this MR targets master.
  • Un-Draft this MR.

Post-merge follow-up

The job rides the existing weekly pipeline schedule — no new schedule is required. Creating a second master + SCHEDULE_TYPE=weekly schedule would double-run every weekly job.

  1. The job is triggered by the existing schedule 2835726 ([Weekly] Elasticsearch 9, OpenSearch latest, Valkey, PG17 testing; master; 0 10 * * 2 Tuesdays 10:00 UTC; SCHEDULE_TYPE=weekly; owner gitlab-bot). After merge, trigger that schedule once (or wait for the next run) and confirm ai-principles-sync appears, authenticates as the service account (check the Identity: log line), and is not skipped by the $AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN guard.
  2. The schedule owner (gitlab-bot) has Maintainer (access_level 40) on gitlab-org/gitlab, so it can read the protected AGENT_PRINCIPLES_SERVICE_ACCOUNT_TOKEN. If the job is ever skipped, schedule-owner ↔️ protected-variable access is the first thing to check.
  3. Revisit allow_failure: true on the job — now that the service-account path is validated, a schedule failure could be made blocking.

Checklist

Pre-merge

Consider the effect of the changes in this merge request on the following:

If new jobs are added:

  • Change-related rules: pipeline-schedule-only (SCHEDULE_TYPE=weekly), no MR/main-branch impact.
  • Frequency: weekly, against master only.
  • N/A: not added to merge request pipelines (schedule-only).

Post-merge

Edited by Pedro Pombeiro
