Support Readiness: 16.* - 18.* GitLab Runner New Token Architecture

What is happening

This issue coordinates work on the Next GitLab Runner Token Architecture (parent e... (gitlab-org&7663 - closed) between the support and development teams.

Status / What actions have been taken so far

Timeline / Important Dates

  • the new workflow became available side-by-side with the old one in %16.0, behind a feature flag (FF).
  • the old workflow will be disabled through a top-level group setting in %17.0.
  • .com users can re-enable it themselves in the top-level group settings (the UI isn't there yet, and neither are the docs), but this will only work until %18.0, when we'll disable the setting instance-wide.
  • self-managed users can re-enable it in the admin settings (no UI or docs yet either); it will then work until %18.0.
  • 2025-03-19 - decision made to defer removal of legacy registration token method due to continued high use of the deprecated method - no plans to set new removal date/version at this time.
  • 2025-03-19 - closing this readiness issue until such time as we have a new planned removal date/version.

Related Items

Glossary

Item: Description

Runner: The object created in GitLab with specific settings such as tags, protected status and "run untagged jobs". Zero to many Runner Managers belong to a Runner. Each Runner has a unique Runner authentication token.

Runner Manager: The object created in GitLab when a gitlab-runner service (installed via package, docker image or Helm chart) is registered to a GitLab instance using a Runner authentication token. Identified via its system_id. Also refers to the instance of the service itself.

What impact will this have on users?

The main impact is on customers who make use of automation to deploy runners, such as auto-scaling, Terraform and Helm. The legacy method only required the instance/group/project runner registration token to be copied from GitLab once and it could be used to provision multiple runner managers with different combinations of settings (e.g. tags) specified in the registration command.

It is this "one token to rule them all" approach that led to the development of the new architecture, which improves security and auditability at the cost of some flexibility and convenience. Runner managers with specific settings, and in particular tags, now need to be created as Runners in GitLab first, and the resulting authentication token is then used to register runner manager instances with that Runner. A separate Runner needs to be created in GitLab for each combination of tag values, "run untagged jobs" and protected settings.
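To make the last point concrete, the sketch below groups a hypothetical inventory of legacy registrations by the settings that now require a separate Runner. The input structure, the function name and the sample data are illustrative assumptions, not GitLab objects.

```python
from collections import defaultdict


def distinct_runner_configs(registrations):
    """Group legacy registrations by (tags, run_untagged, protected).

    Each distinct key corresponds to one Runner that would have to be
    created in GitLab under the new workflow.
    """
    groups = defaultdict(list)
    for reg in registrations:
        key = (
            tuple(sorted(set(reg.get("tags", [])))),  # tag order is irrelevant
            bool(reg.get("run_untagged", False)),
            bool(reg.get("protected", False)),
        )
        groups[key].append(reg["name"])
    return dict(groups)


# Hypothetical inventory of machines registered with one legacy token.
legacy = [
    {"name": "vm-1", "tags": ["docker", "linux"], "run_untagged": False, "protected": False},
    {"name": "vm-2", "tags": ["linux", "docker"], "run_untagged": False, "protected": False},
    {"name": "vm-3", "tags": ["docker", "linux"], "run_untagged": True, "protected": False},
    {"name": "vm-4", "tags": ["gpu"], "run_untagged": False, "protected": True},
]
configs = distinct_runner_configs(legacy)
```

In this example four machines collapse into three Runners: vm-1 and vm-2 can share one authentication token as two runner managers of the same Runner, while vm-3 and vm-4 each need their own.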

Automation of the initial step to create the Runner in GitLab and retrieve the authentication token has been facilitated via the creation of a new create_runner PAT scope, and an enhancement to the GitLab Terraform Provider to support the runner creation API.
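As a rough illustration of that first automation step, the sketch below builds (but does not send) a request to the runner creation endpoint, POST /api/v4/user/runners, for a group runner. The host name, group ID, token placeholder and helper name are assumptions for illustration, and the parameter names should be checked against the Runners API documentation for the GitLab version in use.

```python
import json
import urllib.request


def build_create_runner_request(base_url, pat, group_id, tags,
                                run_untagged=False, protected=False):
    """Build a POST /api/v4/user/runners request for a group runner.

    `pat` is assumed to be a personal access token with the create_runner scope.
    """
    payload = {
        "runner_type": "group_type",
        "group_id": group_id,
        "tag_list": sorted(tags),
        "run_untagged": run_untagged,
        "access_level": "ref_protected" if protected else "not_protected",
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v4/user/runners",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": pat, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values; nothing is sent over the network here.
req = build_create_runner_request(
    "https://gitlab.example.com", "<PAT with create_runner scope>", 42, ["docker", "linux"]
)
# Sending it with urllib.request.urlopen(req) should return JSON that includes
# the new Runner's authentication token, which is then passed to
# `gitlab-runner register --token ...` on each runner manager host.
```

Keeping the request construction separate from the network call makes it easy to review what an automation script would send before pointing it at a real instance.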

The GitLab chart as of version 16.10.0 still deploys a single runner pod using the deprecated Runner registration token (stored in the secret gitlab-gitlab-runner-secret with value gitlab-registration-token). As things stand, the runner manager pod will fail to start in version 17.0 until either the legacy method is re-enabled via the UI after chart deployment, or a Runner is created and the runner manager pod secret is updated to use the appropriate runner-token value instead.

  • TBD - is there any intention to "fix" the default chart so the runner pod registers and starts up without any other changes?
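For the second option (creating a Runner and updating the secret), a minimal sketch of generating the replacement secret is shown below. It assumes the secret uses the keys runner-registration-token (left empty) and runner-token, as described in the GitLab Runner Helm chart documentation. The token value is a placeholder and the helper name is hypothetical, so verify the secret name and keys against the deployed chart before applying anything.

```python
import base64
import json


def runner_secret_manifest(name, runner_token):
    """Return a Kubernetes Secret manifest (as a dict) carrying a runner authentication token."""
    encoded = base64.b64encode(runner_token.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            # Assumed to be kept as an empty string for compatibility.
            "runner-registration-token": "",
            # Authentication token of the Runner created in GitLab.
            "runner-token": encoded,
        },
    }


# "glrt-EXAMPLE" is a placeholder, not a real token.
manifest = runner_secret_manifest("gitlab-gitlab-runner-secret", "glrt-EXAMPLE")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, which accepts JSON manifests as well as YAML.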

The more significant impact is for separate deployments of the Runner Helm chart that currently rely on a static gitlab-registration-token along with differing tag values or other settings set in the chart config at the time of registration. There is currently no way to deploy a new Runner using the chart without first having created a Runner (via UI or API) and obtained the config-specific authentication token.

There are several open epics/issues relating to how these may be addressed:

TBD - require guidance as to what options are currently the "best" to recommend to customers whose legacy workflows break.

What this may look like for Support

Anticipated Support Impact:

  • customers who have not changed their runner provisioning workflows to use the new method will, once on version 17.0, be unable to register new runners on GitLab.com or self-managed instances.

What errors or messages users may report:

  • details of specific errors that will occur in v17.0 when

    • TBD: runner-registration-token is used in a chart deployment

    • gitlab-runner register --registration-token used from the command line:

      WARNING: Support for registration tokens and runner parameters in the 'register' command has been deprecated in GitLab Runner 15.6 and will be replaced with support for authentication tokens. For more information, see https://docs.gitlab.com/ee/ci/runners/new_creation_workflow
      ERROR: Registering runner... failed                 runner=vhLy93p9 status=POST http://gdk.test:3000/api/v4/runners: 410 Gone (410 Gone - runner registration disallowed)
      PANIC: Failed to register the runner.
    • TBD: the Terraform provider runner resource is used instead of the user runner resource

    This scenario can be tested today by disabling runner registration at the instance level from a Rails console:

    Gitlab::CurrentSettings.update!(allow_runner_registration_token: false)
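When triaging tickets, the CLI failure shown above can be recognised in customer-supplied logs with a simple pattern check. The sketch below is one possible approach; the function name is hypothetical and the pattern is based only on the sample output in this issue, so other GitLab Runner versions may word the message differently.

```python
import re

# Matches the rejection seen when legacy registration tokens are disabled.
DISALLOWED = re.compile(r"410 Gone.*runner registration disallowed")


def is_legacy_registration_blocked(log_text):
    """Return True if any ERROR line reports that runner registration is disallowed."""
    return any(
        line.lstrip().startswith("ERROR:") and DISALLOWED.search(line)
        for line in log_text.splitlines()
    )


sample_log = (
    "ERROR: Registering runner... failed                 runner=vhLy93p9 "
    "status=POST http://gdk.test:3000/api/v4/runners: 410 Gone "
    "(410 Gone - runner registration disallowed)\n"
    "PANIC: Failed to register the runner."
)
blocked = is_legacy_registration_blocked(sample_log)
```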

What workarounds/solutions are available?:

The simplest workaround is to re-enable the legacy runner registration method by either:

However, this only defers the requirement to modify runner registration workflows until version 18.0, when this option will be removed.

The permanent solution is for the customer to modify their processes to use the new method as described in Migrating to the new runner registration workflow and Tutorial: Automate runner creation and registration.

Do users need to be contacted?

  • gitlab.com users will be contacted via an outbound mailout (campaign issue)

Notes

  • the scenario that will happen in %17.0 can be tested today by disabling runner registration at the instance level from a Rails console:

    Gitlab::CurrentSettings.update!(allow_runner_registration_token: false)
  • details of runner managers (as seen in the UI) can only be retrieved via the GraphQL API by admin users - runners themselves can be managed by the Runners REST API:

    Runner Instance GraphQL Query
    {
      runners {
        nodes {
          id
          tagList
          managers {
            nodes {
              id
              systemId
              version
            }
          }
        }
      }
    }
  • there is currently no way to delete a single runner manager from a runner configuration: Add ability to delete a runner_manager from a group

  • there is currently no way to pause a single runner manager: Add ability to pause a runner_manager from a group

  • stale runner managers are automatically removed from their runner configuration after 7 days
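The GraphQL query from the notes above can be sent to the instance's /api/graphql endpoint with an admin user's token. The sketch below builds the request and flattens a response into one row per runner manager. The host, token and sample response are made-up placeholders, and the helper names are hypothetical.

```python
import json
import urllib.request

RUNNER_MANAGERS_QUERY = """
{
  runners {
    nodes {
      id
      tagList
      managers {
        nodes {
          id
          systemId
          version
        }
      }
    }
  }
}
"""


def build_graphql_request(base_url, token):
    """Build (but do not send) a POST request carrying the runner managers query."""
    body = json.dumps({"query": RUNNER_MANAGERS_QUERY}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/graphql",
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


def flatten_managers(response):
    """Return (runner id, manager system_id, manager version) tuples from a response dict."""
    rows = []
    for runner in response["data"]["runners"]["nodes"]:
        for manager in runner["managers"]["nodes"]:
            rows.append((runner["id"], manager["systemId"], manager["version"]))
    return rows


# Made-up response in the shape the query asks for.
sample_response = {
    "data": {"runners": {"nodes": [
        {"id": "gid://gitlab/Ci::Runner/1", "tagList": ["docker"], "managers": {"nodes": [
            {"id": "gid://gitlab/Ci::RunnerManager/10", "systemId": "s_aaa", "version": "16.10.0"},
            {"id": "gid://gitlab/Ci::RunnerManager/11", "systemId": "s_bbb", "version": "16.9.1"},
        ]}},
        {"id": "gid://gitlab/Ci::Runner/2", "tagList": [], "managers": {"nodes": []}},
    ]}}
}
rows = flatten_managers(sample_response)
```

A Runner with no registered runner managers, like the second one in the sample, contributes no rows. This makes it easy to spot Runners that were created but never registered.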

DRIs/Contacts for questions and approvals for communications/action items

  • Slack Channel: #f_runner_fleet_management
  • Product or Development DRI: @DarrenEastman
  • Security DRI (if applicable):
  • Support DRI:
  • Support Manager DRI (if needed):

Support Resources

  • FAQ for Support:
  • Other resource:

User contact

  1. Categorize provided list by free/paid (if necessary)
  2. Message(s) to send to users created and approved by appropriate DRIs.
  3. Pull list of contacts using the runbook
  4. Send the message to the list of contacts using the ticket generator form. Link to created issue:
  5. In the above issue, add a note at the top of the description or a comment that all tickets should be tagged with the tag ``.

Zendesk Macros

STM issue for new macro: #5953 (closed)

Zendesk tag: ``

  1. Macro MR:
    1. Macro adds appropriate tag
    2. Set DRIs as reviewers

Communication to Support team

  1. Announced to team in
    1. #support_gitlab-com or #support_self-managed or #support_team-chat
    2. SWIR

Edited by Justin Farmiloe