Add partition pruning to runner token queries

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

Summary

The runner token verification query is one of the top queries on CI replicas, approaching 1000 calls/sec. We need to add partition pruning to reduce database load.

Background

From the parent issue discussion, the query used to find runners based on token is:

SELECT "ci_runners"."id", "ci_runners"."creator_id", "ci_runners"."created_at", "ci_runners"."updated_at", 
"ci_runners"."contacted_at", "ci_runners"."token_expires_at", "ci_runners"."public_projects_minutes_cost_factor", 
"ci_runners"."private_projects_minutes_cost_factor", "ci_runners"."access_level", "ci_runners"."maximum_timeout", 
"ci_runners"."runner_type", "ci_runners"."registration_type", "ci_runners"."creation_state", "ci_runners"."active", 
"ci_runners"."run_untagged", "ci_runners"."locked", "ci_runners"."name", "ci_runners"."token_encrypted", 
"ci_runners"."description", "ci_runners"."maintainer_note", "ci_runners"."allowed_plans", "ci_runners"."allowed_plan_ids", 
"ci_runners"."organization_id" 
FROM "ci_runners" 
WHERE (token_expires_at IS NULL OR token_expires_at >= NOW()) 
AND "ci_runners"."token_encrypted" IN ($1, $2) 
LIMIT $3

This query currently scans all 3 partition tables without pruning.

Proposed Solution

Based on the discussion, we have different classes of runner tokens:

  1. Runners registered with legacy registration token (registration_type: 0) - ~115k active machines
  2. Runners created with UI or new API (registration_type: 1) - ~361k active machines, which includes:
    • Tokens created pre-Cells (no encoded partition)
    • Tokens with partition encoded after prefix (e.g. glrt-t1-)
    • Tokens with partition encoded in payload (e.g. glrt-wjVatoYHgoFPJ8LKb1cgV2M6MQredactedprCnQ6Mwp1OjET.01.1c04bzhxe)

Implementation approach

  1. For registration_type: 1 runners with partition information in the token:

    • Decode the token to extract the partition type (t field in payload)
    • Use partition pruning to query only the relevant partition table
    • This covers >2/3 of queries based on active runner machines
  2. For legacy tokens without partition information:

    • Fall back to querying all partition tables (current behavior)

Token payload example

c:1
o:1
p:k
t:3  # runner type - can be used to determine partition
u:1
  • Parent issue: #578388
  • Related: #580390

Labels

typebug bugperformance Category:Database groupci platform devopsverify sectionci infradev database Deliverable

Edited by 🤖 GitLab Bot 🤖