Add partition_id to job token

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

Problem

As discussed in gitlab-com/gl-infra/production#20328 the groupdatabase frameworks identified a query triggering a CPU spike that needs to be fixed.

I think the first query on the job token is used to validate that the token is unique, before inserting it in the database. I think it could use partition pruning since we put the partition in the token to ensure uniqueness across partitions.

queryid: 5203370346919232571
SELECT
    "p_ci_builds"."status",
    "p_ci_builds"."finished_at",
    "p_ci_builds"."created_at",
    "p_ci_builds"."updated_at",
    "p_ci_builds"."started_at",
    "p_ci_builds"."coverage",
    "p_ci_builds"."name",
    "p_ci_builds"."options",
    "p_ci_builds"."allow_failure",
    "p_ci_builds"."stage_idx",
    "p_ci_builds"."tag",
    "p_ci_builds"."ref",
    "p_ci_builds"."type",
    "p_ci_builds"."target_url",
    "p_ci_builds"."description",
    "p_ci_builds"."erased_at",
    "p_ci_builds"."artifacts_expire_at",
    "p_ci_builds"."environment",
    "p_ci_builds"."when",
    "p_ci_builds"."yaml_variables",
    "p_ci_builds"."queued_at",
    "p_ci_builds"."lock_version",
    "p_ci_builds"."coverage_regex",
    "p_ci_builds"."retried",
    "p_ci_builds"."protected",
    "p_ci_builds"."failure_reason",
    "p_ci_builds"."scheduled_at",
    "p_ci_builds"."token_encrypted",
    "p_ci_builds"."resource_group_id",
    "p_ci_builds"."waiting_for_resource_at",
    "p_ci_builds"."processed",
    "p_ci_builds"."scheduling_type",
    "p_ci_builds"."id",
    "p_ci_builds"."stage_id",
    "p_ci_builds"."partition_id",
    "p_ci_builds"."auto_canceled_by_partition_id",
    "p_ci_builds"."auto_canceled_by_id",
    "p_ci_builds"."commit_id",
    "p_ci_builds"."erased_by_id",
    "p_ci_builds"."project_id",
    "p_ci_builds"."runner_id",
    "p_ci_builds"."upstream_pipeline_id",
    "p_ci_builds"."user_id",
    "p_ci_builds"."execution_config_id",
    "p_ci_builds"."upstream_pipeline_partition_id"
FROM
    "p_ci_builds"
WHERE
    "p_ci_builds"."type" = $1
    AND "p_ci_builds"."token_encrypted" IN ($2, $3)
LIMIT $4

Solution

  • Add partition pruning to avoid scanning all partitions

https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITION-PRUNING

Edited by 🤖 GitLab Bot 🤖