Query Performance Investigation in BackfillWorkItems migration
Description
The below mentioned query is not performant and is increasing the load on CPU.
Steps
-
Rename the issue to - Query Performance Investigation - [Query Snippet | Table info]
- For example -
Query Performance Investigation - SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."id" = $1 LIMIT $2
- For example -
-
Provide information in the Requested Data Points table -
Provide priority and severity labels -
If this requires immediate attention cc @gitlab-org/database-team
and reach out in the #g_database slack channel
SQL Statement
/*application:sidekiq,correlation_id:01J65T9QB419F1AW7SZNZRXGPD,jid:b3a21a544b17359ab6b42da1,endpoint_id:Search::NamespaceIndexIntegrityWorker,db_config_name:main_replica*/
WITH "descendants_base_cte" AS MATERIALIZED (
SELECT
"namespaces"."id",
"namespaces"."traversal_ids"
FROM
"namespaces"
WHERE
"namespaces"."id" IN (
SELECT
"elasticsearch_indexed_namespaces"."namespace_id"
FROM
"elasticsearch_indexed_namespaces"
)
),
"superset" AS (
SELECT
d1.traversal_ids
FROM
descendants_base_cte d1
WHERE
NOT EXISTS (
SELECT
$1
FROM
descendants_base_cte d2
WHERE
d2.id = ANY(d1.traversal_ids)
AND d2.id <> d1.id
)
)
SELECT
$2 AS one
FROM
"superset",
"namespaces"
WHERE
next_traversal_ids_sibling("superset"."traversal_ids") > "namespaces"."traversal_ids"
AND "superset"."traversal_ids" <= "namespaces"."traversal_ids"
AND "namespaces"."id" = $3
LIMIT
$4
Data from Elastic
Instructions on collecting data from PostgreSQL slow logs stored in Elasticsearch
Requested Data points
Please provide as many of these fields as possible when submitting a query performance report.
- Queries per second (on average or peak)
- Number of calls per second and relative to total number of calls
- Query timings (on average or peak)
- Database time relative to total database time
- Source of calls (Sidekiq, WebAPI, etc)
- Query ID
- Query Plan
- Query Example
- Total number of calls (relative)
- % of Total time