Skip to content

Analyze the database spike of 70% of CPU utilization

In this issue we want to investigate what are the factors that are triggering the spike that generated a spike that our primary database CPU utilization reaches 70%.

Here the CPU graph: Screenshot_2020-12-03_at_00.49.35

Now I started with the analysis of the statements that were most executed during this period of time:

First the query of the 5 queries most executed :

topk(5, sum by (queryid) (rate(pg_stat_statements_calls{environment="gprd", tier="db", type="patroni"}[1m]) and ON (instance) (pg_replication_is_replica == 0)))

Screenshot_2020-12-03_at_00.51.20

Now the top 10 queries with a granularity of 1m: topk(10, sum by (queryid) (rate(pg_stat_statements_calls{environment="gprd", tier="db", type="patroni"}[1m]) and ON (instance) (pg_replication_is_replica == 0)))

Screenshot_2020-12-03_at_00.52.26

Here the list of the queryIds that we need to investigate:

Screenshot_2020-12-03_at_00.52.35

Table of the statements that are the PEAK on prod :

QueryID QUERY
833913155023572892 SELECT 1.
-5301438116714883807 SET application_name='sidekiq 5.2.9 queues:pipeline_..._rese
-1600097631520268746 SET application_name='sidekiq 5.2.9 queues:pipeline_...ate_h
-7568105151578171542 SET application_name='sidekiq 5.2.9 queues:authorize...expor
73367110635711796 SELECT "projects".* FROM "projects" WHERE "projects"."id" =
6769309683899657633 SELECT "routes".* FROM "routes" WHERE "routes"."source_id" =
6749620766035719574 SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."
6504150523421693673 SELECT "taggings".* FROM "taggings" WHERE "taggings"."taggab
6974950735891200787 SELECT "tags".* FROM "tags" INNER JOIN "taggings" ON "tags".
-2372450153195223637 SELECT $1 AS one FROM ((SELECT "ci_runners".* FROM "ci_runne

2 statements are having an increase in the spike :

-7568105151578171542 : SET application_name='sidekiq 5.2.9 queues:authorize...expor Screenshot_2020-12-03_at_01.14.18

-5301438116714883807 : SET application_name='sidekiq 5.2.9 queues:pipeline_..._rese Screenshot_2020-12-03_at_01.14.38

Acceptance criteria:

  • Evaluate the analysis report and the queries. If applies create a new issue for infradev or datastores to propose new improvements for the database cluster overall.
Edited by Jose Finotto