Skip to content

Move clusters_applications_cert_managers to batch counting

What does this MR do?

Move ::Clusters::Applications::CertManager.where(time_period).distinct_by_user, to batch distinct count

The purpose of the change

This changes are a performance improvement of count queries and a refactoring. We prefer to use batch counting and optimize indexes if is the case in order to have the queries under 1 sec and avoid the timeouts.

Proposed solution

  • Refactor the counter to use batch counting
  • Ensure that the resulted queries are < 1 sec in database-lab
  • If queries are > 1 sec in database-lab consider adding index for optimizing the queries

Queries

time_period = {}
clusters_applications_cert_managers: distinct_count(::Clusters::Applications::CertManager.where(time_period).available.joins(:cluster), 'clusters.user_id'),
time_period = { created_at: 28.days.ago..Time.current }
clusters_applications_cert_managers: distinct_count(::Clusters::Applications::CertManager.where(time_period).available.joins(:cluster), 'clusters.user_id'),

MIN with no period

SELECT MIN(clusters.user_id)
FROM "clusters_applications_cert_managers"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_cert_managers"."cluster_id"
WHERE "clusters_applications_cert_managers"."status" IN (3,5)

Before: Time: 394.145 ms https://explain.depesz.com/s/gmrl

Filter: (clusters_applications_cert_managers.status = ANY ('{3,5}'::integer[]))

After:

MAX with no period

SELECT MAX(clusters.user_id)
FROM "clusters_applications_cert_managers"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_cert_managers"."cluster_id" WHERE "clusters_applications_cert_managers"."status" IN (3, 5)

Before: Time: 22.567 ms https://explain.depesz.com/s/jlZx

Filter: (clusters_applications_cert_managers.status = ANY ('{3,5}'::integer[]))

After:

COUNT with no period

SELECT COUNT(DISTINCT clusters.user_id)
FROM "clusters_applications_cert_managers"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_cert_managers"."cluster_id" WHERE "clusters_applications_cert_managers"."status" IN (3,5)
AND "clusters"."user_id" BETWEEN 0 AND 99999

Before: Time: 10.186 ms https://explain.depesz.com/s/FS3D

Filter: (clusters_applications_cert_managers.status = ANY ('{3,5}'::integer[]))

After:

MIN with period

SELECT MIN(clusters.user_id)
FROM "clusters_applications_cert_managers"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_cert_managers"."cluster_id" WHERE "clusters_applications_cert_managers"."created_at" BETWEEN '2020-03-06 10:40:26.923837' AND '2020-04-03 10:40:26.924025'
AND "clusters_applications_cert_managers"."status" IN (3,5)

Before: Time: 3.685 ms https://explain.depesz.com/s/CfZk

Filter: ((clusters_applications_cert_managers.created_at >= '2020-03-06 10:40:26.923837+00'::timestamp with time zone) AND (clusters_applications_cert_managers.created_at <= '2020-04-03 10:40:26.924025+00'::timestamp with time zone) AND (clusters_applications_cert_managers.status = ANY ('{3,5}'::integer[])))

After:

MAX with period

SELECT MAX(clusters.user_id)
FROM "clusters_applications_cert_managers"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_cert_managers"."cluster_id" WHERE "clusters_applications_cert_managers"."created_at" BETWEEN '2020-03-06 10:40:26.923837' AND '2020-04-03 10:40:26.924025'
AND "clusters_applications_cert_managers"."status" IN (3,5)

Before: Time: 2.556 ms https://explain.depesz.com/s/URYx

Filter: ((clusters_applications_cert_managers.created_at >= '2020-03-06 10:40:26.923837+00'::timestamp with time zone) AND (clusters_applications_cert_managers.created_at <= '2020-04-03 10:40:26.924025+00'::timestamp with time zone) AND (clusters_applications_cert_managers.status = ANY ('{3,5}'::integer[])))

After:

COUNT with period

SELECT COUNT(DISTINCT clusters.user_id)
FROM "clusters_applications_cert_managers"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_cert_managers"."cluster_id" WHERE "clusters_applications_cert_managers"."created_at" BETWEEN '2020-03-06 10:40:26.923837' AND '2020-04-03 10:40:26.924025'
AND "clusters_applications_cert_managers"."status" IN (3,5)
AND "clusters"."user_id" BETWEEN 0 AND 99999

Before: Time: 2.357 ms https://explain.depesz.com/s/sxzq

Filter: ((clusters_applications_cert_managers.created_at >= '2020-03-06 10:40:26.923837+00'::timestamp with time zone) AND (clusters_applications_cert_managers.created_at <= '2020-04-03 10:40:26.924025+00'::timestamp with time zone) AND (clusters_applications_cert_managers.status = ANY ('{3,5}'::integer[])))

After:

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Part of #212962 (closed) Relates #212962 (closed)

Edited by Alina Mihaila

Merge request reports