Skip to content

Clusters::Applications::Knative move to batch counting

What does this MR do?

Move ::Clusters::Applications::Knative.where(time_period).distinct_by_user, to batch distinct count

The purpose of the change

This changes are a performance improvement of count queries and a refactoring. We prefer to use batch counting and optimize indexes if is the case in order to have the queries under 1 sec and avoid the timeouts.

Proposed solution

  • Refactor the counter to use batch counting
  • Ensure that the resulted queries are < 1 sec in database-lab
  • No need for optimization
time_period = {}
clusters_applications_cert_managers: distinct_count(::Clusters::Applications::Knative.where(time_period).available.joins(:cluster), 'clusters.user_id'),
time_period = { created_at: 28.days.ago..Time.current }
clusters_applications_cert_managers: distinct_count(::Clusters::Applications::Knative.where(time_period).available.joins(:cluster), 'clusters.user_id'),

Queries

MIN with no period

SELECT MIN(clusters.user_id)
FROM "clusters_applications_knative"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_knative"."cluster_id"
WHERE "clusters_applications_knative"."status" IN (3,5)

Time: 18.268 ms https://explain.depesz.com/s/8PxH

Filter: (clusters_applications_knative.status = ANY ('{3,5}'::integer[]))

MAX with no period

SELECT MAX(clusters.user_id)
FROM "clusters_applications_knative"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_knative"."cluster_id"
WHERE "clusters_applications_knative"."status" IN (3,5)

Time: 2.076 ms https://explain.depesz.com/s/EoBT

Filter: (clusters_applications_knative.status = ANY ('{3,5}'::integer[]))

COUNT with no period

SELECT COUNT(DISTINCT clusters.user_id)
FROM "clusters_applications_knative"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_knative"."cluster_id"
WHERE "clusters_applications_knative"."status" IN (3,5)
  AND "clusters"."user_id" BETWEEN 0 AND 9999

Time: 1.693 ms https://explain.depesz.com/s/HOGn

Filter: (clusters_applications_knative.status = ANY ('{3,5}'::integer[]))

MIN with period

SELECT MIN(clusters.user_id)
FROM "clusters_applications_knative"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_knative"."cluster_id"
WHERE "clusters_applications_knative"."created_at" BETWEEN '2020-03-10 07:45:10.387244' AND '2020-04-07 07:45:10.387414'
  AND "clusters_applications_knative"."status" IN (3, 5)

Time: 0.695 ms https://explain.depesz.com/s/Wouf

Filter: ((clusters_applications_knative.created_at >= '2020-03-10 07:45:10.387244+00'::timestamp with time zone) AND (clusters_applications_knative.created_at <= '2020-04-07 07:45:10.387414+00'::timestamp with time zone) AND (clusters_applications_knative.status = ANY ('{3,5}'::integer[])))

MAX with period

SELECT MAX(clusters.user_id)
FROM "clusters_applications_knative"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_knative"."cluster_id"
WHERE "clusters_applications_knative"."created_at" BETWEEN '2020-03-10 07:45:10.387244' AND '2020-04-07 07:45:10.387414'
  AND "clusters_applications_knative"."status" IN (3, 5)

Time: 0.579 ms https://explain.depesz.com/s/qBBo

Filter: ((clusters_applications_knative.created_at >= '2020-03-10 07:45:10.387244+00'::timestamp with time zone) AND (clusters_applications_knative.created_at <= '2020-04-07 07:45:10.387414+00'::timestamp with time zone) AND (clusters_applications_knative.status = ANY ('{3,5}'::integer[])))

COUNT with period

SELECT COUNT(DISTINCT clusters.user_id)
FROM "clusters_applications_knative"
INNER JOIN "clusters" ON "clusters"."id" = "clusters_applications_knative"."cluster_id"
WHERE "clusters_applications_knative"."created_at" BETWEEN '2020-03-10 07:45:10.387244' AND '2020-04-07 07:45:10.387414'
  AND "clusters_applications_knative"."status" IN (3,5)
  AND "clusters"."user_id" BETWEEN 0 AND 9999

Time: 0.828 ms https://explain.depesz.com/s/L7l6M

Filter: ((clusters_applications_knative.created_at >= '2020-03-10 07:45:10.387244+00'::timestamp with time zone) AND (clusters_applications_knative.created_at <= '2020-04-07 07:45:10.387414+00'::timestamp with time zone) AND (clusters_applications_knative.status = ANY ('{3,5}'::integer[])))

Part of #212962 (closed)

Edited by Alina Mihaila

Merge request reports