Skip to content

Adapt BatchCount for grouped records

Igor Drozdov requested to merge id-adapt-batch-count-for-grouped-records into master

What does this MR do?

If the relation is grouped, Rails returns Hash: { group-key => count }: https://apidock.com/rails/ActiveRecord/Calculations/count

Since batch_count expects Integer in order to sum the counts up, we receive an error. We can sum up the Hash's size in this case.

Example

Group by and count

Namespace.group(:type).count
   (7.6ms)  SELECT COUNT(*) AS count_all, "namespaces"."type" AS namespaces_type FROM "namespaces" GROUP BY "namespaces"."type"
=> {nil=>145, "Group"=>8}

Gitlab::UsageData.count(Namespace.group(:type))
   (1.6ms)  SELECT MIN("namespaces"."id") FROM "namespaces"
   (1.1ms)  SELECT MAX("namespaces"."id") FROM "namespaces"
   (0.6ms)  SELECT COUNT("namespaces"."id") AS count_id, "namespaces"."type" AS namespaces_type FROM "namespaces" WHERE "namespaces"."id" BETWEEN $1 AND $2 GROUP BY "namespaces"."type"  [["id", 1], ["id", 100000]]
=> {nil=>145, "Group"=>8}
[

Group by and count by a given column

[26] pry(main)> Project.group(:visibility_level).count(:creator_id)
   (1.6ms)  SELECT COUNT("projects"."creator_id") AS count_creator_id, "projects"."visibility_level" AS projects_visibility_level FROM "projects" GROUP BY "projects"."visibility_level"
=> {0=>3, 10=>4, 20=>11}

[29] pry(main)> Gitlab::UsageData.count(Project.group(:visibility_level), :creator_id)
   (0.7ms)  SELECT MIN("users"."id") FROM "users"
   (0.2ms)  SELECT MAX("users"."id") FROM "users"
   (0.3ms)  SELECT COUNT("projects"."creator_id") AS count_creator_id, "projects"."visibility_level" AS projects_visibility_level FROM "projects" WHERE "projects"."creator_id" BETWEEN $1 AND $2 GROUP BY "projects"."visibility_level"  [["creator_id", 1], ["creator_id", 100000]]
=> {0=>3, 10=>4, 20=>11}
[30] pry(main)> 

Group by and distinct count by a given column

[27] pry(main)> Project.distinct.group(:visibility_level).count(:creator_id)
   (1.6ms)  SELECT COUNT(DISTINCT "projects"."creator_id") AS count_creator_id, "projects"."visibility_level" AS projects_visibility_level FROM "projects" GROUP BY "projects"."visibility_level"
=> {0=>1, 10=>1, 20=>11}

[28] pry(main)> Gitlab::UsageData.distinct_count(Project.group(:visibility_level), :creator_id)
   (1.4ms)  SELECT MIN("users"."id") FROM "users"
   (1.2ms)  SELECT MAX("users"."id") FROM "users"
   (1.9ms)  SELECT COUNT(DISTINCT "projects"."creator_id") AS count_creator_id, "projects"."visibility_level" AS projects_visibility_level FROM "projects" WHERE "projects"."creator_id" BETWEEN $1 AND $2 GROUP BY "projects"."visibility_level"  [["creator_id", 1], ["creator_id", 10000]]
=> {0=>1, 10=>1, 20=>11}
Edited by Igor Drozdov

Merge request reports