Skip to content

N+1 SQL query and significant performance regression in ExportCsvService

I attempted to export all issues to a CSV in a console via #294504, and I noticed this repeated query:

D, [2021-02-16T05:05:04.356127 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:04.958176 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:05.514743 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:06.114256 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:06.668595 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:07.271299 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:07.824376 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:08.424815 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:08.982852 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:09.587501 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:10.142793 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:10.739471 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)
D, [2021-02-16T05:05:11.294756 #15891] DEBUG -- :   CACHE  (0.0ms)  SELECT "issues"."id", labels.title FROM "issues" LEFT OUTER JOIN "label_links" ON "label_links"."target_type" = 'Issue' AND "label_links"."target_id" = "issues"."id" LEFT OUTER JOIN "labels" ON "labels"."id" = "label_links"."label_id" WHERE "issues"."id" IN (SELECT "issues"."id" FROM "issues" WHERE "issues"."project_id" = 278964)

I think this started with !50449 (merged):

    def issue_labels(issue)
-     @labels[issue.id].sort.join(',').presence
+     issuables.labels_hash[issue.id].sort.join(',').presence

Since issuables is a relation, we will constantly have to load this query over again.

Is there a reason we got rid of memoizing issues.labels_hash?

/cc: @egrieff