Populate blocking issues count
What does this MR do?
Backfill blocking_issues_count for issues. This column was introduced to make it easier to order issues
by the number of blocks.
This is scheduling background migrations for batch of issues. We also introduce temporary indexes to help scheduling and execution. We need to remove them after execution.
We have ~8000 open issues that have outdated blocking issues count, which should give us ~80 jobs with 2 minutes interval between each one. It should take ~1:40 hours to run.
Example of queries to be ran on the first batch
Finds the first id of issue to be updated(Should run once)
SELECT "issues"."id" FROM "issues" WHERE (id IN(
SELECT issue_links.source_id AS blocking_issue_id
FROM issue_links
INNER JOIN issues ON issue_links.source_id = issues.id
WHERE issue_links.link_type = 1
AND issues.state_id = 1
AND issues.blocking_issues_count = 0
UNION
SELECT issue_links.target_id AS blocking_issue_id
FROM issue_links
INNER JOIN issues ON issue_links.target_id = issues.id
WHERE issue_links.link_type = 2
AND issues.state_id = 1
AND issues.blocking_issues_count = 0
)) ORDER BY "issues"."id" ASC LIMIT 1
Finds the last id of issue to be updated(should run for each batch)
SELECT "issues"."id" FROM "issues" WHERE (id IN(
SELECT source_id AS blocking_issue_id FROM issue_links WHERE link_type = 1
UNION
SELECT target_id AS blocking_issue_id FROM issue_links WHERE link_type = 2
)) AND "issues"."state_id" = 1 AND "issues"."blocking_issues_count" = 0 AND "issues"."id" >= 27481306 ORDER BY "issues"."id" ASC LIMIT 1 OFFSET 500
Finds max and min ID of issues to be updated(should run for each batch)
SELECT MIN(id), MAX(id) FROM "issues" WHERE (id IN(
SELECT issue_links.source_id AS blocking_issue_id
FROM issue_links
INNER JOIN issues ON issue_links.source_id = issues.id
WHERE issue_links.link_type = 1
AND issues.state_id = 1
AND issues.blocking_issues_count = 0
UNION
SELECT issue_links.target_id AS blocking_issue_id
FROM issue_links
INNER JOIN issues ON issue_links.target_id = issues.id
WHERE issue_links.link_type = 2
AND issues.state_id = 1
AND issues.blocking_issues_count = 0
)) AND "issues"."id" >= 3532901 AND "issues"."id" < 13559706
Query executed for each batch on background migration
UPDATE issues
SET blocking_issues_count = grouped_counts.count
FROM
(
SELECT blocking_issue_id, SUM(blocked_count) AS count
FROM (
SELECT COUNT(*) AS blocked_count, issue_links.source_id AS blocking_issue_id
FROM issue_links
INNER JOIN issues ON issue_links.source_id = issues.id
WHERE issue_links.link_type = 1
AND issues.state_id = 1
AND issues.blocking_issues_count = 0
AND issue_links.source_id BETWEEN 3532901 AND 13559646
GROUP BY blocking_issue_id HAVING COUNT(*) > 0
UNION ALL
SELECT COUNT(*) AS blocked_count, issue_links.target_id AS blocking_issue_id
FROM issue_links
INNER JOIN issues ON issue_links.target_id = issues.id
WHERE issue_links.link_type = 2
AND issues.state_id = 1
AND issues.blocking_issues_count = 0
AND issue_links.target_id BETWEEN 3532901 AND 13559646
GROUP BY blocking_issue_id HAVING COUNT(*) > 0
) blocking_counts
GROUP BY blocking_issue_id
) AS grouped_counts
WHERE issues.blocking_issues_count = 0 AND issues.state_id = 1 AND issues.id = grouped_counts.blocking_issue_
Migration output
== 20200912193210 AddSchedulingIssuesTempIndexes: migrating ===================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:issue_links, [:source_id], {:where=>"link_type = 1", :name=>"tmp_idx_blocking_type_links", :algorithm=>:concurrently})
-> 0.0034s
-- add_index(:issue_links, [:source_id], {:where=>"link_type = 1", :name=>"tmp_idx_blocking_type_links", :algorithm=>:concurrently})
-> 0.0032s
-- transaction_open?()
-> 0.0000s
-- index_exists?(:issue_links, [:target_id], {:where=>"link_type = 2", :name=>"tmp_idx_blocked_by_type_links", :algorithm=>:concurrently})
-> 0.0034s
-- add_index(:issue_links, [:target_id], {:where=>"link_type = 2", :name=>"tmp_idx_blocked_by_type_links", :algorithm=>:concurrently})
-> 0.0030s
-- transaction_open?()
-> 0.0000s
-- index_exists?(:issues, :id, {:where=>"(state_id = 1 AND blocking_issues_count = 0)", :name=>"tmp_idx_index_issues_with_outdate_blocking_count", :algorithm=>:concurrently})
-> 0.0144s
-- add_index(:issues, :id, {:where=>"(state_id = 1 AND blocking_issues_count = 0)", :name=>"tmp_idx_index_issues_with_outdate_blocking_count", :algorithm=>:concurrently})
-> 0.0035s
== 20200912193210 AddSchedulingIssuesTempIndexes: migrated (0.0335s) ==========
== 20200914185610 ScheduleSyncBlockingIssuesCount: migrating ==================
== 20200914185610 ScheduleSyncBlockingIssuesCount: migrated (0.0043s) =========
Related to #237977 (closed)
Edited by Felipe Cardozo