Exporting issues to CSV hits the database statement timeout
In gitlab-com/gl-infra/production#6722 (comment 918929222), I noticed that attempting to export issues to a CSV results in a database timeout:
D, [2022-04-20T21:44:46.025401 #258228] DEBUG -- : Issue Load (15119.1ms) /*application:console,db_config_name:main_replica*/ SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 ORDER BY "issues"."id" ASC LIMIT 1000
Traceback (most recent call last):
16: from (irb):24
15: from (irb):5:in `csv_data'
14: from lib/csv_builder.rb:39:in `render'
13: from lib/csv_builder.rb:42:in `block in render'
12: from lib/csv_builder.rb:104:in `write_csv'
11: from lib/csv_builder.rb:78:in `each'
10: from lib/gitlab/database/load_balancing/connection_proxy.rb:88:in `method_missing'
9: from lib/gitlab/database/load_balancing/connection_proxy.rb:118:in `write_using_load_balancer'
8: from lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'
7: from lib/gitlab/database/load_balancing/load_balancer.rb:179:in `retry_with_backoff'
6: from lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'
5: from lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `block in write_using_load_balancer'
4: from lib/gitlab/database/load_balancing/connection_proxy.rb:47:in `select_all'
3: from lib/gitlab/database/load_balancing/connection_proxy.rb:102:in `read_using_load_balancer'
2: from lib/gitlab/database/load_balancing/load_balancer.rb:55:in `read'
1: from lib/gitlab/database/load_balancing/connection_proxy.rb:103:in `block in read_using_load_balancer'
ActiveRecord::QueryCanceled (PG::QueryCanceled: ERROR: canceling statement due to statement timeout)
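It looks like find_each, which pages through the relation with ORDER BY id and LIMIT 1000, is killing performance here: the first batch query in the log above ran for about 15 seconds before it was cancelled by the statement timeout. We should use EachBatch instead: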
diff --git a/app/services/issuable/export_csv/base_service.rb b/app/services/issuable/export_csv/base_service.rb
index 49ff05935c9..38a065fe1c3 100644
--- a/app/services/issuable/export_csv/base_service.rb
+++ b/app/services/issuable/export_csv/base_service.rb
@@ -22,7 +22,7 @@ def csv_data
# rubocop: disable CodeReuse/ActiveRecord
def csv_builder
@csv_builder ||=
- CsvBuilder.new(issuables.preload(associations_to_preload), header_to_value_hash)
+ CsvBuilder.new(issuables, header_to_value_hash, associations_to_preload)
end
# rubocop: enable CodeReuse/ActiveRecord
diff --git a/lib/csv_builder.rb b/lib/csv_builder.rb
index f270f7984da..fa9ead865e4 100644
--- a/lib/csv_builder.rb
+++ b/lib/csv_builder.rb
@@ -27,11 +27,12 @@ class CsvBuilder
# The value method will be called once for each object in the collection, to
# determine the value for that row. It can either be the name of a method on
# the object, or a lamda to call passing in the object.
- def initialize(collection, header_to_value_hash)
+ def initialize(collection, header_to_value_hash, associations_to_preload)
@header_to_value_hash = header_to_value_hash
@collection = collection
@truncated = false
@rows_written = 0
+ @associations_to_preload = associations_to_preload
end
# Renders the csv to a string
@@ -75,7 +76,9 @@ def status
protected
def each(&block)
- @collection.find_each(&block) # rubocop: disable CodeReuse/ActiveRecord
+ @collection.each_batch(order_hint: :created_at) do |relation|
+ relation.preload(@associations_to_preload).each { |obj| yield obj }
+ end
end
private
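For anyone unfamiliar with the difference, here is a rough plain-Ruby sketch of the id-range batching idea behind the patch above. It is not GitLab's EachBatch implementation: ROWS and each_id_range_batch are hypothetical names invented for this illustration, and an in-memory array stands in for the issues table. The general idea, as I understand it, is that each iteration first looks up only the boundary id of the next batch and then loads the rows inside that id range, instead of asking the database for full rows with ORDER BY id and LIMIT.

```ruby
# Plain-Ruby illustration of id-range batching; no Rails or database needed.
# ROWS is hypothetical data standing in for the issues table.
ROWS = [3, 7, 8, 15, 21, 22, 40].map { |id| { id: id, title: "Issue #{id}" } }.freeze

# Yields arrays of rows, each covering a half-open id range [start_id, stop_id).
# Hypothetical helper written for this sketch only.
def each_id_range_batch(rows, of:)
  sorted = rows.sort_by { |row| row[:id] }
  start_id = sorted.first&.fetch(:id)

  while start_id
    # Boundary lookup: find the id that sits `of` rows past start_id.
    # In SQL this would be a narrow query that selects only the id column.
    remaining_ids = sorted.map { |row| row[:id] }.select { |id| id >= start_id }
    stop_id = remaining_ids[of]

    # Row fetch: filtered by the id range rather than by ORDER BY ... LIMIT.
    batch = sorted.select do |row|
      row[:id] >= start_id && (stop_id.nil? || row[:id] < stop_id)
    end

    yield batch
    start_id = stop_id # nil once the last batch has been yielded
  end
end

batches = []
each_id_range_batch(ROWS, of: 3) { |batch| batches << batch.map { |row| row[:id] } }
p batches
```

In the patch, the preload is applied to each yielded relation, so associations are loaded one batch at a time rather than for the whole collection up front.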