Exporting issues to CSV hits database statement timeout

In gitlab-com/gl-infra/production#6722 (comment 918929222), I noticed that attempting to export issues to a CSV results in a database timeout:

D, [2022-04-20T21:44:46.025401 #258228] DEBUG -- :   Issue Load (15119.1ms)  /*application:console,db_config_name:main_replica*/ SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 278964 ORDER BY "issues"."id" ASC LIMIT 1000
Traceback (most recent call last):
       16: from (irb):24
       15: from (irb):5:in `csv_data'
       14: from lib/csv_builder.rb:39:in `render'
       13: from lib/csv_builder.rb:42:in `block in render'
       12: from lib/csv_builder.rb:104:in `write_csv'
       11: from lib/csv_builder.rb:78:in `each'
       10: from lib/gitlab/database/load_balancing/connection_proxy.rb:88:in `method_missing'
        9: from lib/gitlab/database/load_balancing/connection_proxy.rb:118:in `write_using_load_balancer'
        8: from lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'
        7: from lib/gitlab/database/load_balancing/load_balancer.rb:179:in `retry_with_backoff'
        6: from lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'
        5: from lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `block in write_using_load_balancer'
        4: from lib/gitlab/database/load_balancing/connection_proxy.rb:47:in `select_all'
        3: from lib/gitlab/database/load_balancing/connection_proxy.rb:102:in `read_using_load_balancer'
        2: from lib/gitlab/database/load_balancing/load_balancer.rb:55:in `read'
        1: from lib/gitlab/database/load_balancing/connection_proxy.rb:103:in `block in read_using_load_balancer'
ActiveRecord::QueryCanceled (PG::QueryCanceled: ERROR:  canceling statement due to statement timeout)

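It looks like find_each, which generates the ORDER BY "issues"."id" ASC LIMIT 1000 query above, is killing performance here: the first batch alone ran for about 15 seconds before hitting the statement timeout. We should use EachBatch instead, and preload associations per batch: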

diff --git a/app/services/issuable/export_csv/base_service.rb b/app/services/issuable/export_csv/base_service.rb
index 49ff05935c9..38a065fe1c3 100644
--- a/app/services/issuable/export_csv/base_service.rb
+++ b/app/services/issuable/export_csv/base_service.rb
@@ -22,7 +22,7 @@ def csv_data
       # rubocop: disable CodeReuse/ActiveRecord
       def csv_builder
         @csv_builder ||=
-          CsvBuilder.new(issuables.preload(associations_to_preload), header_to_value_hash)
+          CsvBuilder.new(issuables, header_to_value_hash, associations_to_preload)
       end
       # rubocop: enable CodeReuse/ActiveRecord
 
diff --git a/lib/csv_builder.rb b/lib/csv_builder.rb
index f270f7984da..fa9ead865e4 100644
--- a/lib/csv_builder.rb
+++ b/lib/csv_builder.rb
@@ -27,11 +27,12 @@ class CsvBuilder
   # The value method will be called once for each object in the collection, to
   # determine the value for that row. It can either be the name of a method on
   # the object, or a lamda to call passing in the object.
-  def initialize(collection, header_to_value_hash)
+  def initialize(collection, header_to_value_hash, associations_to_preload)
     @header_to_value_hash = header_to_value_hash
     @collection = collection
     @truncated = false
     @rows_written = 0
+    @associations_to_preload = associations_to_preload
   end
 
   # Renders the csv to a string
@@ -75,7 +76,9 @@ def status
   protected
 
   def each(&block)
-    @collection.find_each(&block) # rubocop: disable CodeReuse/ActiveRecord
+    @collection.each_batch(order_hint: :created_at) do |relation|
+      relation.preload(@associations_to_preload).each(&block)
+    end
   end
 
   private
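
To illustrate the idea behind the patch above, here is a minimal, self-contained sketch in plain Ruby. It is not GitLab's EachBatch implementation: the Issue struct, the LABELS lookup table, and the each_id_batch helper are all hypothetical stand-ins for the issues table, the preloaded associations, and the batching helper. It only shows the general shape, which is to walk the id column in bounded ranges and load related data once per batch rather than once for the whole relation.

```ruby
# Hypothetical in-memory stand-in for rows of the issues table (not GitLab code).
Issue = Struct.new(:id, :project_id, :title)

# Hypothetical association data keyed by issue id, standing in for
# whatever associations_to_preload would load.
LABELS = Hash.new { |hash, id| hash[id] = ["label-#{id}"] }

# Yields successive batches by walking the id column in ranges.
# Each batch is bounded by a start id and the id found `of` rows later,
# which is roughly how an EachBatch-style helper slices a table.
def each_id_batch(rows, of:)
  ids = rows.map(&:id).sort
  start_index = 0

  while start_index < ids.length
    start_id = ids[start_index]
    stop_id = ids[start_index + of] # nil when this is the last batch

    batch = rows.select do |row|
      row.id >= start_id && (stop_id.nil? || row.id < stop_id)
    end

    yield batch
    start_index += of
  end
end

issues = (1..10).map { |i| Issue.new(i * 3, 278964, "Issue #{i}") }

batches = []
csv_rows = []

each_id_batch(issues, of: 4) do |batch|
  batches << batch.map(&:id)

  # "Preload" step: one lookup per batch instead of one for the full set.
  labels_for_batch = batch.to_h { |issue| [issue.id, LABELS[issue.id]] }

  batch.each do |issue|
    csv_rows << [issue.id, issue.title, labels_for_batch[issue.id].join(";")]
  end
end

puts batches.inspect
puts csv_rows.length
```

With ten rows and a batch size of four, the block is called three times, and every row is written exactly once. In the real service the same structure would apply to an ActiveRecord relation, with the per-batch lookup replaced by preload on the yielded relation.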