BulkImports not retrying batches when all of them fail
Summary
When using bulk_import's batching feature, if all batches fail at roughly the same time (for example, because of a connection issue with the source), the import does not retry the batches.
Steps to reproduce
To mimic all the batches failing, we can use pipeline schedules:
- Create 7 pipeline schedules (an arbitrary number)
- Change `BATCH_SIZE` to 3
- Make the runner fail with a retryable error to simulate losing the connection to the source (see the diff below)
```diff
diff --git a/app/services/bulk_imports/batched_relation_export_service.rb b/app/services/bulk_imports/batched_relation_export_service.rb
index 778510f2e358..a8761df0071a 100644
--- a/app/services/bulk_imports/batched_relation_export_service.rb
+++ b/app/services/bulk_imports/batched_relation_export_service.rb
@@ -4,7 +4,7 @@ module BulkImports
   class BatchedRelationExportService
     include Gitlab::Utils::StrongMemoize
 
-    BATCH_SIZE = 1000
+    BATCH_SIZE = 3
     BATCH_CACHE_KEY = 'bulk_imports/batched_relation_export/%{export_id}/%{batch_id}'
     CACHE_DURATION = 4.hours
diff --git a/lib/bulk_imports/pipeline/extracted_data.rb b/lib/bulk_imports/pipeline/extracted_data.rb
index 0b36c0682981..4e0ea3fd3b40 100644
--- a/lib/bulk_imports/pipeline/extracted_data.rb
+++ b/lib/bulk_imports/pipeline/extracted_data.rb
@@ -24,6 +24,10 @@ def next_page
       def each(&block)
         data.each(&block)
       end
+
+      def each_with_index(&block)
+        data.each_with_index(&block)
+      end
     end
   end
 end
diff --git a/lib/bulk_imports/pipeline/runner.rb b/lib/bulk_imports/pipeline/runner.rb
index 1e2d91520473..d108d9350e25 100644
--- a/lib/bulk_imports/pipeline/runner.rb
+++ b/lib/bulk_imports/pipeline/runner.rb
@@ -16,6 +16,10 @@ def run
         if extracted_data
           extracted_data.each_with_index do |entry, index|
+            if index == 1 && context.tracker.relation == "BulkImports::Projects::Pipelines::PipelineSchedulesPipeline"
+              raise BulkImports::RetryPipelineError.new("oh no", 1.second)
+            end
+
             transformers.each do |transformer|
               entry = run_pipeline_step(:transformer, transformer.class.name) do
                 transformer.transform(context, entry)
```
- Trigger a bulk_import direct transfer (see the API sketch below)
Example Project
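For completeness, one way to trigger the direct transfer is through the bulk imports REST API (`POST /bulk_imports`). The sketch below is illustrative only; the hosts, tokens, and source/destination paths are placeholders, not values from this issue:

```ruby
# Illustrative only: hosts, tokens, and paths are placeholders.
require 'net/http'
require 'json'
require 'uri'

uri = URI('https://destination.gitlab.example/api/v4/bulk_imports')

request = Net::HTTP::Post.new(uri)
request['PRIVATE-TOKEN'] = '<destination personal access token>'
request['Content-Type'] = 'application/json'
request.body = {
  configuration: {
    url: 'https://source.gitlab.example',            # source instance
    access_token: '<source personal access token>'
  },
  entities: [
    {
      source_type: 'project_entity',
      source_full_path: 'some-group/example-project',
      destination_slug: 'example-project',
      destination_namespace: 'some-group'
    }
  ]
}.to_json

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
puts response.body # the returned id can be used to poll GET /bulk_imports/:id/entities
```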
What is the current bug behavior?
The import progresses to the next stage, and whichever pipeline the batches belong to is marked as "success". No duplicated pipelineSchedules are created.
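To confirm what was observed, something like the following can be run in a Rails console on the destination. This is a rough sketch; the model and association names (`BulkImports::Entity`, `trackers`, `batches`, `status_name`) are assumptions about the current schema rather than verified API:

```ruby
# Rough console sketch; model/association names are assumptions, adjust as needed.
entity  = BulkImports::Entity.last
tracker = entity.trackers.find_by(relation: 'BulkImports::Projects::Pipelines::PipelineSchedulesPipeline')

tracker.status_name                # reported as finished even though every batch failed
tracker.batches.map(&:status_name) # batch statuses, assuming a `batches` association

# The imported project ends up with no (and therefore no duplicated) pipeline schedules:
Project.find_by_full_path('some-group/example-project')&.pipeline_schedules&.count
```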
What is the expected correct behavior?
Even if all of the batches fail, the import should retry them instead of progressing to the next stage. Retrying the batches currently creates duplicate pipelineSchedules.
Note: the duplication will be fixed when the FF bulk_import_idempotent_workers
is enabled, which removes the duplicates.
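For reference, the flag can be toggled from a Rails console on the destination instance using the standard feature-flag helpers (flag name taken from the note above):

```ruby
# In `gitlab-rails console` on the destination instance:
Feature.enable(:bulk_import_idempotent_workers)
Feature.enabled?(:bulk_import_idempotent_workers) # => true once enabled
```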
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
Expand for output related to GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:env:info`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)
Results of GitLab application Check
Expand for output related to the GitLab application check
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:check SANITIZE=true`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`) (we will only investigate if the tests are passing)
Possible fixes
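One possible direction is sketched below. This is not a tested patch; the scope, association, and worker names (`with_status`, `batches`, `BulkImports::PipelineBatchWorker`, `finish!`) are assumptions used only to illustrate the idea of re-enqueuing failed batches instead of advancing the stage while every batch is waiting on a retry:

```ruby
# Illustrative sketch only, not a tested patch. The idea: when a batched
# pipeline reports completion, re-enqueue failed batches instead of letting
# the entity advance to the next stage.
def finalize_batched_pipeline(tracker)
  in_progress = tracker.batches.with_status(:created, :started) # names assumed
  failed      = tracker.batches.with_status(:failed)

  return if in_progress.any? # some batches are still running; check again later

  if failed.any?
    # Re-run the failed batches (hypothetical worker name) so a transient
    # connection issue with the source does not silently skip the relation.
    failed.each { |batch| BulkImports::PipelineBatchWorker.perform_async(batch.id) }
  else
    tracker.finish!
  end
end
```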