BulkImports::PipelineWorker often fails with NetworkError 'execution expired'
Summary
While testing GitLab Migration (https://docs.gitlab.com/ee/user/group/import/) on .com by migrating this group to another group on .com, I noticed that although the migration succeeds, on each run some import pipelines (which are responsible for migrating the data over) fail with NetworkError 'execution expired'.
Example:
[#<BulkImports::Failure:0x00007f9f78a47bc0
id: 10149,
bulk_import_entity_id: 10381,
created_at: Tue, 26 Apr 2022 10:29:31.402612000 UTC +00:00,
pipeline_class: "BulkImports::Common::Pipelines::BadgesPipeline",
exception_class: "BulkImports::NetworkError",
exception_message: "execution expired",
correlation_id_value: "dce8fbb576e302200ab7bb277ebdb21c",
pipeline_step: "extractor">,
#<BulkImports::Failure:0x00007f9f78985cf0
id: 10148,
bulk_import_entity_id: 10387,
created_at: Tue, 26 Apr 2022 10:29:31.233230000 UTC +00:00,
pipeline_class: "BulkImports::Groups::Pipelines::SubgroupEntitiesPipeline",
exception_class: "BulkImports::NetworkError",
exception_message: "execution expired",
correlation_id_value: "ae940429bff2af5920028d5772034b6e",
pipeline_step: "extractor">,
#<BulkImports::Failure:0x00007f9f788f51a0
id: 10154,
bulk_import_entity_id: 10391,
created_at: Tue, 26 Apr 2022 10:29:43.562074000 UTC +00:00,
pipeline_class: "BulkImports::Common::Pipelines::MembersPipeline",
exception_class: "BulkImports::NetworkError",
exception_message: "execution expired",
correlation_id_value: "ae940429bff2af5920028d5772034b6e",
pipeline_step: "extractor">,
#<BulkImports::Failure:0x00007f9f788d6e80
id: 10155,
bulk_import_entity_id: 10392,
created_at: Tue, 26 Apr 2022 10:29:45.622423000 UTC +00:00,
pipeline_class: "BulkImports::Projects::Pipelines::CiPipelinesPipeline",
exception_class: "BulkImports::NetworkError",
exception_message: "execution expired",
correlation_id_value: "ae940429bff2af5920028d5772034b6e",
pipeline_step: "extractor">,
#<BulkImports::Failure:0x00007f9f78845250
id: 10159,
bulk_import_entity_id: 10397,
created_at: Tue, 26 Apr 2022 10:30:16.262376000 UTC +00:00,
pipeline_class:
"BulkImports::Projects::Pipelines::ContainerExpirationPolicyPipeline",
exception_class: "BulkImports::NetworkError",
exception_message: "execution expired",
correlation_id_value: "dd2c90bb4b9deb5e233a0865a6c523bf",
pipeline_step: "extractor">]
Because of these network errors, some of the data is not carried over.
Example Group to investigate
https://gitlab.com/georgekoltsov-bulk-import-group
What is the current bug behavior?
Import pipelines fail with a NetworkError 'execution expired' while GitLab Migration is running.
What is the expected correct behavior?
NetworkError 'execution expired' failures shouldn't be this frequent or this consistent. Some are inevitable, but not this many. They happen on every run.
Relevant logs and/or screenshots
Output of checks
This bug happens on GitLab.com
Possible fixes
- Add retry logic for failing pipelines to increase the pipeline success rate. We already have retry logic present, but it does not look like it kicks in for these failures.
- Understand why 'execution expired' is happening. Is it due to .com request queue latency? What is the current network request timeout value? Should it be increased?
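To illustrate the first option, here is a minimal sketch of retrying a timed-out step with exponential backoff. It is not the existing BulkImports retry code: `SketchNetworkError`, `with_retries`, and the `max_attempts`, `base_delay`, and `sleeper` parameters are all hypothetical names made up for this example, and a real fix would more likely re-enqueue the pipeline worker than sleep in-process.

```ruby
# Hypothetical stand-in for BulkImports::NetworkError; not the real GitLab class.
class SketchNetworkError < StandardError; end

# Runs the given block, retrying when it raises SketchNetworkError.
# max_attempts, base_delay and sleeper are illustrative parameters only.
# sleeper is injectable so the backoff can be observed without real waiting.
def with_retries(max_attempts: 3, base_delay: 1.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue SketchNetworkError
    # Give up and surface the error once the attempt budget is used.
    raise if attempt >= max_attempts

    # The delay doubles on each retry: base_delay, 2 * base_delay, ...
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage: a fake extractor step that times out twice, then succeeds.
recorded_delays = []
outcome = with_retries(max_attempts: 3, base_delay: 0.5, sleeper: ->(s) { recorded_delays << s }) do |attempt|
  raise SketchNetworkError, "execution expired" if attempt < 3

  "extracted on attempt #{attempt}"
end
```

If something along these lines is already in place, the open question from the first bullet remains why it is not triggered for the extractor step in the failures listed above.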
