Create offline transfer export worker
## What does this MR do and why?
Creates offline export worker classes and supporting infrastructure for asynchronous processing of offline exports. Adds a new `Import::Offline::ExportWorker` that processes exports in the background, along with a `ProcessService` that handles descendant group/project discovery and export orchestration. This MR does not include the changes to direct transfer export components that would upload records directly to object storage.
### Key Changes
**New worker & service:**

- `Import::Offline::Exports::ProcessService` - Analogous to `BulkImports::ProcessService`, but for offline transfer exports. This service begins the export process by ensuring `bulk_import_export` records for all portables' `self` relations are created in the `:pending` state. It then uses the existing direct transfer architecture, `BulkImports::ExportService`, to export all relations for each portable.
- `Import::Offline::ExportWorker` - Sidekiq worker that processes offline exports asynchronously.
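The orchestration described above can be sketched in plain Ruby. This is a simplified illustration, not the actual GitLab implementation: `ExportRecord` and the in-memory record list are stand-ins for `bulk_import_export` rows, and `export` stands in for the call into `BulkImports::ExportService`.

```ruby
# Hypothetical stand-in for a bulk_import_export row.
ExportRecord = Struct.new(:portable, :relation, :status)

class ProcessServiceSketch
  def initialize(portables)
    @portables = portables
    @records = []
  end

  def execute
    ensure_pending_self_relations
    @records.each { |record| export(record) }
    # The real service would re-enqueue itself until every export completes.
    @records.all? { |r| r.status == :finished }
  end

  private

  # Analogous to inserting :pending rows with ON CONFLICT DO NOTHING:
  # existing records are left untouched, missing ones are created.
  def ensure_pending_self_relations
    @portables.each do |portable|
      next if @records.any? { |r| r.portable == portable && r.relation == 'self' }

      @records << ExportRecord.new(portable, 'self', :pending)
    end
  end

  # Stands in for BulkImports::ExportService exporting all of a
  # portable's relations and uploading the result.
  def export(record)
    record.status = :finished
  end
end

puts ProcessServiceSketch.new(%w[gitlab-org flightjs/Flight]).execute # prints "true"
```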
**Updates to existing services:**

- `Import::Offline::Exports::CreateService` now creates `self` relations for the portables provided by the user, to track which groups and projects are included in the export. Some minor refactors were made to other areas of the service for slightly cleaner code.
- `BulkImports::ExportService` was updated to return error messages for more efficient debugging.
**Model updates:**

- Added a `completed?` method to both the `BulkImports::Export` and `Import::Offline::Export` models; both are used in `Import::Offline::Exports::ProcessService`.
- Added a `for_offline_export_and_relation` scope to `BulkImports::Export` to query for `self` relations belonging to an offline export in `Import::Offline::Exports::ProcessService`.
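The model additions might look roughly like the following plain-Ruby sketch. The `completed?` semantics (finished or failed, i.e. no further processing will change the status) and the scope's filter are inferred from this description; the statuses and IDs below are illustrative only.

```ruby
# Stand-in for a bulk_import_exports row.
Export = Struct.new(:offline_export_id, :relation, :status) do
  # An export is complete once it has either finished or failed.
  def completed?
    [:finished, :failed].include?(status)
  end
end

# Plain-Ruby equivalent of the scope, which in ActiveRecord would be:
#   scope :for_offline_export_and_relation, ->(id, relation) {
#     where(offline_export_id: id, relation: relation)
#   }
def for_offline_export_and_relation(exports, offline_export_id, relation)
  exports.select do |e|
    e.offline_export_id == offline_export_id && e.relation == relation
  end
end

exports = [
  Export.new(118, 'self', :finished),
  Export.new(118, 'labels', :started),
  Export.new(119, 'self', :pending)
]

selves = for_offline_export_and_relation(exports, 118, 'self')
puts selves.size               # prints "1"
puts selves.all?(&:completed?) # prints "true"
```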
**Database:**

- Added an index on `bulk_import_exports (offline_export_id, relation)` so that `for_offline_export_and_relation` is efficient.
### References
- Related to #576092 (closed)
- Offline transfer ADR: https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/offline_direct_transfer_migrations/
- Upcoming related issues:
## How to set up and validate locally
- Enable the `offline_transfer_exports` feature flag.
- Switch to the `576092-create-offline-export-worker-classes` branch and run the db migration: `bin/rails db:migrate`
- Configure your GDK to use MinIO for object storage.
- Set the MinIO region to `gdk` and create a bucket.
- Trigger an Offline Export:
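Assuming the MinIO client `mc` is installed, the region and bucket step above might look like the following. The alias name is a placeholder; the credentials, endpoint, and bucket name mirror the curl example below.

```shell
# Register the local MinIO instance under a client alias (placeholder name).
mc alias set gdk-minio http://127.0.0.1:9000 MINIO_ACCESS_KEY MINIO_SECRET_KEY

# Create the export bucket in the `gdk` region.
mc mb --region gdk gdk-minio/offline-transfer-export
```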
```shell
# You can set whatever value you want for `source_hostname`. It's being
# removed in https://gitlab.com/gitlab-org/gitlab/-/merge_requests/221644
curl --request POST \
  --url "http://gdk.test:3000/api/v4/offline_exports" \
  --header "PRIVATE-TOKEN: $GITLAB_DEV_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "bucket": "offline-transfer-export",
    "source_hostname": "https://gdk.test:3000",
    "s3_compatible_configuration": {
      "aws_access_key_id": "MINIO_ACCESS_KEY",
      "aws_secret_access_key": "MINIO_SECRET_KEY",
      "region": "gdk",
      "path_style": true,
      "endpoint": "http://127.0.0.1:9000"
    },
    "entities": [
      { "full_path": "gitlab-org" },
      { "full_path": "flightjs/Flight" }
    ]
  }'
```
- Track the creation of `Import::Offline::Export` and `BulkImports::Export` records in the console.
- Track the `status` field of the created records as the export progresses.
- Verify that the exported files are uploaded to object storage. Note that until #585537 (closed) is complete, objects will be uploaded to `uploads/bulk_imports/export_upload/export_file/`.
## Database plans
**`::BulkImports::Export.insert_all(portable_self_relations)`**

```sql
INSERT INTO "bulk_import_exports" ("offline_export_id", "user_id", "relation", "group_id", "project_id", "created_at", "updated_at")
VALUES
    (117, 1, 'self', 217, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 218, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 219, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 220, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 221, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 222, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 72, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 73, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 74, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 71, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 75, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)
ON CONFLICT
    DO NOTHING
RETURNING
    "id"
```

```plaintext
Insert on bulk_import_exports  (cost=0.00..0.17 rows=11 width=163)
  Conflict Resolution: NOTHING
  ->  Values Scan on "**VALUES**"  (cost=0.00..0.17 rows=11 width=163)
```
**`self_relation_export.update!`**

```sql
UPDATE
    "bulk_import_exports"
SET
    "updated_at" = '2026-02-05 04:36:54.939200',
    "status" = -1,
    "error" = 'failedToExport'
WHERE
    "bulk_import_exports"."id" = 775
```

```plaintext
Update on bulk_import_exports  (cost=0.43..3.45 rows=0 width=0)
  ->  Index Scan using bulk_import_exports_pkey on bulk_import_exports  (cost=0.43..3.45 rows=1 width=48)
        Index Cond: (id = 775)
```
**`update!(has_failures: true)`**

```sql
UPDATE
    "import_offline_exports"
SET
    "updated_at" = '2026-02-06 09:37:25.349306',
    "has_failures" = TRUE
WHERE
    "import_offline_exports"."id" = 2
```

```plaintext
Update on import_offline_exports  (cost=0.00..0.00 rows=0 width=0)
  ->  Seq Scan on import_offline_exports  (cost=0.00..0.00 rows=1 width=15)
        Filter: (id = 2)
```
**`BulkImports::Export.for_offline_export_and_relation`**

```sql
SELECT
    "bulk_import_exports".*
FROM
    "bulk_import_exports"
WHERE
    "bulk_import_exports"."offline_export_id" = 118
    AND "bulk_import_exports"."relation" = 'self';
```

```plaintext
Index Scan using index_bulk_import_exports_on_offline_export_id_and_relation on bulk_import_exports  (cost=0.15..2.17 rows=1 width=136)
  Index Cond: (offline_export_id = 118)
(2 rows)
```
## MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Edited by Carla Drago