Create offline transfer export worker

What does this MR do and why?

Creates offline export worker classes and supporting infrastructure for asynchronous processing of offline exports. Adds a new Import::Offline::ExportWorker that processes exports in the background, along with a ProcessService that handles descendant group/project discovery and export orchestration. This MR does not include the changes to the direct transfer export components needed to upload records directly to object storage.

Key Changes

New worker & service:

  • Import::Offline::Exports::ProcessService - Analogous to BulkImports::ProcessService, but for offline transfer exports. This service begins the export process by ensuring that a bulk_import_export record for each portable's self relation is created in the :pending state. It then uses the existing direct transfer architecture, BulkImports::ExportService, to export all relations for each portable.
  • Import::Offline::ExportWorker - Sidekiq worker that processes offline exports asynchronously
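The service/worker interplay described above can be sketched in plain Ruby. Everything below is a stand-in for illustration: the real Import::Offline::Exports::ProcessService and Import::Offline::ExportWorker use ActiveRecord and Sidekiq, and the status handling here is simplified.

```ruby
# Stand-in for a bulk_import_export row tracking one portable's self relation.
SelfRelationExport = Struct.new(:portable, :status) do
  def pending?
    status == :pending
  end
end

# Simplified sketch of the orchestration: ensure a :pending self-relation
# export exists per portable, then reuse the direct transfer export path
# (stands in for BulkImports::ExportService) for each portable.
class ProcessService
  def initialize(portables)
    @portables = portables
  end

  def execute
    # 1. Create a :pending self-relation export record per portable.
    exports = @portables.map { |p| SelfRelationExport.new(p, :pending) }

    # 2. Export all relations for each portable, recording the outcome.
    exports.each do |export|
      export.status = run_export(export.portable) ? :finished : :failed
    end

    exports
  end

  private

  def run_export(_portable)
    true # pretend every relation exported successfully
  end
end

results = ProcessService.new(%i[group_a project_b]).execute
results.each { |e| puts "#{e.portable}: #{e.status}" }
```

In the real service, step 2 is asynchronous and re-entrant: the worker re-checks the status of each record rather than holding them all in memory.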

Updates to existing services:

  • Import::Offline::Exports::CreateService now creates self relations for the portables provided by the user, to track which groups and projects are included in the export. Minor refactors were also made elsewhere in the service for slightly cleaner code.
  • BulkImports::ExportService was updated to return error messages for more efficient debugging

Model updates:

  • Added a completed? method to both the BulkImports::Export and Import::Offline::Export models; both are used in Import::Offline::Exports::ProcessService
  • Added a for_offline_export_and_relation scope to BulkImports::Export to query the self relations belonging to an offline export in Import::Offline::Exports::ProcessService
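As a rough illustration of the completed? predicate, the sketch below uses a plain Ruby class; the real methods live on ActiveRecord models with status state machines, and the specific integer values are assumptions based on the query plans further down (where a failed export has status -1).

```ruby
# Illustrative export record; FINISHED/FAILED values are assumptions here.
class ExportRecord
  FINISHED = 1
  FAILED   = -1

  attr_reader :status

  def initialize(status)
    @status = status
  end

  # An export is "complete" once it reaches a terminal state,
  # whether it finished successfully or failed.
  def completed?
    [FINISHED, FAILED].include?(status)
  end
end

puts ExportRecord.new(ExportRecord::FINISHED).completed? # => true
puts ExportRecord.new(0).completed?                      # => false
```

Treating failure as "completed" lets ProcessService tell the difference between exports that still need work and exports that have reached a final state, successful or not.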

Database:

  • Added an index on bulk_import_exports(offline_export_id, relation) so that the for_offline_export_and_relation scope is efficient
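For reference, a migration adding that composite index would look roughly like the following. The index name matches the one visible in the query plan below; the migration class name and the concurrent-index helpers follow common GitLab migration conventions and are assumptions here, not the actual migration in this MR.

```ruby
# Hypothetical shape of the index migration (names are illustrative).
class AddOfflineExportRelationIndexToBulkImportExports < Gitlab::Database::Migration[2.2]
  disable_ddl_transaction!

  INDEX_NAME = 'index_bulk_import_exports_on_offline_export_id_and_relation'

  def up
    add_concurrent_index :bulk_import_exports, [:offline_export_id, :relation], name: INDEX_NAME
  end

  def down
    remove_concurrent_index_by_name :bulk_import_exports, INDEX_NAME
  end
end
```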

References

How to set up and validate locally

  1. Enable the offline_transfer_exports feature flag.
  2. Switch to the 576092-create-offline-export-worker-classes branch and run the db migration: bin/rails db:migrate
  3. Configure your GDK to use Minio for object storage.
  4. Set the MinIO region to gdk and create a bucket.
  5. Trigger an Offline Export.
# You can set whatever value you want for `source_hostname`. It's being
# removed in https://gitlab.com/gitlab-org/gitlab/-/merge_requests/221644
curl --request POST \
  --url "http://gdk.test:3000/api/v4/offline_exports" \
  --header "PRIVATE-TOKEN: $GITLAB_DEV_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "bucket": "offline-transfer-export",
    "source_hostname": "https://gdk.test:3000",
    "s3_compatible_configuration": {
      "aws_access_key_id": "MINIO_ACCESS_KEY",
      "aws_secret_access_key": "MINIO_SECRET_KEY",
      "region": "gdk",
      "path_style": true,
      "endpoint": "http://127.0.0.1:9000"
    },
    "entities": [
      {
        "full_path": "gitlab-org"
      },
      {
        "full_path": "flightjs/Flight"
      }
    ]
  }'
  6. Track the creation of Import::Offline::Export and BulkImports::Export records in the console.
  7. Track the status field of the created records as the export progresses.
  8. Verify that the exported files are uploaded to object storage. Note that until #585537 (closed) is complete, objects will be uploaded to uploads/bulk_imports/export_upload/export_file/
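The console-tracking steps above can be done with standard ActiveRecord queries in a Rails console (bin/rails console); the exact attribute names are per the models in this MR, so treat this as a sketch rather than verified console output:

```ruby
# Most recent offline export and its overall status.
offline_export = Import::Offline::Export.last
offline_export.status

# Self-relation exports created by ProcessService for this offline export,
# with their per-relation statuses as the export progresses.
BulkImports::Export.where(offline_export_id: offline_export.id).pluck(:relation, :status)
```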

Database plans

::BulkImports::Export.insert_all(portable_self_relations)
INSERT INTO "bulk_import_exports" ("offline_export_id", "user_id", "relation", "group_id", "project_id", "created_at", "updated_at")
VALUES
    (117, 1, 'self', 217, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 218, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 219, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 220, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 221, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', 222, NULL, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 72, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 73, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 74, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 71, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP),
    (117, 1, 'self', NULL, 75, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)
ON CONFLICT
    DO NOTHING
RETURNING
    "id"
Insert on bulk_import_exports  (cost=0.00..0.17 rows=11 width=163)  
  Conflict Resolution: NOTHING  
  ->  Values Scan on "**VALUES**"  (cost=0.00..0.17 rows=11 width=163)  
self_relation_export.update!
UPDATE
    "bulk_import_exports"
SET
    "updated_at" = '2026-02-05 04:36:54.939200',
    "status" = -1,
    "error" = 'failedToExport'
WHERE
    "bulk_import_exports"."id" = 775
Update on bulk_import_exports  (cost=0.43..3.45 rows=0 width=0)  
  ->  Index Scan using bulk_import_exports_pkey on bulk_import_exports  (cost=0.43..3.45 rows=1 width=48)  
        Index Cond: (id = 775) 
update!(has_failures: true)
UPDATE
    "import_offline_exports"
SET
    "updated_at" = '2026-02-06 09:37:25.349306',
    "has_failures" = TRUE
WHERE
    "import_offline_exports"."id" = 2
Update on import_offline_exports  (cost=0.00..0.00 rows=0 width=0)  
  ->  Seq Scan on import_offline_exports  (cost=0.00..0.00 rows=1 width=15)  
        Filter: (id = 2)  
BulkImports::Export.for_offline_export_and_relation
SELECT
    "bulk_import_exports".*
FROM
    "bulk_import_exports"
WHERE
    "bulk_import_exports"."offline_export_id" = 118
    AND "bulk_import_exports"."relation" = 'self';
 Index Scan using index_bulk_import_exports_on_offline_export_id_and_relation on bulk_import_exports  (cost=0.15..2.17 rows=1 width=136)
   Index Cond: (offline_export_id = 118)
(2 rows)

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #576092 (closed)

Edited by Carla Drago
