Geo: Replicate BulkImports::ExportUpload uploads
What does this MR do and why?
This adds Geo replication and verification support for BulkImports::ExportUpload uploads (bulk_import_export_upload_uploads partition table), which keeps bulk import/export archive files synchronized across GitLab instances.
The changes create new database tables to track the synchronization and verification status of bulk import export files. This includes adding verification states to ensure files are properly copied between different GitLab servers, along with checksums to verify file integrity.
The update also adds new monitoring metrics so administrators can track how many BulkImports::ExportUpload uploads have been successfully synchronized, failed, or are pending verification across their GitLab instances. Additionally, it updates the API documentation and GraphQL schema to include these new statistics in Geo status reports.
This enhancement ensures that bulk import/export uploads are properly backed up and synchronized across multiple GitLab installations for disaster recovery purposes.
Generated using scripts/geo/generate-blob-replicator with manually shortened index names to fit PostgreSQL's 63-character identifier limit.
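With the default NAMEDATALEN of 64, PostgreSQL truncates identifiers longer than 63 bytes, which is why the generated index names had to be shortened. A minimal sketch of a length check follows. The helper name and both index names are hypothetical and are not the names used in this MR.

```ruby
# PostgreSQL keeps at most NAMEDATALEN - 1 bytes of an identifier (63 by default).
POSTGRES_MAX_IDENTIFIER_BYTES = 63

# Returns true when the identifier fits without being truncated.
# The limit is counted in bytes, not characters, hence bytesize.
def fits_postgres_identifier?(name)
  name.bytesize <= POSTGRES_MAX_IDENTIFIER_BYTES
end

# Hypothetical index names, for illustration only.
long_name  = "index_bulk_import_export_upload_upload_states_on_bulk_import_export_upload_upload_id"
short_name = "index_bulk_import_export_upload_upload_states_on_upload_id"

[long_name, short_name].each do |name|
  puts "#{name.bytesize} bytes, fits: #{fits_postgres_identifier?(name)}"
end
```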
References
- Closes #589906
- Parent epic: gitlab-org#20933
- Geo SSF docs: https://docs.gitlab.com/ee/development/geo/framework.html
How to set up and validate locally
Prerequisites
A working Geo GDK setup with both primary and secondary running. Follow the Geo development setup guide.
1. Run database migrations
rails db:migrate # on the primary
rails db:migrate:geo # on the secondary
2. Enable the feature flags on the primary
# In Rails console on the primary
Feature.enable(:geo_bulk_import_export_upload_upload_replication)
Feature.enable(:geo_bulk_import_export_upload_upload_force_primary_checksumming)
3. Create test data on the primary
Trigger a bulk import export via the API, or use the Rails console:
# In Rails console on the primary
user = User.admins.first
group = Group.first
BulkImports::ExportService.new(portable: group, user: user).execute
Verify the upload exists in the bulk_import_export_upload_uploads partition:
Geo::BulkImportExportUploadUpload.count
# Should be > 0
4. Verify checksumming on the primary
Wait for the verification worker to process, or trigger it manually:
# In Rails console on the primary
upload = Geo::BulkImportExportUploadUpload.first
upload.replicator.verify
upload.bulk_import_export_upload_upload_state.reload.verification_state
# Should be 2 (verification_succeeded)
5. Verify replication on the secondary
Once the upload is created on the primary, Geo will automatically replicate it to the secondary. Check the sync status in the secondary Rails console:
# In Rails console on the secondary
Geo::BulkImportExportUploadUploadRegistry.count
# Should be > 0
registry = Geo::BulkImportExportUploadUploadRegistry.last
registry.state
# Should be 2 (synced)
If the registry is empty or not yet synced, you can manually trigger sync:
# In Rails console on the secondary
Geo::BulkImportExportUploadUploadReplicator.new(model_record_id: Geo::BulkImportExportUploadUpload.first.id).sync
6. Verify verification on the secondary
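Verification on the secondary runs in the background after the sync finishes, so the state may not be final on the first check. The sketch below is a small polling helper that could be pasted into the Rails console. The name wait_until and its defaults are hypothetical and not part of GitLab.

```ruby
# Hypothetical console helper: calls the block repeatedly until it returns a
# truthy value or the timeout (in seconds) elapses.
# Returns the block's truthy value, or nil on timeout.
def wait_until(timeout: 60, interval: 2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep(interval)
  end
end
```

For example, wait_until { registry.reload.verification_state == 2 } would return true once the registry reaches verification_succeeded, or nil if that does not happen within the timeout.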
# In Rails console on the secondary
registry = Geo::BulkImportExportUploadUploadRegistry.last
registry.reload
registry.verification_state
# Should be 2 (verification_succeeded)
7. Test GraphQL API on the secondary
Note: You must be logged in as an admin user. Non-admin users will get null for Geo-related queries.
Note: When querying from the secondary's GraphQL explorer, add a custom header REQUEST_PATH with the value /api/v4/geo/node_proxy/{node_id}/graphql.
Open the GraphQL explorer on the secondary instance (http://<secondary-url>/-/graphql-explorer) and run:
query {
  geoNode {
    name
    primary
    bulkImportExportUploadUploadRegistries {
      nodes {
        id
        state
        verificationState
        bulkImportExportUploadUploadId
        lastSyncedAt
        verifiedAt
      }
    }
  }
}
Expected result: you should see registry entries with state: "SYNCED" and verificationState: "VERIFIED".
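If the response is saved to a file or fetched by a script, the expected result can be checked mechanically. The sketch below assumes the standard GraphQL response envelope with a top-level data key. The helper name and the sample body (including its id values) are hypothetical, and the expected state strings default to the ones given in this step.

```ruby
require "json"

# Returns the registry nodes that have not reached the expected states.
# Returns an empty array when the registries are absent from the response.
def unfinished_registries(response_body, state: "SYNCED", verification_state: "VERIFIED")
  nodes = JSON.parse(response_body)
              .dig("data", "geoNode", "bulkImportExportUploadUploadRegistries", "nodes") || []
  nodes.reject { |n| n["state"] == state && n["verificationState"] == verification_state }
end

# Hypothetical response body, for illustration only.
sample = <<~BODY
  { "data": { "geoNode": { "name": "secondary", "primary": false,
      "bulkImportExportUploadUploadRegistries": { "nodes": [
        { "id": "example-1", "state": "SYNCED", "verificationState": "VERIFIED" },
        { "id": "example-2", "state": "FAILED", "verificationState": "PENDING" }
      ] } } } }
BODY

unfinished_registries(sample).each do |node|
  puts "#{node['id']}: #{node['state']} / #{node['verificationState']}"
end
```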
8. Verify Geo Sites API
Check the Geo Sites API includes the new statistics:
curl --header "PRIVATE-TOKEN: <your-token>" "http://<primary-url>/api/v4/geo_sites/status"
Look for the new fields in the response:
- bulk_import_export_upload_uploads_count
- bulk_import_export_upload_uploads_checksummed_count
- bulk_import_export_upload_uploads_checksum_failed_count
- bulk_import_export_upload_uploads_synced_count
- bulk_import_export_upload_uploads_failed_count
- bulk_import_export_upload_uploads_registry_count
- bulk_import_export_upload_uploads_synced_in_percentage
- bulk_import_export_upload_uploads_verified_in_percentage
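Rather than scanning the response by eye, the field check can be scripted. The sketch below assumes the endpoint returns a JSON array with one status object per site. The helper name and the sample body are hypothetical, and only the field names come from this step.

```ruby
require "json"

# Field names taken from the step above.
EXPECTED_FIELDS = %w[
  bulk_import_export_upload_uploads_count
  bulk_import_export_upload_uploads_checksummed_count
  bulk_import_export_upload_uploads_checksum_failed_count
  bulk_import_export_upload_uploads_synced_count
  bulk_import_export_upload_uploads_failed_count
  bulk_import_export_upload_uploads_registry_count
  bulk_import_export_upload_uploads_synced_in_percentage
  bulk_import_export_upload_uploads_verified_in_percentage
].freeze

# Returns the expected fields that are absent from one site's status hash.
def missing_status_fields(status)
  EXPECTED_FIELDS.reject { |field| status.key?(field) }
end

# Hypothetical, truncated response body, for illustration only.
sample = '[{"bulk_import_export_upload_uploads_count": 3}]'

JSON.parse(sample).each_with_index do |status, index|
  missing = missing_status_fields(status)
  puts "site #{index}: #{missing.empty? ? 'all fields present' : "missing #{missing.join(', ')}"}"
end
```

The curl output from this step could be piped into a script like this by replacing the sample with $stdin.read.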
9. Verify Geo admin page
Visit /admin/geo/sites on the secondary and confirm that "Bulk Import Export Upload Upload" appears as a new data type with replication and verification progress.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.