Geo: Replicate Group Uploads
What does this MR do and why?
This code change adds support for tracking and replicating group file uploads in GitLab's Geo feature (which keeps multiple GitLab instances synchronized across different locations).
The changes create a new database table called "group_upload_states" that stores information about whether group uploads have been successfully copied and verified between different GitLab sites. This includes tracking when files were last checked, whether verification passed or failed, and retry counts for failed attempts.
The code also adds the necessary database migrations to create this new table with proper indexes for efficient querying, sets up foreign key relationships to link uploads with their parent groups, and includes sharding support for better performance in large deployments.
Additionally, it updates the GraphQL API to expose information about group upload replication status, adds new monitoring metrics so administrators can track how well group uploads are being synchronized, and updates documentation to reflect these new capabilities.
This enhancement extends GitLab's existing file replication system (which already handled project uploads) to also cover files uploaded at the group level, ensuring better data consistency and backup coverage across geographically distributed GitLab installations.
References
Related to #589910 (closed)
Regarding reviews and merge process for this series
This MR is one of many instances following !224245 (merged), which was produced by the same generator script. @dbalexandre has been improving the generator script with the MR feedback, and I expect he will continue to do so.
These are all behind a feature flag, so I propose that most release-blocking comments can be handled in a follow-up, which also addresses the generator script and any previous instances.
For more context, see !226569 (comment 3152345538).
How to set up and validate locally
Prerequisites
- A working Geo GDK setup with both primary and secondary running. Follow the Geo development setup guide.
1. Run database migrations
rails db:migrate # on the primary
rails db:migrate:geo # on the secondary
2. Enable the feature flags on the primary
# In Rails console on the primary
Feature.enable(:geo_group_upload_replication)
Feature.enable(:geo_group_upload_force_primary_checksumming)
3. Create test data on the primary
Upload a file to a group (e.g., attach an image to a group-level issue or epic description). Alternatively, use the Rails console:
# In Rails console on the primary
group = Group.first
file = CarrierWaveStringFile.new_file(
file_content: "Seeded upload file in group #{group.full_path}",
filename: 'seeded_upload.txt',
content_type: 'text/plain'
)
UploadService.new(group, file, NamespaceFileUploader).execute
Verify the upload exists in the namespace_uploads partition:
Geo::GroupUpload.count
# Should be > 0
4. Verify checksumming on the primary
Wait for the verification worker to process, or trigger it manually:
# In Rails console on the primary
upload = Geo::GroupUpload.first
upload.replicator.verify
upload.group_upload_state.reload.verification_state
# Should be 2 (verification_succeeded)
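If you would rather wait for the verification worker than trigger it manually, a small polling helper can be pasted into the console. This is a minimal sketch: `wait_until` is a hypothetical helper, not part of GitLab, and the timeout and interval values are arbitrary.

```ruby
# Hypothetical console helper (not a GitLab API): call the block repeatedly
# until it returns a truthy value or the timeout elapses.
def wait_until(timeout: 60, interval: 2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    # Give up and return nil once the deadline has passed.
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(interval)
  end
end

# Usage in the primary's Rails console (assumes the upload from step 3 exists):
#   wait_until(timeout: 120) do
#     Geo::GroupUpload.first.group_upload_state.reload.verification_state == 2
#   end
```

The helper returns `nil` on timeout, so a `nil` result means verification did not reach the expected state in time.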
5. Verify replication on the secondary
Once the upload is created on the primary, Geo will automatically replicate it to the secondary. Check the sync status in the secondary Rails console:
# In Rails console on the secondary
Geo::GroupUploadRegistry.count
# Should be > 0
registry = Geo::GroupUploadRegistry.last
registry.state
# Should be 2 (synced)
If the registry is empty or not yet synced, you can manually trigger sync:
# In Rails console on the secondary
Geo::GroupUploadReplicator.new(model_record_id: Geo::GroupUpload.first.id).sync
6. Verify verification on the secondary
# In Rails console on the secondary
registry = Geo::GroupUploadRegistry.last
registry.reload
registry.verification_state
# Should be 2 (verification_succeeded)
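When a check returns something other than 2, it helps to translate the integer back to a name. The sketch below does that with plain hashes. Only the value 2 (synced, verification_succeeded) is confirmed by the steps above. The other values are an assumption based on the usual Geo state enums and should be checked against the codebase, and `describe_state` is a hypothetical helper.

```ruby
# Assumed mappings: only 2 => :synced and 2 => :verification_succeeded
# are stated in this MR description; verify the rest against the code.
REGISTRY_STATES = {
  0 => :pending,
  1 => :started,
  2 => :synced,
  3 => :failed
}.freeze

VERIFICATION_STATES = {
  0 => :verification_pending,
  1 => :verification_started,
  2 => :verification_succeeded,
  3 => :verification_failed,
  4 => :verification_disabled
}.freeze

# Hypothetical helper: look up a state name, falling back to :unknown.
def describe_state(table, value)
  table.fetch(value) { :unknown }
end

# Usage in the secondary's Rails console:
#   describe_state(REGISTRY_STATES, registry.state)
#   describe_state(VERIFICATION_STATES, registry.verification_state)
```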
7. Test GraphQL API on the secondary
Note: You must be logged in as an admin user. Non-admin users will get null for Geo-related queries.
Note: When querying from the secondary's GraphQL explorer, add a custom header REQUEST_PATH with the value `/api/v4/geo/node_proxy/{node_id}/graphql`.
Open the GraphQL explorer on the secondary instance (http://<secondary-url>/-/graphql-explorer) and run:
query {
geoNode {
name
primary
groupUploadRegistries {
nodes {
id
state
verificationState
groupUploadId
lastSyncedAt
verifiedAt
}
}
}
}
Expected result: you should see registry entries with state: "SYNCED" and verificationState: "VERIFIED".
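With many registries, the expected result can be checked mechanically rather than by eye. The following is a minimal sketch that filters a parsed GraphQL response for entries that are not yet synced and verified. `incomplete_registries` is a hypothetical helper, and the sample payload is made up for illustration. It is not real output.

```ruby
require 'json'

# Hypothetical helper: return registry nodes that are not both
# SYNCED and VERIFIED, given the parsed JSON response of the query above.
def incomplete_registries(response)
  nodes = response.dig('data', 'geoNode', 'groupUploadRegistries', 'nodes') || []
  nodes.reject do |node|
    node['state'] == 'SYNCED' && node['verificationState'] == 'VERIFIED'
  end
end

# Made-up sample payload, shaped like the query's selection set.
sample = JSON.parse(<<~PAYLOAD)
  {
    "data": {
      "geoNode": {
        "groupUploadRegistries": {
          "nodes": [
            { "id": "example-1", "state": "SYNCED", "verificationState": "VERIFIED" },
            { "id": "example-2", "state": "PENDING", "verificationState": "PENDING" }
          ]
        }
      }
    }
  }
PAYLOAD

incomplete_registries(sample).each { |node| puts "not ready: #{node['id']}" }
```

An empty result means every returned registry matches the expected state.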
8. Verify Geo Sites API
Check the Geo Sites API includes the new group upload statistics:
curl --header "PRIVATE-TOKEN: <your-token>" "http://<primary-url>/api/v4/geo_sites/status"
Look for the new fields in the response:
- group_uploads_count
- group_uploads_checksummed_count
- group_uploads_checksum_failed_count
- group_uploads_synced_count
- group_uploads_failed_count
- group_uploads_registry_count
- group_uploads_synced_in_percentage
- group_uploads_verified_in_percentage
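To avoid scanning the response by hand, the presence of these fields can be checked with a few lines of Ruby. This is a sketch: it assumes the curl output has been saved to a file and parses to a JSON array of per-site status objects. `missing_group_upload_fields` and the file name are hypothetical.

```ruby
require 'json'

# Field names taken from the list above.
EXPECTED_GROUP_UPLOAD_FIELDS = %w[
  group_uploads_count
  group_uploads_checksummed_count
  group_uploads_checksum_failed_count
  group_uploads_synced_count
  group_uploads_failed_count
  group_uploads_registry_count
  group_uploads_synced_in_percentage
  group_uploads_verified_in_percentage
].freeze

# Hypothetical helper: list the expected fields absent from one status object.
def missing_group_upload_fields(status)
  EXPECTED_GROUP_UPLOAD_FIELDS.reject { |field| status.key?(field) }
end

# Usage, assuming the curl output was saved to status.json (hypothetical name):
#   statuses = JSON.parse(File.read('status.json'))
#   statuses.each { |status| p missing_group_upload_fields(status) }
```

An empty array for a site means all eight fields are present in its status.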
9. Verify Geo admin page
Visit /admin/geo/sites on the secondary and confirm that "Group Uploads" appears as a new data type with replication and verification progress.
Database Queries
- Selective Sync Disabled:
  Raw SQL:
    SELECT "namespace_uploads".*
    FROM "namespace_uploads"
    WHERE "namespace_uploads"."id" BETWEEN 1 AND 10000;
  Query Plan: https://explain.depesz.com/s/nH3V
- Selective Sync by Groups:
  Raw SQL:
    SELECT "namespace_uploads".*
    FROM "namespace_uploads"
    WHERE "namespace_uploads"."id" BETWEEN 1 AND 10000
      AND "namespace_uploads"."namespace_id" IN (
        WITH RECURSIVE "base_and_descendants" AS (
          (SELECT "geo_node_namespace_links"."namespace_id" AS id
           FROM "geo_node_namespace_links"
           WHERE "geo_node_namespace_links"."geo_node_id" = 2)
          UNION
          (SELECT "namespaces"."id"
           FROM "namespaces", "base_and_descendants"
           WHERE "namespaces"."parent_id" = "base_and_descendants"."id"))
        SELECT "id" FROM "base_and_descendants" AS "namespaces");
  Query Plan: https://explain.depesz.com/s/62pU
- Selective Sync by Organizations:
  Raw SQL:
    SELECT "namespace_uploads".*
    FROM "namespace_uploads"
    WHERE "namespace_uploads"."id" BETWEEN 1 AND 10000
      AND "namespace_uploads"."namespace_id" IN (
        SELECT "namespaces"."id"
        FROM "namespaces"
        WHERE "namespaces"."organization_id" IN (
          SELECT "organizations"."id"
          FROM "organizations"
          INNER JOIN "geo_node_organization_links"
            ON "organizations"."id" = "geo_node_organization_links"."organization_id"
          WHERE "geo_node_organization_links"."geo_node_id" = 2));
  Query Plan: https://explain.depesz.com/s/nyun
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.