Geo - Allow selective sync by orgs for dependency proxy manifests
What does this MR do and why?
Previously, Geo could sync either all dependency proxy manifests or only those in specific namespaces/groups. It now also supports filtering by organization, giving administrators more flexible control over which data is synced between sites.
References
- Related to #534194
How to set up and validate locally
Prerequisites
-
Set up Geo with GDK
- Follow the GDK Geo setup guide to configure a primary and secondary Geo instance
- Ensure both instances are running properly
-
Enable organization features
-
Run these Rails commands on your primary GDK instance in Rails console:
Feature.enable_percentage_of_time(:allow_organization_creation, 100)
Feature.enable_percentage_of_time(:organization_switching, 100)
Feature.enable_percentage_of_time(:ui_for_organizations, 100)
-
-
Create test organizations
-
Run these Rails commands to create test organizations with projects:
# Create first organization with owner
org1 = Organizations::Organization.create!(name: 'Test Org 1', path: 'test-org-1', visibility_level: Organizations::Organization::PUBLIC)
Organizations::OrganizationUser.create_organization_record_for(User.first.id, org1.id)
# Create second organization with owner
org2 = Organizations::Organization.create!(name: 'Test Org 2', path: 'test-org-2', visibility_level: Organizations::Organization::PUBLIC)
Organizations::OrganizationUser.create_organization_record_for(User.first.id, org2.id)
# Create a group in the first organization
group1 = Group.create!(name: 'Group 1', path: 'group-1', organization: org1)
group1.add_owner(User.first)
# Create 3 dependency proxy manifests in the first organization
3.times do
  manifest = DependencyProxy::Manifest.new(
    group: group1,
    size: 1234,
    digest: 'sha256:d0710affa17fad5f466a70159cc458227bd25d4afb39514ef662ead3e6c99515',
    file_name: "alpine:latest#{SecureRandom.hex(4)}.json",
    content_type: 'application/vnd.docker.distribution.manifest.v2+json',
    status: :default
  )
  manifest.file = Rack::Test::UploadedFile.new('spec/fixtures/dependency_proxy/manifest')
  manifest.save!
end
# Create a group in the second organization
group2 = Group.create!(name: 'Group 2', path: 'group-2', organization: org2)
group2.add_owner(User.first)
# Create 3 dependency proxy manifests in the second organization
3.times do
  manifest = DependencyProxy::Manifest.new(
    group: group2,
    size: 1234,
    digest: 'sha256:d0710affa17fad5f466a70159cc458227bd25d4afb39514ef662ead3e6c99515',
    file_name: "alpine:latest#{SecureRandom.hex(4)}.json",
    content_type: 'application/vnd.docker.distribution.manifest.v2+json',
    status: :default
  )
  manifest.file = Rack::Test::UploadedFile.new('spec/fixtures/dependency_proxy/manifest')
  manifest.save!
end
puts 'Created 2 organizations with 3 dependency proxy manifests each'
-
-
Create a personal access token
- Follow the personal access token documentation
- Make sure to select the api and admin_mode scopes
- Save the token for use in the API requests
Primary Site Selective Checksumming by Organizations - Testing Steps
-
In the primary GDK site:
gdk switch 534194-org-mover-implement-selective-sync-scope-for-dependencyproxy-manifest
-
In the secondary GDK site:
gdk switch 534194-org-mover-implement-selective-sync-scope-for-dependencyproxy-manifest
-
In the primary GDK site, open Rails console:
bin/rails c
-
Enable the FF:
Feature.enable(:org_mover_extend_selective_sync_to_primary_checksumming)
-
Enable the FF:
Feature.enable(:geo_selective_sync_by_organizations)
-
Get your current configuration:
# Get your personal access token
export PRIVATE_TOKEN="your_personal_access_token"
# List all Geo sites to get the site ID
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites" | jq
# Store the site ID of the primary node
export SITE_ID=1 # Replace with your primary site ID
# Output organization objects for their IDs
bin/rails runner "pp Organizations::Organization.all"
# Store an organization ID for testing
export ORG_ID=1003 # Replace with your organization ID
-
Enable selective checksumming by organization:
# Enable selective checksumming by organization and select the specific organization
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{ "selective_sync_type": "organizations", "selective_sync_organization_ids": ['$ORG_ID'] }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
-
Verify the configuration:
# Get the updated Geo site configuration and confirm that selective_sync_type is "organizations" and
# organization_ids contains your organization ID
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
-
Wait a few minutes and verify the primary site status:
# Get the updated site status and confirm that dependency_proxy_manifests_checksummed_count
# matches the number of dependency proxy manifests that belong to your organization ID
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq
-
Test with multiple organizations:
export ORG_ID2=1004 # Replace with another organization ID
# Update to include multiple organizations
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{ "selective_sync_type": "organizations", "selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2'] }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
# Wait a few minutes and verify the Geo site status
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq
-
Disable selective sync:
# Reset back to no selective sync
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{ "selective_sync_type": "" }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Wait a few minutes and verify the Geo site status
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq
# In the primary GDK site, disable the FF:
bin/rails runner "pp Feature.disable(:geo_selective_sync_by_organizations)"
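The request bodies used in these steps differ only in the list of organization IDs. A minimal Ruby sketch for building them with the standard `json` library, which avoids splicing shell variables into a quoted string by hand. The `selective_sync_payload` helper is hypothetical and not part of this MR; the field names are the ones used in the curl commands above.

```ruby
require 'json'

# Hypothetical helper: builds the JSON body for PUT /api/v4/geo_sites/:id.
# An empty list produces the "reset" body used to disable selective sync.
def selective_sync_payload(organization_ids)
  # Integer() raises on anything that is not a valid integer, so typos surface early.
  ids = Array(organization_ids).map { |id| Integer(id) }

  body =
    if ids.empty?
      { selective_sync_type: '' }
    else
      { selective_sync_type: 'organizations', selective_sync_organization_ids: ids }
    end

  JSON.generate(body)
end

puts selective_sync_payload([1003, 1004])
puts selective_sync_payload([])
```

The generated string can be passed to curl through `--data`, for example after writing it to a file or capturing it in a shell variable.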
Secondary Site Selective Sync by Organizations - Testing Steps
-
In the primary GDK site:
gdk switch 534194-org-mover-implement-selective-sync-scope-for-dependencyproxy-manifest
-
In the secondary GDK site:
gdk switch 534194-org-mover-implement-selective-sync-scope-for-dependencyproxy-manifest
-
In the primary GDK site, open Rails console:
bin/rails c
-
Enable the FF:
Feature.enable(:geo_selective_sync_by_organizations)
-
Get your current configuration:
# Get your personal access token
export PRIVATE_TOKEN="your_personal_access_token"
# List all Geo sites to get the site ID
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites" | jq
# Store the site ID of the secondary node
export SITE_ID=2 # Replace with your secondary site ID
# Output organization objects for their IDs
bin/rails runner "pp Organizations::Organization.all"
# Store an organization ID for testing
export ORG_ID=1003 # Replace with your organization ID
-
Enable selective sync by organization:
# Enable selective sync by organization and select the specific organization
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{ "selective_sync_type": "organizations", "selective_sync_organization_ids": ['$ORG_ID'] }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
-
Verify the configuration:
# Get the updated Geo site configuration and confirm that selective_sync_type is "organizations" and
# organization_ids contains your organization ID
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
-
Wait a few minutes and verify the secondary site status:
# Get the updated site status and confirm that dependency_proxy_manifests_checksummed_count
# matches the number of dependency proxy manifests that belong to your organization ID
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq
-
Test with multiple organizations:
export ORG_ID2=1004 # Replace with another organization ID
# Update to include multiple organizations
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{ "selective_sync_type": "organizations", "selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2'] }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
# Wait a few minutes and verify the Geo site status
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq
-
Disable selective sync:
# Reset back to no selective sync
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{ "selective_sync_type": "" }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Wait a few minutes and verify the Geo site status
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq
# In the primary GDK site, disable the FF:
bin/rails runner "pp Feature.disable(:org_mover_extend_selective_sync_to_primary_checksumming)"
bin/rails runner "pp Feature.disable(:geo_selective_sync_by_organizations)"
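The comparison between the status response and the expected manifest count can be scripted rather than checked by eye. A minimal Ruby sketch, assuming the status JSON exposes `dependency_proxy_manifests_checksummed_count` as described in the steps above. The `checksummed_count_matches?` helper and the sample payload are hypothetical; a real response contains many more fields.

```ruby
require 'json'

# Hypothetical helper: parses a Geo site status response and compares the
# checksummed manifest count with the number expected for the organization.
def checksummed_count_matches?(status_json, expected_count)
  status = JSON.parse(status_json)
  status['dependency_proxy_manifests_checksummed_count'] == expected_count
end

# Made-up sample standing in for the output of the /status curl command.
sample_status = '{"dependency_proxy_manifests_checksummed_count": 3}'

# Each test organization created in the prerequisites has 3 manifests.
puts checksummed_count_matches?(sample_status, 3)
```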
Database Queries
-
DependencyProxy::Manifest.replicables_for_current_secondary(1..10000)
-
Raw SQL
SELECT "dependency_proxy_manifests".*
FROM "dependency_proxy_manifests"
INNER JOIN "namespaces"
  ON "namespaces"."id" = "dependency_proxy_manifests"."group_id"
  AND "namespaces"."type" = 'Group'
WHERE "dependency_proxy_manifests"."file_store" = 1
  AND "namespaces"."id" IN (
    SELECT "namespaces"."id"
    FROM "namespaces"
    WHERE "namespaces"."organization_id" IN (
      SELECT "geo_node_organization_links"."organization_id"
      FROM "geo_node_organization_links"
      WHERE "geo_node_organization_links"."geo_node_id" = 2))
  AND "dependency_proxy_manifests"."id" BETWEEN 1 AND 10000;
-
Query Plan: https://explain.depesz.com/s/3uWP
-
-
DependencyProxy::Manifest.pluck_verifiable_ids_in_range(1..10000)
-
Raw SQL
SELECT "dependency_proxy_manifests"."id"
FROM "dependency_proxy_manifests"
INNER JOIN "namespaces"
  ON "namespaces"."id" = "dependency_proxy_manifests"."group_id"
  AND "namespaces"."type" = 'Group'
WHERE "dependency_proxy_manifests"."file_store" = 1
  AND "namespaces"."id" IN (
    SELECT "namespaces"."id"
    FROM "namespaces"
    WHERE "namespaces"."organization_id" IN (
      SELECT "geo_node_organization_links"."organization_id"
      FROM "geo_node_organization_links"
      WHERE "geo_node_organization_links"."geo_node_id" = 2))
  AND "dependency_proxy_manifests"."id" BETWEEN 1 AND 10000;
-
Query Plan: https://explain.depesz.com/s/VemB
-
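The nested subqueries above can be read as a three-step filter: organization IDs linked to the Geo site, then group namespaces in those organizations, then manifests in those groups. A minimal plain-Ruby sketch of that logic, using in-memory structs in place of the real tables. The struct names and the `manifests_for_site` helper are hypothetical and not part of this MR; the real scopes run as SQL.

```ruby
require 'set'

# Hypothetical in-memory stand-ins for the tables referenced in the SQL above.
OrgLink   = Struct.new(:geo_node_id, :organization_id)
Namespace = Struct.new(:id, :type, :organization_id)
Manifest  = Struct.new(:id, :group_id, :file_store)

# Mirrors the query: manifests with file_store = 1 whose group belongs to an
# organization linked to the given Geo node, restricted to an ID range.
def manifests_for_site(manifests, namespaces, links, geo_node_id:, id_range:)
  org_ids = links.select { |link| link.geo_node_id == geo_node_id }
                 .map(&:organization_id).to_set

  group_ids = namespaces.select { |ns| ns.type == 'Group' && org_ids.include?(ns.organization_id) }
                        .map(&:id).to_set

  manifests.select do |m|
    m.file_store == 1 && group_ids.include?(m.group_id) && id_range.cover?(m.id)
  end
end

links      = [OrgLink.new(2, 1003)]
namespaces = [Namespace.new(10, 'Group', 1003), Namespace.new(20, 'Group', 1004)]
manifests  = [Manifest.new(1, 10, 1), Manifest.new(2, 20, 1), Manifest.new(3, 10, 2)]

selected = manifests_for_site(manifests, namespaces, links, geo_node_id: 2, id_range: 1..10_000)
puts selected.map(&:id).inspect
```

In this sample only manifest 1 qualifies: manifest 2 belongs to an organization that is not linked to the site, and manifest 3 has a different `file_store` value.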
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.