Geo: Allow selective sync by organizations for Group Wikis
What does this MR do and why?
This code change implements support for organization-based selective synchronization in a geo-replication system. Previously, the system had a placeholder that returned no data when trying to sync by organizations. Now it properly retrieves all namespaces (data containers) that belong to selected organizations and includes their child namespaces. Additionally, the group wiki repository synchronization logic was updated to handle organization-based selection the same way it handles namespace-based selection, ensuring that wiki repositories are properly synced when organizations are selected for replication.
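The selection described above can be sketched in plain Ruby. This is only an illustration, not the MR's implementation: `NamespaceRow` and `namespace_ids_for_organizations` are hypothetical names, and the in-memory array stands in for the `namespaces` table.

```ruby
require "set"

# Hypothetical in-memory stand-in for a namespace row; not the GitLab model.
NamespaceRow = Struct.new(:id, :parent_id, :organization_id, keyword_init: true)

# Returns the sorted IDs of every namespace that belongs to one of the
# selected organizations, plus all descendants of those namespaces.
def namespace_ids_for_organizations(namespaces, organization_ids)
  selected = namespaces
    .select { |ns| organization_ids.include?(ns.organization_id) }
    .map(&:id)
    .to_set

  # Keep pulling in children of already-selected namespaces until a pass
  # adds nothing new.
  loop do
    added = namespaces.select do |ns|
      selected.include?(ns.parent_id) && !selected.include?(ns.id)
    end
    break if added.empty?

    added.each { |ns| selected << ns.id }
  end

  selected.to_a.sort
end
```

Selecting no organizations yields an empty list in this sketch, which matches the intent that nothing outside the chosen organizations is replicated.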
References
- Related to #534201 (closed)
How to set up and validate locally
Prerequisites
- Set up Geo with GDK
  - Follow the GDK Geo setup guide to configure a primary and secondary Geo instance
  - Ensure both instances are running properly
- Enable organization features
  - Run these Rails commands on your primary GDK instance in Rails console:

        Feature.enable_percentage_of_time(:allow_organization_creation, 100)
        Feature.enable_percentage_of_time(:organization_switching, 100)
        Feature.enable_percentage_of_time(:ui_for_organizations, 100)
- Create test organizations
  - Run these Rails commands to create test organizations with group wikis:

        # Create first organization with owner
        org1 = Organizations::Organization.create!(name: 'Test Org 1', path: 'test-org-1', visibility_level: Organizations::Organization::PUBLIC)
        Organizations::OrganizationUser.create_organization_record_for(User.first.id, org1.id)

        # Create second organization with owner
        org2 = Organizations::Organization.create!(name: 'Test Org 2', path: 'test-org-2', visibility_level: Organizations::Organization::PUBLIC)
        Organizations::OrganizationUser.create_organization_record_for(User.first.id, org2.id)

        # Create 3 group wikis in first organization
        3.times do |i|
          group = Group.create!(name: "Group #{i+1}", path: "org-1-group-#{i+1}", organization: org1)
          group.add_owner(User.first)
          group.create_wiki
        end

        # Create 3 group wikis in second organization
        3.times do |i|
          group = Group.create!(name: "Group #{i+1}", path: "org-2-group-#{i+1}", organization: org2)
          group.add_owner(User.first)
          group.create_wiki
        end

        puts 'Created 2 organizations with 3 group wiki repositories each'
- Create a personal access token
  - Follow the personal access token documentation
  - Make sure to select the api and admin_mode scopes
  - Save the token for use in the API requests
Primary Site Selective Checksumming by Organizations - Testing Steps
- In the primary GDK site:

      gdk switch 534201-org-mover-implement-selective-sync-scope-for-project-repository

- In the secondary GDK site:

      gdk switch 534201-org-mover-implement-selective-sync-scope-for-project-repository

- In the primary GDK site, open Rails console:

      bin/rails c

- Enable the FF:

      Feature.enable(:org_mover_extend_selective_sync_to_primary_checksumming)

- Enable the FF:

      Feature.enable(:geo_selective_sync_by_organizations)

- Get your current configuration:

      # Get your personal access token
      export PRIVATE_TOKEN="your_personal_access_token"

      # List all Geo sites to get the site ID
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites" | jq

      # Store the site ID of the primary node
      export SITE_ID=1 # Replace with your primary site ID

      # Output organization objects for their IDs
      bin/rails runner "pp Organizations::Organization.all"

      # Store an organization ID for testing
      export ORG_ID=1003 # Replace with your organization ID

- Enable selective checksumming by organization:

      # Enable selective checksumming by organization and select the specific organization
      curl --request PUT \
        --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        --header "Content-Type: application/json" \
        --data '{
          "selective_sync_type": "organizations",
          "selective_sync_organization_ids": ['$ORG_ID']
        }' \
        "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

- Verify the configuration:

      # Get the updated Geo site configuration and confirm that selective_sync_type is "organizations" and
      # organization_ids contains your organization ID
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq

- Wait a few minutes and verify the primary site status:

      # Get the updated site status and confirm that group_wiki_repositories_checksummed_count
      # matches the number of group wiki repositories that belong to your organization ID
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq

- Test with multiple organizations:

      export ORG_ID2=1004 # Replace with another organization ID

      # Update to include multiple organizations
      curl --request PUT \
        --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        --header "Content-Type: application/json" \
        --data '{
          "selective_sync_type": "organizations",
          "selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2']
        }' \
        "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

      # Verify the update
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq

      # Wait a few minutes and verify the Geo site status
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq

- Disable selective sync:

      # Reset back to no selective sync
      curl --request PUT \
        --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        --header "Content-Type: application/json" \
        --data '{
          "selective_sync_type": ""
        }' \
        "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

      # Verify the update
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

      # Wait a few minutes and verify the Geo site status
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq

      # In the primary GDK site, disable the FF:
      bin/rails runner "pp Feature.disable(:geo_selective_sync_by_organizations)"
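For anyone who prefers scripting the PUT request from Ruby rather than curl, the sketch below builds an equivalent request with the standard library. The helper name `build_selective_sync_request` and its keyword arguments are hypothetical; the endpoint path, headers, and JSON fields are the ones used in the curl steps above.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) a PUT request that enables selective sync by
# organizations for one Geo site. Helper name and arguments are illustrative.
def build_selective_sync_request(base_url:, site_id:, token:, organization_ids:)
  uri = URI.join(base_url, "/api/v4/geo_sites/#{site_id}")

  request = Net::HTTP::Put.new(uri)
  request["PRIVATE-TOKEN"] = token
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(
    selective_sync_type: "organizations",
    selective_sync_organization_ids: organization_ids
  )
  request
end
```

The returned object can be sent with `Net::HTTP.start(host, port) { |http| http.request(request) }`. Keeping the build step separate from the send step makes it easy to inspect the payload before touching the GDK instance.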
Secondary Site Selective Sync by Organizations - Testing Steps
- In the primary GDK site:

      gdk switch 534201-org-mover-implement-selective-sync-scope-for-project-repository

- In the secondary GDK site:

      gdk switch 534201-org-mover-implement-selective-sync-scope-for-project-repository

- In the primary GDK site, open Rails console:

      bin/rails c

- Enable the FF:

      Feature.enable(:geo_selective_sync_by_organizations)

- Get your current configuration:

      # Get your personal access token
      export PRIVATE_TOKEN="your_personal_access_token"

      # List all Geo sites to get the site ID
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites" | jq

      # Store the site ID of the secondary node
      export SITE_ID=2 # Replace with your secondary site ID

      # Output organization objects for their IDs
      bin/rails runner "pp Organizations::Organization.all"

      # Store an organization ID for testing
      export ORG_ID=1003 # Replace with your organization ID

- Enable selective sync by organization:

      # Enable selective sync by organization and select the specific organization
      curl --request PUT \
        --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        --header "Content-Type: application/json" \
        --data '{
          "selective_sync_type": "organizations",
          "selective_sync_organization_ids": ['$ORG_ID']
        }' \
        "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

- Verify the configuration:

      # Get the updated Geo site configuration and confirm that selective_sync_type is "organizations" and
      # organization_ids contains your organization ID
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq

- Wait a few minutes and verify the secondary site status:

      # Get the updated site status and confirm that group_wiki_repositories_checksummed_count
      # matches the number of group wiki repositories that belong to your organization ID
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq

- Test with multiple organizations:

      export ORG_ID2=1004 # Replace with another organization ID

      # Update to include multiple organizations
      curl --request PUT \
        --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        --header "Content-Type: application/json" \
        --data '{
          "selective_sync_type": "organizations",
          "selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2']
        }' \
        "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

      # Verify the update
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq

      # Wait a few minutes and verify the Geo site status
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq

- Disable selective sync:

      # Reset back to no selective sync
      curl --request PUT \
        --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
        --header "Content-Type: application/json" \
        --data '{
          "selective_sync_type": ""
        }' \
        "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

      # Verify the update
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

      # Wait a few minutes and verify the Geo site status
      curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID/status" | jq

      # In the primary GDK site, disable the FFs:
      bin/rails runner "pp Feature.disable(:org_mover_extend_selective_sync_to_primary_checksumming)"
      bin/rails runner "pp Feature.disable(:geo_selective_sync_by_organizations)"
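The manual "compare the count in the status output" check from the steps above could be automated along these lines. This is a rough sketch: `group_wikis_checksummed?` is a hypothetical helper, and it assumes the status endpoint returns a flat JSON object containing the `group_wiki_repositories_checksummed_count` field named in the steps.

```ruby
require "json"

# Returns true when the status payload reports the expected number of
# checksummed group wiki repositories. A missing field counts as a mismatch.
def group_wikis_checksummed?(status_json, expected_count)
  status = JSON.parse(status_json)
  status.fetch("group_wiki_repositories_checksummed_count", nil) == expected_count
end
```

With the test data from the prerequisites, the expected count would be 3 when one organization is selected and 6 when both are.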
Database Queries
- GroupWikiRepository.replicables_for_current_secondary(1..10000)
  - Raw SQL

        SELECT "group_wiki_repositories".*
        FROM "group_wiki_repositories"
        INNER JOIN "namespaces"
          ON "namespaces"."id" = "group_wiki_repositories"."group_id"
          AND "namespaces"."type" = 'Group'
        WHERE "group_wiki_repositories"."group_id" IN (
            SELECT "namespaces"."id"
            FROM "namespaces"
            WHERE "namespaces"."organization_id" IN (
              SELECT "geo_node_organization_links"."organization_id"
              FROM "geo_node_organization_links"
              WHERE "geo_node_organization_links"."geo_node_id" = 2))
          AND "group_wiki_repositories"."group_id" BETWEEN 1 AND 10000;

  - Query Plan: https://explain.depesz.com/s/bcsj
- GroupWikiRepository.pluck_verifiable_ids_in_range(1..10000)
  - Raw SQL

        SELECT "group_wiki_repositories"."group_id"
        FROM "group_wiki_repositories"
        INNER JOIN "namespaces"
          ON "namespaces"."id" = "group_wiki_repositories"."group_id"
          AND "namespaces"."type" = 'Group'
        WHERE "group_wiki_repositories"."group_id" IN (
            SELECT "namespaces"."id"
            FROM "namespaces"
            WHERE "namespaces"."organization_id" IN (
              SELECT "geo_node_organization_links"."organization_id"
              FROM "geo_node_organization_links"
              WHERE "geo_node_organization_links"."geo_node_id" = 2))
          AND "group_wiki_repositories"."group_id" BETWEEN 1 AND 10000;

  - Query Plan: https://explain.depesz.com/s/Xrlx
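To make the filtering in these queries easier to follow, here is a small in-memory Ruby sketch of the same conditions: join to group namespaces, restrict to the organizations linked to the Geo node, and keep only IDs inside the batch range. `WikiRepoRow`, `GroupRow`, and `verifiable_group_ids` are hypothetical names used only for illustration; the real scopes run as SQL.

```ruby
# Hypothetical in-memory rows standing in for the two tables in the query.
WikiRepoRow = Struct.new(:group_id, keyword_init: true)
GroupRow = Struct.new(:id, :type, :organization_id, keyword_init: true)

# Mirrors the WHERE clause: the repository's group must be a 'Group'
# namespace, belong to a linked organization, and fall inside the ID range.
def verifiable_group_ids(repos, namespaces, linked_organization_ids, range)
  groups_by_id = namespaces
    .select { |ns| ns.type == "Group" }
    .to_h { |ns| [ns.id, ns] }

  repos.map(&:group_id).select do |group_id|
    group = groups_by_id[group_id]
    group &&
      linked_organization_ids.include?(group.organization_id) &&
      range.cover?(group_id)
  end
end
```

Passing `1..10000` as the range and the organization IDs linked to the secondary corresponds to the batch shown in the raw SQL.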
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.