Geo backend: Add selective sync by org settings
What does this MR do and why?
Note: This builds on top of !203936 (merged), which extracted refactors out of this large MR.
Implements the backend changes needed to choose selective sync by organizations and choose which organizations to sync.
For now, if you enable the feature flag and use selective sync by organizations, nothing syncs.
Subsequent MRs will implement the scopes for each data type.
References
How to set up and validate locally
Prerequisites
-
Set up Geo with GDK
- Follow the GDK Geo setup guide to configure a primary and secondary Geo instance
- Ensure both instances are running properly
-
Enable organization features
- Run these Rails commands on your primary GDK instance in Rails console:
Feature.enable_percentage_of_time(:allow_organization_creation, 100)
Feature.enable_percentage_of_time(:organization_switching, 100)
Feature.enable_percentage_of_time(:ui_for_organizations, 100)
-
Create test organizations
- Run these Rails commands to create test organizations with projects:
# Create first organization with owner
org1 = Organizations::Organization.create!(name: 'Test Org 1', path: 'test-org-1', visibility_level: Organizations::Organization::PUBLIC)
Organizations::OrganizationUser.create_organization_record_for(User.first.id, org1.id)
# Create second organization with owner
org2 = Organizations::Organization.create!(name: 'Test Org 2', path: 'test-org-2', visibility_level: Organizations::Organization::PUBLIC)
Organizations::OrganizationUser.create_organization_record_for(User.first.id, org2.id)
# Create a group in the first organization
group1 = Group.create!(name: 'Group 1', path: 'group-1', organization: org1)
group1.add_owner(User.first)
# Create 3 projects in first organization
3.times do |i|
  Projects::CreateService.new(User.first, {
    name: "Project #{i+1}",
    path: "project-#{i+1}",
    description: "Test project #{i+1}",
    namespace_id: group1.id,
    organization_id: org1.id,
    visibility_level: Gitlab::VisibilityLevel.level_value('private'),
    initialize_with_readme: true
  }).execute
end
# Create a group in the second organization
group2 = Group.create!(name: 'Group 2', path: 'group-2', organization: org2)
group2.add_owner(User.first)
# Create 3 projects in second organization
3.times do |i|
  Projects::CreateService.new(User.first, {
    name: "Project #{i+4}",
    path: "project-#{i+4}",
    description: "Test project #{i+4}",
    namespace_id: group2.id,
    organization_id: org2.id,
    visibility_level: Gitlab::VisibilityLevel.level_value('private'),
    initialize_with_readme: true
  }).execute
end
puts 'Created 2 organizations with 3 projects each'
-
Create a personal access token
- Follow the personal access token documentation
- Make sure to select the api and admin_mode scopes
- Save the token for use in the API requests
Testing Steps
-
In the primary GDK site:
gdk switch mk/add-selective-sync-by-org-backend
-
In the secondary GDK site:
gdk switch mk/add-selective-sync-by-org-backend
-
In the primary GDK site, open Rails console:
bin/rails c
-
Enable the FF:
Feature.enable(:geo_selective_sync_by_organizations)
-
Get your current configuration:
# Get your personal access token
export PRIVATE_TOKEN="your_personal_access_token"
# List all Geo sites to get the site ID
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites"
# Store the site ID of the secondary node
export SITE_ID=2 # Replace with your secondary site ID
# Output organization objects for their IDs
bin/rails runner "pp Organizations::Organization.all"
# Store an organization ID for testing
export ORG_ID=1003 # Replace with your organization ID
-
Enable selective sync by organization:
# Enable selective sync by organization and select the specific organization
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": "organizations",
    "selective_sync_organization_ids": ['$ORG_ID']
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
-
Verify the configuration:
# Get the updated Geo site configuration
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Confirm that selective_sync_type is "organizations" and organization_ids contains your organization ID
# If you have `jq` installed, then pipe to it to format the output nicely
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
-
Test with multiple organizations:
export ORG_ID2=1004 # Replace with another organization ID
# Update to include multiple organizations
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": "organizations",
    "selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2']
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
-
Disable selective sync:
# Reset back to no selective sync
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": ""
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
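The curl calls above splice shell variables into a hand-written JSON string, which can produce malformed JSON when an ID is empty or mistyped. As an alternative, here is a minimal Ruby sketch of the same PUT request, assuming the endpoint and field names used in the steps. The helper names `selective_sync_payload` and `build_geo_site_request` are hypothetical and not part of GitLab.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds the JSON body for the PUT request.
# Integer() raises on non-numeric IDs instead of sending malformed JSON.
def selective_sync_payload(org_ids)
  JSON.generate({
    'selective_sync_type' => 'organizations',
    'selective_sync_organization_ids' => org_ids.map { |id| Integer(id) }
  })
end

# Hypothetical helper: prepares (but does not send) the PUT request.
def build_geo_site_request(base_url, site_id, token, org_ids)
  uri = URI.join(base_url, "/api/v4/geo_sites/#{Integer(site_id)}")
  request = Net::HTTP::Put.new(uri)
  request['PRIVATE-TOKEN'] = token
  request['Content-Type'] = 'application/json'
  request.body = selective_sync_payload(org_ids)
  request
end

# To send it to a local GDK primary (assumes the environment variables from the steps):
# request = build_geo_site_request('http://localhost:3000', ENV['SITE_ID'],
#                                  ENV['PRIVATE_TOKEN'], [ENV['ORG_ID']])
# response = Net::HTTP.start(request.uri.hostname, request.uri.port) { |http| http.request(request) }
# puts response.body
```

Building the request separately from sending it keeps the payload inspectable before anything reaches the primary site.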
Expected Results
- The API should correctly store and return the organization IDs
- You should be able to update selective_sync_type to organizations
- You should be able to add/remove organizations from the selective sync list
- You should be able to disable selective sync entirely
- Changing selective sync type or disabling it doesn't clear organization_ids. This happens with namespace_ids too. If we need this, it should be in a follow-up.
- This MR only implements the backend for selecting organizations. When selective sync by organization is enabled, after Geo::RegistryConsistencyWorker processes everything, all progress bars will say "Nothing to synchronize". The actual selective sync scopes will be implemented in subsequent MRs.
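The expectations above can be checked mechanically against an API response. This is a sketch, assuming the response carries the `selective_sync_type` and `selective_sync_organization_ids` fields used in the testing steps. `geo_site_mismatches` is a hypothetical helper, and the sample body is a hand-abbreviated response for illustration.

```ruby
require 'json'

# Hypothetical helper: returns a list of human-readable mismatches
# (empty when the response matches the expectations).
def geo_site_mismatches(response_body, expected_type:, expected_org_ids:)
  site = JSON.parse(response_body)
  actual_type = site['selective_sync_type']
  actual_ids = site.fetch('selective_sync_organization_ids', [])

  mismatches = []
  unless actual_type == expected_type
    mismatches << "selective_sync_type is #{actual_type.inspect}, expected #{expected_type.inspect}"
  end
  unless actual_ids.sort == expected_org_ids.sort
    mismatches << "selective_sync_organization_ids is #{actual_ids.inspect}, expected #{expected_org_ids.inspect}"
  end
  mismatches
end

# Abbreviated response after disabling selective sync:
# the type is empty, but the organization IDs remain stored.
disabled = '{"id":2,"selective_sync_type":"","selective_sync_organization_ids":[1003,1004]}'
puts geo_site_mismatches(disabled, expected_type: '', expected_org_ids: [1003, 1004]).inspect
# prints []
```

Returning a list rather than raising on the first difference shows every deviation in one run, which helps when checking several steps in sequence.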
Output of the above testing on my machine:
❯ curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites" | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1454 100 1454 0 0 13485 0 --:--:-- --:--:-- --:--:-- 13588
[
{
"id": 1,
"name": "gdk",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3000/",
"primary": true,
"enabled": true,
"current": true,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": null,
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [],
"minimum_reverification_interval": 90,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/1/edit",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/1",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/1/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/1/repair"
}
},
{
"id": 2,
"name": "gdk2",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3001/",
"primary": false,
"enabled": true,
"current": false,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": "",
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [],
"minimum_reverification_interval": 90,
"sync_object_storage": false,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
"web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
}
}
]
❯ # Store the site ID of the secondary node
export SITE_ID=2 # Replace with your secondary site ID
# Output organization objects for their IDs
bin/rails runner "pp Organizations::Organization.all"
# Store an organization ID for testing
export ORG_ID=1003 # Replace with your organization ID
[#<Organizations::Organization id:1 path:default>, #<Organizations::Organization id:1003 path:test-org-1>, #<Organizations::Organization id:1004 path:test-org-2>]
❯ # Store an organization ID for testing
❯ export ORG_ID=1003 # Replace with your organization ID
❯
❯ # Enable selective sync by organization and select the specific organization
curl --request PUT \
--header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"selective_sync_type": "organizations",
"selective_sync_organization_ids": ['$ORG_ID']
}' \
"http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 904 100 809 100 95 1947 228 --:--:-- --:--:-- --:--:-- 2178
{
"id": 2,
"name": "gdk2",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3001/",
"primary": false,
"enabled": true,
"current": false,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": "organizations",
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [
1003
],
"minimum_reverification_interval": 90,
"sync_object_storage": false,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
"web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
}
}
❯ # Confirm that selective_sync_type is "organizations" and organization_ids contains your organization ID
# If you have `jq` installed, then pipe to it to format the output nicely
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 809 100 809 0 0 6433 0 --:--:-- --:--:-- --:--:-- 6472
{
"id": 2,
"name": "gdk2",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3001/",
"primary": false,
"enabled": true,
"current": false,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": "organizations",
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [
1003
],
"minimum_reverification_interval": 90,
"sync_object_storage": false,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
"web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
}
}
❯ export ORG_ID2=1004 # Replace with another organization ID
# Update to include multiple organizations
curl --request PUT \
--header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"selective_sync_type": "organizations",
"selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2']
}' \
"http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
{"id":2,"name":"gdk2","url":"http://127.0.0.1:3001/","internal_url":"http://127.0.0.1:3001/","primary":false,"enabled":true,"current":false,"files_max_capacity":10,"repos_max_capacity":10,"verification_max_capacity":10,"container_repositories_max_capacity":2,"selective_sync_type":"organizations","selective_sync_shards":[],"selective_sync_namespace_ids":[],"selective_sync_organization_ids":[1003,1004],"minimum_reverification_interval":90,"sync_object_storage":false,"web_edit_url":"http://127.0.0.1:3000/admin/geo/sites/2/edit","web_geo_replication_details_url":"http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files","_links":{"self":"http://127.0.0.1:3000/api/v4/geo_sites/2","status":"http://127.0.0.1:3000/api/v4/geo_sites/2/status","repair":"http://127.0.0.1:3000/api/v4/geo_sites/2/repair"}} % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 814 100 814 0 0 8445 0 --:--:-- --:--:-- --:--:-- 8479
{
"id": 2,
"name": "gdk2",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3001/",
"primary": false,
"enabled": true,
"current": false,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": "organizations",
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [
1003,
1004
],
"minimum_reverification_interval": 90,
"sync_object_storage": false,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
"web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
}
}
❯ # Reset back to no selective sync
curl --request PUT \
--header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"selective_sync_type": ""
}' \
"http://localhost:3000/api/v4/geo_sites/$SITE_ID"
# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
{"id":2,"name":"gdk2","url":"http://127.0.0.1:3001/","internal_url":"http://127.0.0.1:3001/","primary":false,"enabled":true,"current":false,"files_max_capacity":10,"repos_max_capacity":10,"verification_max_capacity":10,"container_repositories_max_capacity":2,"selective_sync_type":"","selective_sync_shards":[],"selective_sync_namespace_ids":[],"selective_sync_organization_ids":[1003,1004],"minimum_reverification_interval":90,"sync_object_storage":false,"web_edit_url":"http://127.0.0.1:3000/admin/geo/sites/2/edit","web_geo_replication_details_url":"http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files","_links":{"self":"http://127.0.0.1:3000/api/v4/geo_sites/2","status":"http://127.0.0.1:3000/api/v4/geo_sites/2/status","repair":"http://127.0.0.1:3000/api/v4/geo_sites/2/repair"}} % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 801 100 801 0 0 15996 0 --:--:-- --:--:-- --:--:-- 16020
{
"id": 2,
"name": "gdk2",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3001/",
"primary": false,
"enabled": true,
"current": false,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": "",
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [
1003,
1004
],
"minimum_reverification_interval": 90,
"sync_object_storage": false,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
"web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
}
}
❯ curl --request PUT \
--header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"selective_sync_type": "organizations",
"selective_sync_organization_ids": ['$ORG_ID']
}' \
"http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 904 100 809 100 95 5306 623 --:--:-- --:--:-- --:--:-- 5908
{
"id": 2,
"name": "gdk2",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3001/",
"primary": false,
"enabled": true,
"current": false,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": "organizations",
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [
1003
],
"minimum_reverification_interval": 90,
"sync_object_storage": false,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
"web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
}
}
❯ curl --request PUT \
--header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
--header "Content-Type: application/json" \
--data '{
"selective_sync_type": "organizations",
"selective_sync_organization_ids": []
}' \
"http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 896 100 805 100 91 8953 1012 --:--:-- --:--:-- --:--:-- 10067
{
"id": 2,
"name": "gdk2",
"url": "http://127.0.0.1:3001/",
"internal_url": "http://127.0.0.1:3001/",
"primary": false,
"enabled": true,
"current": false,
"files_max_capacity": 10,
"repos_max_capacity": 10,
"verification_max_capacity": 10,
"container_repositories_max_capacity": 2,
"selective_sync_type": "organizations",
"selective_sync_shards": [],
"selective_sync_namespace_ids": [],
"selective_sync_organization_ids": [],
"minimum_reverification_interval": 90,
"sync_object_storage": false,
"web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
"web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
"_links": {
"self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
"status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
"repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
}
}
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.