Geo backend: Add selective sync by org settings

What does this MR do and why?

Note: This builds on top of !203936 (merged), which extracted refactors out of this large MR.

Implements the backend changes needed to set the selective sync type to organizations and to choose which organizations to sync.

For now, if you enable the feature flag (geo_selective_sync_by_organizations) and use selective sync by organizations, nothing will sync.

Subsequent MRs will implement the scopes for each data type.

References

#514251 (closed)

How to set up and validate locally

Prerequisites

  1. Set up Geo with GDK

    • Follow the GDK Geo setup guide to configure a primary and secondary Geo instance
    • Ensure both instances are running properly
  2. Enable organization features

    • Run these commands in the Rails console on your primary GDK instance:
    Feature.enable_percentage_of_time(:allow_organization_creation, 100)
    Feature.enable_percentage_of_time(:organization_switching, 100)
    Feature.enable_percentage_of_time(:ui_for_organizations, 100)
  3. Create test organizations

    • Run these Rails commands to create test organizations with projects:
# Create first organization with owner
org1 = Organizations::Organization.create!(name: 'Test Org 1', path: 'test-org-1', visibility_level: Organizations::Organization::PUBLIC)
Organizations::OrganizationUser.create_organization_record_for(User.first.id, org1.id)

# Create second organization with owner
org2 = Organizations::Organization.create!(name: 'Test Org 2', path: 'test-org-2', visibility_level: Organizations::Organization::PUBLIC)
Organizations::OrganizationUser.create_organization_record_for(User.first.id, org2.id)

# Create a group in the first organization
group1 = Group.create!(name: 'Group 1', path: 'group-1', organization: org1)
group1.add_owner(User.first)

# Create 3 projects in the first organization
3.times do |i|
  Projects::CreateService.new(User.first, {
    name: "Project #{i+1}",
    path: "project-#{i+1}",
    description: "Test project #{i+1}",
    namespace_id: group1.id,
    organization_id: org1.id,
    visibility_level: Gitlab::VisibilityLevel.level_value('private'),
    initialize_with_readme: true
  }).execute
end

# Create a group in the second organization
group2 = Group.create!(name: 'Group 2', path: 'group-2', organization: org2)
group2.add_owner(User.first)

# Create 3 projects in the second organization
3.times do |i|
  Projects::CreateService.new(User.first, {
    name: "Project #{i+4}",
    path: "project-#{i+4}",
    description: "Test project #{i+4}",
    namespace_id: group2.id,
    organization_id: org2.id,
    visibility_level: Gitlab::VisibilityLevel.level_value('private'),
    initialize_with_readme: true
  }).execute
end

puts 'Created 2 organizations with 3 projects each'
  4. Create a personal access token

Testing Steps

  1. In the primary GDK site: gdk switch mk/add-selective-sync-by-org-backend

  2. In the secondary GDK site: gdk switch mk/add-selective-sync-by-org-backend

  3. In the primary GDK site, open Rails console: bin/rails c

  4. Enable the feature flag: Feature.enable(:geo_selective_sync_by_organizations)

  5. Get your current configuration:

    # Set your personal access token
    export PRIVATE_TOKEN="your_personal_access_token"
    
    # List all Geo sites to get the site ID
    curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites"
    
    # Store the site ID of the secondary site
    export SITE_ID=2  # Replace with your secondary site ID
    
    # Output organization objects for their IDs
    bin/rails runner "pp Organizations::Organization.all"
    
    # Store an organization ID for testing
    export ORG_ID=1003  # Replace with your organization ID
  6. Enable selective sync by organization:

    # Enable selective sync by organization and select the specific organization
    curl --request PUT \
      --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "selective_sync_type": "organizations",
        "selective_sync_organization_ids": ['$ORG_ID']
      }' \
      "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
  7. Verify the configuration:

    # Get the updated Geo site configuration
    curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
    
    # Confirm that selective_sync_type is "organizations" and selective_sync_organization_ids contains your organization ID
    # If you have `jq` installed, pipe the output to it for readable formatting
    curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
  8. Test with multiple organizations:

    export ORG_ID2=1004  # Replace with another organization ID
    
    # Update to include multiple organizations
    curl --request PUT \
      --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "selective_sync_type": "organizations",
        "selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2']
      }' \
      "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
    
    # Verify the update
    curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
  9. Disable selective sync:

    # Reset back to no selective sync
    curl --request PUT \
      --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
      --header "Content-Type: application/json" \
      --data '{
        "selective_sync_type": ""
      }' \
      "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
    
    # Verify the update
    curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"
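
The PUT requests in steps 6 and 8 splice shell variables into a single-quoted JSON string (`['$ORG_ID']`), which is easy to mistype. A minimal sketch of an alternative, assuming a hypothetical helper named `build_org_payload` (not part of this MR or of GitLab) that joins any number of organization IDs into the request body:

```shell
# Hypothetical helper (not part of this MR): print the JSON body for
# selective sync by organizations from a list of numeric organization IDs.
# The function body runs in a subshell so the IFS change stays local.
build_org_payload() (
  IFS=,
  # "$*" joins all arguments with the first character of IFS (a comma).
  printf '{"selective_sync_type":"organizations","selective_sync_organization_ids":[%s]}\n' "$*"
)

# Example: print the payload for two organizations.
build_org_payload 1003 1004
```

The result could then be passed to the same endpoint with `--data "$(build_org_payload "$ORG_ID" "$ORG_ID2")"` in place of the inline JSON. The helper does not validate its arguments, so it assumes the IDs are plain integers.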

Expected Results

  • The API should correctly store and return the organization IDs
  • You should be able to update selective_sync_type to organizations
  • You should be able to add/remove organizations from the selective sync list
  • You should be able to disable selective sync entirely
  • Changing the selective sync type or disabling selective sync doesn't clear selective_sync_organization_ids. The same is true of selective_sync_namespace_ids. If clearing is needed, it should be handled in a follow-up.
  • This MR only implements the backend for selecting organizations. When selective sync by organization is enabled, all progress bars will say "Nothing to synchronize" once Geo::RegistryConsistencyWorker has processed everything. The actual selective sync scopes will be implemented in subsequent MRs.
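
The stored type and organization IDs can also be checked without reading the JSON by eye. A rough sketch, assuming the API returns compact JSON with no whitespace between keys and values, and using two hypothetical helper names. Because it relies on fixed-string matching, it would not work on `jq`-formatted output:

```shell
# Hypothetical helpers (not part of this MR). Both read a raw, compact
# JSON response from the Geo sites API on stdin and use fixed-string
# matching, so they assume no whitespace between keys and values.

# Succeeds if selective_sync_type equals $1 (pass "" for disabled).
has_selective_sync_type() {
  grep -qF "\"selective_sync_type\":\"$1\""
}

# Succeeds if selective_sync_organization_ids equals the comma-separated list in $1.
has_org_ids() {
  grep -qF "\"selective_sync_organization_ids\":[$1]"
}

# Abbreviated sample response: selective sync disabled, IDs still stored.
sample='{"id":2,"selective_sync_type":"","selective_sync_organization_ids":[1003,1004]}'
printf '%s\n' "$sample" | has_selective_sync_type "" && echo "type is empty"
printf '%s\n' "$sample" | has_org_ids "1003,1004" && echo "organization IDs retained"
```

Against a live site, the same checks could read from `curl --silent --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID"` instead of the sample string.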
Output of the above testing on my machine:
❯ curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1454  100  1454    0     0  13485      0 --:--:-- --:--:-- --:--:-- 13588
[
  {
    "id": 1,
    "name": "gdk",
    "url": "http://127.0.0.1:3001/",
    "internal_url": "http://127.0.0.1:3000/",
    "primary": true,
    "enabled": true,
    "current": true,
    "files_max_capacity": 10,
    "repos_max_capacity": 10,
    "verification_max_capacity": 10,
    "container_repositories_max_capacity": 2,
    "selective_sync_type": null,
    "selective_sync_shards": [],
    "selective_sync_namespace_ids": [],
    "selective_sync_organization_ids": [],
    "minimum_reverification_interval": 90,
    "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/1/edit",
    "_links": {
      "self": "http://127.0.0.1:3000/api/v4/geo_sites/1",
      "status": "http://127.0.0.1:3000/api/v4/geo_sites/1/status",
      "repair": "http://127.0.0.1:3000/api/v4/geo_sites/1/repair"
    }
  },
  {
    "id": 2,
    "name": "gdk2",
    "url": "http://127.0.0.1:3001/",
    "internal_url": "http://127.0.0.1:3001/",
    "primary": false,
    "enabled": true,
    "current": false,
    "files_max_capacity": 10,
    "repos_max_capacity": 10,
    "verification_max_capacity": 10,
    "container_repositories_max_capacity": 2,
    "selective_sync_type": "",
    "selective_sync_shards": [],
    "selective_sync_namespace_ids": [],
    "selective_sync_organization_ids": [],
    "minimum_reverification_interval": 90,
    "sync_object_storage": false,
    "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
    "web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
    "_links": {
      "self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
      "status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
      "repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
    }
  }
]
❯ # Store the site ID of the secondary node
export SITE_ID=2  # Replace with your secondary site ID

# Output organization objects for their IDs
bin/rails runner "pp Organizations::Organization.all"

# Store an organization ID for testing
export ORG_ID=1003  # Replace with your organization ID

[#<Organizations::Organization id:1 path:default>, #<Organizations::Organization id:1003 path:test-org-1>, #<Organizations::Organization id:1004 path:test-org-2>]
❯ # Store an organization ID for testing
❯ export ORG_ID=1003  # Replace with your organization ID

❯ # Enable selective sync by organization and select the specific organization
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": "organizations",
    "selective_sync_organization_ids": ['$ORG_ID']
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   904  100   809  100    95   1947    228 --:--:-- --:--:-- --:--:--  2178
{
  "id": 2,
  "name": "gdk2",
  "url": "http://127.0.0.1:3001/",
  "internal_url": "http://127.0.0.1:3001/",
  "primary": false,
  "enabled": true,
  "current": false,
  "files_max_capacity": 10,
  "repos_max_capacity": 10,
  "verification_max_capacity": 10,
  "container_repositories_max_capacity": 2,
  "selective_sync_type": "organizations",
  "selective_sync_shards": [],
  "selective_sync_namespace_ids": [],
  "selective_sync_organization_ids": [
    1003
  ],
  "minimum_reverification_interval": 90,
  "sync_object_storage": false,
  "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
  "web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
  "_links": {
    "self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
    "status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
    "repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
  }
}
❯ # Confirm that selective_sync_type is "organizations" and organization_ids contains your organization ID
# If you have `jq` installed, then pipe to it to format the output nicely
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   809  100   809    0     0   6433      0 --:--:-- --:--:-- --:--:--  6472
{
  "id": 2,
  "name": "gdk2",
  "url": "http://127.0.0.1:3001/",
  "internal_url": "http://127.0.0.1:3001/",
  "primary": false,
  "enabled": true,
  "current": false,
  "files_max_capacity": 10,
  "repos_max_capacity": 10,
  "verification_max_capacity": 10,
  "container_repositories_max_capacity": 2,
  "selective_sync_type": "organizations",
  "selective_sync_shards": [],
  "selective_sync_namespace_ids": [],
  "selective_sync_organization_ids": [
    1003
  ],
  "minimum_reverification_interval": 90,
  "sync_object_storage": false,
  "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
  "web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
  "_links": {
    "self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
    "status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
    "repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
  }
}
❯ export ORG_ID2=1004  # Replace with another organization ID

# Update to include multiple organizations
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": "organizations",
    "selective_sync_organization_ids": ['$ORG_ID','$ORG_ID2']
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
{"id":2,"name":"gdk2","url":"http://127.0.0.1:3001/","internal_url":"http://127.0.0.1:3001/","primary":false,"enabled":true,"current":false,"files_max_capacity":10,"repos_max_capacity":10,"verification_max_capacity":10,"container_repositories_max_capacity":2,"selective_sync_type":"organizations","selective_sync_shards":[],"selective_sync_namespace_ids":[],"selective_sync_organization_ids":[1003,1004],"minimum_reverification_interval":90,"sync_object_storage":false,"web_edit_url":"http://127.0.0.1:3000/admin/geo/sites/2/edit","web_geo_replication_details_url":"http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files","_links":{"self":"http://127.0.0.1:3000/api/v4/geo_sites/2","status":"http://127.0.0.1:3000/api/v4/geo_sites/2/status","repair":"http://127.0.0.1:3000/api/v4/geo_sites/2/repair"}}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   814  100   814    0     0   8445      0 --:--:-- --:--:-- --:--:--  8479
{
  "id": 2,
  "name": "gdk2",
  "url": "http://127.0.0.1:3001/",
  "internal_url": "http://127.0.0.1:3001/",
  "primary": false,
  "enabled": true,
  "current": false,
  "files_max_capacity": 10,
  "repos_max_capacity": 10,
  "verification_max_capacity": 10,
  "container_repositories_max_capacity": 2,
  "selective_sync_type": "organizations",
  "selective_sync_shards": [],
  "selective_sync_namespace_ids": [],
  "selective_sync_organization_ids": [
    1003,
    1004
  ],
  "minimum_reverification_interval": 90,
  "sync_object_storage": false,
  "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
  "web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
  "_links": {
    "self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
    "status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
    "repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
  }
}
❯ # Reset back to no selective sync
curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": ""
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID"

# Verify the update
curl --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
{"id":2,"name":"gdk2","url":"http://127.0.0.1:3001/","internal_url":"http://127.0.0.1:3001/","primary":false,"enabled":true,"current":false,"files_max_capacity":10,"repos_max_capacity":10,"verification_max_capacity":10,"container_repositories_max_capacity":2,"selective_sync_type":"","selective_sync_shards":[],"selective_sync_namespace_ids":[],"selective_sync_organization_ids":[1003,1004],"minimum_reverification_interval":90,"sync_object_storage":false,"web_edit_url":"http://127.0.0.1:3000/admin/geo/sites/2/edit","web_geo_replication_details_url":"http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files","_links":{"self":"http://127.0.0.1:3000/api/v4/geo_sites/2","status":"http://127.0.0.1:3000/api/v4/geo_sites/2/status","repair":"http://127.0.0.1:3000/api/v4/geo_sites/2/repair"}}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   801  100   801    0     0  15996      0 --:--:-- --:--:-- --:--:-- 16020
{
  "id": 2,
  "name": "gdk2",
  "url": "http://127.0.0.1:3001/",
  "internal_url": "http://127.0.0.1:3001/",
  "primary": false,
  "enabled": true,
  "current": false,
  "files_max_capacity": 10,
  "repos_max_capacity": 10,
  "verification_max_capacity": 10,
  "container_repositories_max_capacity": 2,
  "selective_sync_type": "",
  "selective_sync_shards": [],
  "selective_sync_namespace_ids": [],
  "selective_sync_organization_ids": [
    1003,
    1004
  ],
  "minimum_reverification_interval": 90,
  "sync_object_storage": false,
  "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
  "web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
  "_links": {
    "self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
    "status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
    "repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
  }
}
❯ curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": "organizations",
    "selective_sync_organization_ids": ['$ORG_ID']
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   904  100   809  100    95   5306    623 --:--:-- --:--:-- --:--:--  5908
{
  "id": 2,
  "name": "gdk2",
  "url": "http://127.0.0.1:3001/",
  "internal_url": "http://127.0.0.1:3001/",
  "primary": false,
  "enabled": true,
  "current": false,
  "files_max_capacity": 10,
  "repos_max_capacity": 10,
  "verification_max_capacity": 10,
  "container_repositories_max_capacity": 2,
  "selective_sync_type": "organizations",
  "selective_sync_shards": [],
  "selective_sync_namespace_ids": [],
  "selective_sync_organization_ids": [
    1003
  ],
  "minimum_reverification_interval": 90,
  "sync_object_storage": false,
  "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
  "web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
  "_links": {
    "self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
    "status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
    "repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
  }
}
❯ curl --request PUT \
  --header "PRIVATE-TOKEN: $PRIVATE_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "selective_sync_type": "organizations",
    "selective_sync_organization_ids": []
  }' \
  "http://localhost:3000/api/v4/geo_sites/$SITE_ID" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   896  100   805  100    91   8953   1012 --:--:-- --:--:-- --:--:-- 10067
{
  "id": 2,
  "name": "gdk2",
  "url": "http://127.0.0.1:3001/",
  "internal_url": "http://127.0.0.1:3001/",
  "primary": false,
  "enabled": true,
  "current": false,
  "files_max_capacity": 10,
  "repos_max_capacity": 10,
  "verification_max_capacity": 10,
  "container_repositories_max_capacity": 2,
  "selective_sync_type": "organizations",
  "selective_sync_shards": [],
  "selective_sync_namespace_ids": [],
  "selective_sync_organization_ids": [],
  "minimum_reverification_interval": 90,
  "sync_object_storage": false,
  "web_edit_url": "http://127.0.0.1:3000/admin/geo/sites/2/edit",
  "web_geo_replication_details_url": "http://127.0.0.1:3001/admin/geo/sites/2/replication/ci_secure_files",
  "_links": {
    "self": "http://127.0.0.1:3000/api/v4/geo_sites/2",
    "status": "http://127.0.0.1:3000/api/v4/geo_sites/2/status",
    "repair": "http://127.0.0.1:3000/api/v4/geo_sites/2/repair"
  }
}

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Michael Kozono
