Geo: Reduce SSF boilerplate for upload partition replicators

📋 Phase 1: Foundation & First Replicator (POC) | Risk: Low | View Epic &20933

Summary

This issue proposes improvements to the Geo Self-Service Framework (SSF) to reduce boilerplate code when adding new blob replicators, specifically for the 22 upload partition tables that need individual Geo replication support.

Background

Currently, adding a new blob type to Geo replication requires creating multiple files with nearly identical boilerplate code. For each new replicable, developers must create or update:

Replicator class (ee/app/replicators/geo/*_replicator.rb)
Registry model (ee/app/models/geo/*_registry.rb)
State model (ee/app/models/geo/*_state.rb)
Registry finder (ee/app/finders/geo/*_registry_finder.rb)
GraphQL resolver (ee/app/graphql/resolvers/geo/*_registries_resolver.rb)
GraphQL type (ee/app/graphql/types/geo/*_registry_type.rb)
Database migrations (registry table in geo DB, state table in main DB)
Database dictionary files
Factory files for testing
Spec files for each of the above
Manual registration in REPLICATOR_CLASSES array in ee/lib/gitlab/geo.rb
Manual registration in REGISTRY_CLASSES in registry_consistency_worker.rb
Manual updates to GeoNodeType GraphQL type
Manual updates to registrable_type.rb for resync/reverify support

With 22 upload partition tables to support, this means creating more than 200 files of mostly identical code.

Proposal: Dynamic Replicator Generation

Phase 1: Convention-based Auto-discovery

Replace manual registration with convention-based auto-discovery:

# ee/lib/gitlab/geo.rb
def self.replicator_classes
  @replicator_classes ||= discover_replicator_classes
end

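# Derives each class from its file name, e.g.
# foo_replicator.rb -> Geo::FooReplicator.
def self.discover_replicator_classes
  Dir[Rails.root.join('ee/app/replicators/geo/*_replicator.rb')].sort.filter_map do |file|
    klass = "Geo::#{File.basename(file, '.rb').camelize}".constantize
    # Keep only strict subclasses of the base replicator. Note that an abstract
    # base class stored in this directory also matches and needs to be excluded.
    klass if klass < Gitlab::Geo::Replicator
  end
end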
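
The discovery convention can be tried outside Rails. The following is a minimal sketch, not the proposed implementation: it uses stand-in classes, a hand-rolled camelize helper in place of ActiveSupport's, and a hard-coded list of paths instead of Dir[]. The replicator names (AbuseReportUploadReplicator, AchievementUploadReplicator, NotAReplicator) are hypothetical.

```ruby
# Stand-ins for the real classes. Everything except the name
# Gitlab::Geo::Replicator is hypothetical and exists only for this sketch.
module Gitlab
  module Geo
    class Replicator; end
  end
end

module Geo
  class AbuseReportUploadReplicator < Gitlab::Geo::Replicator; end
  class AchievementUploadReplicator < Gitlab::Geo::Replicator; end
  # Matches the file-name convention but does not inherit from the base class.
  class NotAReplicator; end
end

# Minimal replacement for ActiveSupport's String#camelize (snake_case input only).
def camelize(snake)
  snake.split('_').map(&:capitalize).join
end

# Maps file paths to constants under Geo and keeps only Replicator subclasses.
def discover_replicator_classes(paths)
  paths.sort.filter_map do |path|
    const_name = camelize(File.basename(path, '.rb'))
    next unless Geo.const_defined?(const_name, false)

    klass = Geo.const_get(const_name, false)
    klass if klass.is_a?(Class) && klass < Gitlab::Geo::Replicator
  end
end

PATHS = %w[
  ee/app/replicators/geo/achievement_upload_replicator.rb
  ee/app/replicators/geo/abuse_report_upload_replicator.rb
  ee/app/replicators/geo/not_a_replicator.rb
].freeze

DISCOVERED = discover_replicator_classes(PATHS)
```

Sorting the paths keeps the resulting order stable across environments, which matters if anything downstream (for example GraphQL field order) depends on it.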

Phase 2: Dynamic GraphQL Registration

Auto-generate GraphQL types, resolvers, and fields based on registered replicators:

# ee/app/graphql/types/geo/geo_node_type.rb
Gitlab::Geo.replicator_classes.each do |replicator_class|
  field replicator_class.graphql_field_name,
        replicator_class.graphql_registry_type.connection_type,
        null: true,
        resolver: replicator_class.graphql_resolver_class
end
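
The loop above relies on each replicator exposing graphql_field_name, graphql_registry_type and graphql_resolver_class. One way these could be derived from the class name alone is sketched below. This is an assumption about naming, based on the file patterns listed in the Background section, and it returns constant names as strings so that it runs without the GraphQL classes being loaded. The GraphqlNaming module and the example replicator are hypothetical.

```ruby
# Hypothetical mixin that derives GraphQL-related names from a replicator's
# class name, following the *_registry_type.rb / *_registries_resolver.rb
# file conventions.
module GraphqlNaming
  # "Geo::AbuseReportUploadReplicator" -> "AbuseReportUpload"
  def replicable_class_name
    name.split('::').last.sub(/Replicator\z/, '')
  end

  # "AbuseReportUpload" -> "abuse_report_upload"
  def replicable_name
    replicable_class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end

  def graphql_field_name
    :"#{replicable_name}_registries"
  end

  def graphql_registry_type_name
    "Types::Geo::#{replicable_class_name}RegistryType"
  end

  def graphql_resolver_class_name
    "Resolvers::Geo::#{replicable_class_name}RegistriesResolver"
  end
end

module Geo
  class AbuseReportUploadReplicator
    extend GraphqlNaming
  end
end

puts Geo::AbuseReportUploadReplicator.graphql_field_name
puts Geo::AbuseReportUploadReplicator.graphql_registry_type_name
puts Geo::AbuseReportUploadReplicator.graphql_resolver_class_name
```

In the real code base the last two methods would presumably resolve to classes rather than strings; the string form is used here only to keep the sketch standalone.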

Phase 3: Base Upload Replicator for Partitioned Tables

Create a base class specifically for upload partition replicators:

# ee/app/replicators/geo/base_upload_partition_replicator.rb
module Geo
  class BaseUploadPartitionReplicator < Gitlab::Geo::Replicator
    include ::Geo::BlobReplicatorStrategy

    class << self
      # Subclasses only need to define:
      # - model (the upload model class)
      # - replicable_title / replicable_title_plural

      def registry_class
        # Auto-generate or use shared registry with partition key.
        # generate_registry_class is a placeholder that is not defined yet.
        @registry_class ||= generate_registry_class
      end
    end

    def carrierwave_uploader
      model_record.retrieve_uploader
    end
  end
end
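
The base class above leaves generate_registry_class undefined. One possible shape for it, sketched in plain Ruby under the assumption that a generated registry is an anonymous subclass of a shared base that gets assigned to a constant, is shown below. The classes BaseUploadPartitionRegistry and ExampleUploadReplicator are hypothetical, and the ActiveRecord and Geo strategy parts are left out.

```ruby
module Geo
  # Hypothetical shared parent for generated registry classes.
  class BaseUploadPartitionRegistry
    class << self
      attr_accessor :replicator_class
    end
  end

  class BaseUploadPartitionReplicator
    class << self
      def registry_class
        @registry_class ||= generate_registry_class
      end

      private

      # Builds Geo::<Name>Registry for Geo::<Name>Replicator, reusing an
      # existing constant when one is already defined.
      def generate_registry_class
        const_name = name.split('::').last.sub(/Replicator\z/, 'Registry')
        return Geo.const_get(const_name, false) if Geo.const_defined?(const_name, false)

        replicator = self
        klass = Class.new(BaseUploadPartitionRegistry) do
          self.replicator_class = replicator
        end
        Geo.const_set(const_name, klass)
      end
    end
  end

  # Hypothetical concrete replicator with no hand-written registry class.
  class ExampleUploadReplicator < BaseUploadPartitionReplicator; end
end

puts Geo::ExampleUploadReplicator.registry_class.name
```

Assigning the generated class to a constant gives it a stable name, which tends to matter for logging and for any code that looks classes up by name. Whether generation or the shared registry from Phase 4 is preferable is left open by the proposal.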

Phase 4: Shared Registry with Partition Discrimination

Instead of 22 separate registry tables, consider a shared registry approach:

# Single registry table with upload_type discriminator
create_table :geo_upload_partition_registries do |t|
  t.string :upload_type, null: false  # e.g., 'abuse_report', 'achievement', etc.
  t.bigint :upload_id, null: false
  # ... standard registry columns
  t.index [:upload_type, :upload_id], unique: true
end

This would allow a single Geo::UploadPartitionRegistry model that uses the upload_type column either for single-table inheritance (STI) or as a plain type discriminator.
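
To illustrate what the composite unique index implies, here is a small in-memory sketch in plain Ruby. It is not ActiveRecord code and the class and constant names are hypothetical; it only shows that rows are keyed by the pair (upload_type, upload_id), so the same ID can appear once per upload type.

```ruby
# Upload types taken from the example comment in the migration above.
KNOWN_UPLOAD_TYPES = %w[abuse_report achievement].freeze

UploadPartitionRegistryRow = Struct.new(:upload_type, :upload_id, :state, keyword_init: true)

# Hypothetical in-memory stand-in for the shared registry table.
class InMemoryUploadPartitionRegistry
  class DuplicateRow < StandardError; end

  def initialize
    @rows = {}
  end

  # Mirrors the unique index on [:upload_type, :upload_id].
  def create!(upload_type:, upload_id:, state: :pending)
    raise ArgumentError, "unknown upload_type: #{upload_type}" unless KNOWN_UPLOAD_TYPES.include?(upload_type)

    key = [upload_type, upload_id]
    raise DuplicateRow, key.inspect if @rows.key?(key)

    @rows[key] = UploadPartitionRegistryRow.new(upload_type: upload_type, upload_id: upload_id, state: state)
  end

  # Scopes the shared table to one upload type, as a per-type replicator would.
  def for_type(upload_type)
    @rows.values.select { |row| row.upload_type == upload_type }
  end
end

registry = InMemoryUploadPartitionRegistry.new
registry.create!(upload_type: 'abuse_report', upload_id: 1)
registry.create!(upload_type: 'achievement', upload_id: 1) # same ID, different type
puts registry.for_type('abuse_report').size
```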

Phase 5: Generator Script

Create a Rails generator for new upload partition replicators:

bin/rails generate geo:upload_partition_replicator AbuseReport \
  --table=abuse_report_uploads \
  --sharding_key=organization_id

This would generate all necessary files with correct naming and configuration.
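
As a rough sketch of the naming logic such a generator might contain, the helper below maps a name like AbuseReport to the per-replicable file paths listed in the Background section. The "_upload" suffix and the helper names are assumptions made for illustration; the proposal does not specify them, and a real generator would also cover migrations, dictionary files, factories and specs.

```ruby
# Minimal replacement for ActiveSupport's String#underscore (CamelCase input only).
def underscore(camel)
  camel.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Returns the application files a generator run would need to create.
# Assumes file names take the form <name>_upload_*; this is hypothetical.
def planned_files(name)
  base = "#{underscore(name)}_upload"
  [
    "ee/app/replicators/geo/#{base}_replicator.rb",
    "ee/app/models/geo/#{base}_registry.rb",
    "ee/app/models/geo/#{base}_state.rb",
    "ee/app/finders/geo/#{base}_registry_finder.rb",
    "ee/app/graphql/resolvers/geo/#{base}_registries_resolver.rb",
    "ee/app/graphql/types/geo/#{base}_registry_type.rb"
  ]
end

puts planned_files('AbuseReport')
```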

Implementation Checklist

Auto-discovery and Registration

Implement convention-based replicator class discovery
Remove manual REPLICATOR_CLASSES array maintenance
Auto-register registry classes in consistency worker
Auto-generate GraphQL fields on GeoNodeType

Base Classes and Concerns

Create Geo::BaseUploadPartitionReplicator base class
Extract common upload replicator logic into shared concern
Create shared registry concern for upload partitions

Database Optimization

Evaluate shared registry table vs individual tables
Create migration generator for registry/state tables
Auto-generate database dictionary files

GraphQL Automation

Dynamic GraphQL type generation from replicator metadata
Dynamic resolver generation
Auto-registration in registrable_type.rb

Testing Infrastructure

Shared examples that work with minimal configuration
Factory generator for new replicators
Automated spec generation

Documentation and Tooling

Rails generator for new upload partition replicators
Update issue template to reflect reduced manual steps
Document the new streamlined process

Benefits

Reduced code duplication: an estimated 90% reduction in boilerplate files
Faster implementation: Adding a new upload type takes minutes instead of hours
Fewer errors: Less manual registration means fewer missed steps
Easier maintenance: Changes to common behavior only need to be made once
Better consistency: All upload replicators behave identically

Risks and Mitigations

Risk	Mitigation
Magic/implicit behavior harder to debug	Good logging, clear documentation
Performance of auto-discovery at boot	Cache discovered classes, lazy loading
Breaking existing replicators	Gradual migration, feature flags

Parent epic: &20933
MR: !221773 (closed)
Related issue: #227693 (closed) (Avoid maintaining REPLICATOR_CLASSES list)

Edited Feb 23, 2026 by 🤖 GitLab Bot 🤖