Geo: Test project repository replication v2
Manual Test Plan: Geo Project Repository Replication V2
This test plan is AI-generated and pending manual review.
Summary
This issue tracks the manual testing plan for Epic #17974 - improving handling of projects without Git repositories in Geo replication.
Related MRs:
- !194051 - V2 Replication Architecture
- !198308 (merged) - Prevent Phantom Records Creation
Feature Flags:
- geo_project_repository_replication (existing)
- geo_project_repository_replication_v2 (new)
Problem Being Solved
Previously, Geo would attempt to replicate Git repositories for projects that don't actually have repositories, causing:
- "Project Repositories checksum failure" in the UI
- Sync failures with "Error syncing repository: 13:creating repository: cloning repository: exit status 128"
- False error reporting and wasted resources
Solution Overview
Two-pronged approach:
- MR !198308 (merged): Only create project_repository records when Git repositories actually exist
- MR !194051: Switch Geo replication to enumerate the project_repositories table instead of the projects table (V2 replication)
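The difference between the two enumeration strategies can be illustrated with a toy in-memory model. This is only a sketch: ToyProject, ToyProjectRepository, v1_candidates, and v2_candidates are hypothetical names invented for illustration and are not GitLab classes or methods.

```ruby
# Toy stand-ins for rows in the projects and project_repositories tables.
# These Structs are illustrative only; they are not GitLab models.
ToyProject = Struct.new(:id, :name)
ToyProjectRepository = Struct.new(:id, :project_id)

# V1 style: every project row is a replication candidate.
def v1_candidates(projects)
  projects.map(&:id)
end

# V2 style: only projects that have a project_repositories row are candidates.
def v2_candidates(project_repositories)
  project_repositories.map(&:project_id)
end

projects = [
  ToyProject.new(1, "with-repo"),
  ToyProject.new(2, "no-repo"),
  ToyProject.new(3, "also-with-repo")
]

# Assumes the !198308 behavior: a row exists only when a Git repository exists.
project_repositories = [
  ToyProjectRepository.new(10, 1),
  ToyProjectRepository.new(11, 3)
]

v1_ids = v1_candidates(projects)
v2_ids = v2_candidates(project_repositories)

puts "V1 candidates: #{v1_ids.inspect}"
puts "V2 candidates: #{v2_ids.inspect}"
puts "Skipped by V2 (no repository): #{(v1_ids - v2_ids).inspect}"
```

In this model, project 2 is the one that would have produced a clone failure under V1 and is never enumerated under V2.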
Test Environment Requirements
- GitLab Geo setup with primary and secondary nodes
- Admin access to both nodes
- Feature flags available for toggling
- Access to Rails console on both nodes
- Ability to create projects with and without repositories
Test Scenarios
Phase 1: Basic Functionality Tests
1.1 Projects Without Repositories (Core Issue)
V1 Behavior (Before Fix):
# On primary - Rails console
# Assumes `user` is already bound to an existing User record
project = Project.create!(name: "test-no-repo", path: "test-no-repo", namespace: user.namespace)
# This creates a project_repository record even though no Git repository exists
project.project_repository # Should be present (problematic)
V2 Behavior (After Fix with !198308 (merged)):
# On primary - Rails console
project = Project.create!(name: "test-no-repo-v2", path: "test-no-repo-v2", namespace: user.namespace)
project.project_repository # Should be nil (correct)
Expected Results:
- V1: project_repository record exists (legacy behavior)
- V2: No project_repository record created
- V2: No Geo replication attempt (no registry created)
- V2: No errors in secondary logs
1.2 Projects With Repositories (Should Work Both Ways)
# On primary
project = Projects::CreateService.new(user, {
  name: "test-with-repo",
  path: "test-with-repo",
  initialize_with_readme: true
}).execute
Expected Results:
- Both V1 and V2: project_repository record created
- Both V1 and V2: Successful replication to secondary
- No errors in logs
Phase 2: Feature Flag Switching Tests
2.1 V1 → V2 Migration
# Start with V1 enabled, V2 disabled
Feature.disable(:geo_project_repository_replication_v2)
# Create test projects (mix with/without repos)
5.times do |i|
  if i.even?
    # With repository
    Projects::CreateService.new(user, {
      name: "migration-test-#{i}",
      path: "migration-test-#{i}",
      initialize_with_readme: true
    }).execute
  else
    # Without repository
    Project.create!(name: "migration-test-#{i}", path: "migration-test-#{i}", namespace: user.namespace)
  end
end
# Wait for V1 replication to complete
# Check registry state on secondary
# Enable V2 feature flag
Feature.enable(:geo_project_repository_replication_v2)
# Create more projects and verify behavior
Expected Results:
- Existing V1 registries continue working
- New projects use V2 logic
- Projects without repos don't create registries in V2 mode
- No duplicate replication
- UI shows consistent counts
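One way to check the "no duplicate replication" and "no registries for repo-less projects" expectations is to export registry rows from the secondary and inspect them offline. The sketch below assumes the rows are available as plain hashes with :id and :project_id keys; the helper names and the export format are hypothetical.

```ruby
# Returns project IDs that appear in more than one registry row.
# `rows` is assumed to be an array of hashes with :id and :project_id keys.
def duplicate_project_ids(rows)
  rows.group_by { |row| row[:project_id] }
      .select { |_project_id, group| group.size > 1 }
      .keys
end

# Returns registry IDs that point at projects known to have no repository.
def unexpected_registry_ids(rows, project_ids_without_repo)
  rows.select { |row| project_ids_without_repo.include?(row[:project_id]) }
      .map { |row| row[:id] }
end

# Made-up sample data for illustration only.
rows = [
  { id: 1, project_id: 100 },
  { id: 2, project_id: 101 },
  { id: 3, project_id: 100 } # second registry for project 100
]

puts "Duplicated projects: #{duplicate_project_ids(rows).inspect}"
puts "Registries for repo-less projects: #{unexpected_registry_ids(rows, [101]).inspect}"
```

Both helpers should return empty arrays when the expectations hold.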
2.2 V2 → V1 Rollback
# Start with V2 enabled
Feature.enable(:geo_project_repository_replication_v2)
# Create projects and verify replication
# ...
# Disable V2 feature flag
Feature.disable(:geo_project_repository_replication_v2)
# Create new projects and verify fallback
Expected Results:
- Existing V2 registries continue working via delegation
- New projects use V1 logic
- No replication interruption
- UI remains functional
Phase 3: UI and API Tests
3.1 Admin Geo Status Page
Test Steps:
- Navigate to /admin/geo/sites
- Check the "Project Repositories" section in replication status
- Verify counts and status indicators
- Test with both feature flag states
Expected Results:
- Accurate counts displayed for both V1 and V2
- Status indicators work correctly (synced/failed/pending)
- No GraphQL errors in browser console
- Performance acceptable with large datasets
3.2 GraphQL API Compatibility
query {
  geoNode {
    projectRepositoryRegistries {
      nodes {
        id
        projectId # Should always be present
        projectRepositoryId # Should be present only in V2
        state
        lastSyncedAt
      }
    }
  }
}
Expected Results:
- projectId field always present (backward compatibility)
- projectRepositoryId field present only when V2 enabled
- No breaking changes for existing API consumers
- Proper error handling for edge cases
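The two field expectations can be checked mechanically against a saved response body. The following is a minimal sketch that assumes the response follows the shape of the query in 3.2; registry_field_problems is a hypothetical helper, and the sample payload is made up for illustration.

```ruby
require "json"

# Returns a list of problems found in a saved GraphQL response body.
# `v2_enabled` should mirror the state of geo_project_repository_replication_v2.
def registry_field_problems(response_json, v2_enabled:)
  nodes = JSON.parse(response_json)
              .dig("data", "geoNode", "projectRepositoryRegistries", "nodes") || []

  nodes.each_with_object([]) do |node, problems|
    # projectId must always be present for backward compatibility.
    problems << "#{node['id']}: projectId missing" if node["projectId"].nil?

    # projectRepositoryId should only appear when V2 is enabled.
    if !v2_enabled && !node["projectRepositoryId"].nil?
      problems << "#{node['id']}: projectRepositoryId present while V2 is disabled"
    end
  end
end

# Made-up payload; IDs are placeholders, not real registry identifiers.
sample = JSON.generate(
  "data" => { "geoNode" => { "projectRepositoryRegistries" => { "nodes" => [
    { "id" => "1", "projectId" => "5", "projectRepositoryId" => "9" },
    { "id" => "2", "projectId" => nil, "projectRepositoryId" => nil }
  ] } } }
)

puts registry_field_problems(sample, v2_enabled: true).inspect
puts registry_field_problems(sample, v2_enabled: false).inspect
```

An empty array means the saved response satisfies both field expectations for the given flag state.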
Phase 4: Error Scenarios and Edge Cases
4.1 Repository Deletion After Creation
# Create project with repository
project = Projects::CreateService.new(user, {..., initialize_with_readme: true}).execute
# Wait for replication
# Delete repository but keep project
project.repository.remove
# Trigger re-verification
Expected Results:
- V1: May attempt to verify non-existent repo (current behavior)
- V2: Should handle gracefully, possibly remove registry
- No infinite retry loops
- Proper error messages in logs
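To make "no infinite retry loops" observable, one option is to snapshot retry counts per registry at two points in time and flag any that are both above a cap and still growing. This is a sketch under assumptions: the snapshot format (registry ID mapped to retry count) and the cap of 10 are hypothetical choices for this test plan, not GitLab settings.

```ruby
# Hypothetical cap for this test plan; not a GitLab configuration value.
MAX_ACCEPTABLE_RETRIES = 10

# `before` and `after` map registry ID => retry count at two points in time.
# Returns registry IDs whose count exceeds the cap and grew between snapshots.
def runaway_retries(before, after, cap: MAX_ACCEPTABLE_RETRIES)
  after.select do |registry_id, count|
    count > cap && count > before.fetch(registry_id, 0)
  end.keys
end

# Made-up snapshots for illustration.
before = { 1 => 2, 2 => 11, 3 => 12 }
after  = { 1 => 3, 2 => 15, 3 => 12 }

puts "Possible retry loops: #{runaway_retries(before, after).inspect}"
```

Registry 3 is above the cap but stable, so it is not flagged; only registries that keep retrying past the cap are reported.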
4.2 Corrupt Registry Data
# Create registry with invalid project_repository_id
registry = Geo::ProjectRepositoryRegistry.create!(project_repository_id: 999999, project_id: project.id)
# Trigger replication worker
Geo::ProjectRepositoryReplicator.new(model_record_id: 999999).execute
Expected Results:
- Graceful error handling
- No worker crashes
- Proper error logging
Phase 5: Performance Tests
5.1 Large Dataset Migration
# Create 100 projects (roughly 70% with repos, 30% without; the split is random)
100.times do |i|
  if rand < 0.7
    # With repository
    Projects::CreateService.new(user, {
      name: "perf-test-#{i}",
      path: "perf-test-#{i}",
      initialize_with_readme: true
    }).execute
  else
    # Without repository
    Project.create!(name: "perf-test-#{i}", path: "perf-test-#{i}", namespace: user.namespace)
  end
end
# Enable V2 replication and monitor
Feature.enable(:geo_project_repository_replication_v2)
Performance Criteria:
- Migration completes within reasonable time (< 10 minutes for 100 projects)
- Memory usage remains stable during migration
- No significant increase in database load
- Secondary site replication keeps up
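The time criterion can be measured with a monotonic clock rather than wall-clock timestamps. The sketch below is an assumption about how a tester might wrap the wait; timed and within_limit? are hypothetical helpers, and the sleep call is a stand-in for polling until replication finishes.

```ruby
# Runs the block and returns [result, elapsed_seconds] using a monotonic clock.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# 10 minutes, taken from the criterion for 100 projects.
MIGRATION_LIMIT_SECONDS = 10 * 60

def within_limit?(elapsed_seconds, limit: MIGRATION_LIMIT_SECONDS)
  elapsed_seconds < limit
end

# Stand-in workload; replace with a loop that polls registry state.
_result, elapsed = timed { sleep 0.01 }
puts format("elapsed: %.2fs, within limit: %s", elapsed, within_limit?(elapsed))
```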
Success Criteria
Must Pass ✅
- All projects with repositories replicate successfully in both V1 and V2
- Projects without repositories don't cause errors in V2 mode
- Feature flag switching works seamlessly in both directions
- UI shows accurate status and counts in all scenarios
- No regression in existing Geo functionality
- GraphQL API maintains backward compatibility
Performance ⚡
- No significant performance degradation during normal operations
- Memory usage remains stable during feature flag switches
- Replication throughput maintained or improved
Error Handling 🛡️
- Graceful handling of edge cases (missing repos, corrupt data)
- Proper error messages and logging (no cryptic failures)
- No infinite retry loops or worker crashes
- Clear recovery procedures for problematic states
Test Execution
Pre-test Checklist
- Test environment prepared and verified
- Feature flags configured and accessible
- Baseline metrics captured (performance, error rates)
- Rollback plan prepared and tested
During Testing
- Document all test results (pass/fail with details)
- Capture relevant log snippets for failures
- Monitor system performance metrics
- Screenshot UI states for documentation
Post-test
- Compare performance metrics to baseline
- Document any workarounds or manual steps needed
- Verify rollback plan works if needed
- Prepare summary report
Risk Mitigation
High Risk Scenarios
- Data Loss: Registry data corruption during feature flag switch
  - Mitigation: Database backup before testing, staged rollout
- Replication Backlog: Large queues during migration
  - Mitigation: Monitor queue sizes, pause if necessary
- UI Breakage: GraphQL schema changes break frontend
  - Mitigation: Thorough GraphQL compatibility testing
Rollback Triggers
- Critical errors in replication
- Significant performance degradation (>25% slower)
- UI completely broken
- Data corruption detected
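The ">25% slower" trigger is simple arithmetic against the baseline captured in the pre-test checklist. As a sketch, assuming both measurements are durations in seconds for the same workload (the helper names are hypothetical):

```ruby
# Threshold from the rollback trigger: more than 25% slower than baseline.
SLOWDOWN_THRESHOLD = 0.25

# Relative slowdown: 0.0 means unchanged, 0.5 means 50% slower.
def slowdown_ratio(baseline_seconds, measured_seconds)
  raise ArgumentError, "baseline must be positive" unless baseline_seconds.positive?

  (measured_seconds - baseline_seconds) / baseline_seconds.to_f
end

def rollback_for_performance?(baseline_seconds, measured_seconds)
  slowdown_ratio(baseline_seconds, measured_seconds) > SLOWDOWN_THRESHOLD
end

# Made-up numbers for illustration.
puts rollback_for_performance?(120, 150) # exactly 25% slower: not past the threshold
puts rollback_for_performance?(120, 151) # just over 25% slower
```

Treating exactly 25% as acceptable follows the ">" in the trigger; adjust the comparison if the team reads the threshold as inclusive.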
Test Results Summary
Overall Status: