Phase 4: Testing and Refinement of Conflict Resolver Agent
Overview
Comprehensive testing of the Conflict Resolver agent's autonomous resolution capabilities: verifying that it can resolve conflicts end to end by editing files, committing, and pushing.
Key Testing Focus
- Agent edits files correctly
- Agent removes conflict markers
- Agent creates valid commits
- Agent pushes successfully
- Resolved MRs are actually mergeable
Tasks
Test Scenario Creation
- Create test MRs with simple conflicts (both modified same lines)
- Create test MRs with multi-file conflicts
- Create test MRs with complex logic conflicts
- Create test MRs with renamed file conflicts
- Create test MRs with binary file conflicts (should error gracefully)
Autonomous Execution Testing
File Editing
- Verify agent calls edit_file with correct parameters
- Verify conflict markers are completely removed
- Verify resolution is applied correctly
- Verify no syntax errors introduced
- Verify file encoding preserved
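The marker-removal check above could be automated with a small scan of each edited file. This is a minimal sketch, not the agent's actual verification: the function name `find_conflict_markers` is hypothetical, and the pattern is a heuristic that can flag an unrelated line made of exactly seven `=` characters.

```python
import re
from typing import List

# Git writes conflict markers at the start of a line as exactly seven
# '<', '|', '=' or '>' characters ('|||||||' appears only with diff3 style).
MARKER_RE = re.compile(r"^(<{7}|\|{7}|={7}|>{7})(\s|$)")


def find_conflict_markers(text: str) -> List[int]:
    """Return 1-based line numbers of lines that look like conflict markers."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MARKER_RE.match(line)
    ]
```

An empty result for every resolved file is the pass condition; any line number returned points at a marker the agent left behind.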
Git Operations
- Verify agent stages all resolved files
- Verify commit message is descriptive
- Verify commit includes all changes
- Verify commit author is set correctly
- Verify push succeeds to source branch
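One way to check that nothing is left unresolved before the commit is to parse `git status --porcelain=v1`, where unmerged paths carry the two-letter codes DD, AU, UD, UA, DU, AA, or UU. The sketch below assumes a local checkout is available to the test; the helper names are hypothetical, and paths containing special characters (which git quotes in this format) are not unquoted here.

```python
import subprocess
from typing import List

# Status codes that `git status --porcelain=v1` uses for unmerged paths.
UNMERGED_CODES = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def unmerged_paths(porcelain: str) -> List[str]:
    """Extract paths still marked as unmerged from porcelain v1 output."""
    paths = []
    for line in porcelain.splitlines():
        # Format is "XY <path>": two status characters, a space, then the path.
        if len(line) > 3 and line[:2] in UNMERGED_CODES:
            paths.append(line[3:])
    return paths


def git_status(repo_dir: str) -> str:
    """Run git status in repo_dir and return its porcelain v1 output."""
    result = subprocess.run(
        ["git", "status", "--porcelain=v1"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

A test would call `unmerged_paths(git_status(repo_dir))` after the agent stages its changes and expect an empty list.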
End-to-End Flow
- User approves resolution plan
- Agent edits files autonomously
- Agent commits changes
- Agent pushes to branch
- MR becomes mergeable
- Agent reports success with commit SHA
Approval Workflow Testing
- Agent requests approval before executing
- Agent shows clear plan of what will change
- User can approve or decline
- User can ask questions before approving
- Agent only executes after explicit approval
- Agent handles "no" gracefully (provides alternatives)
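The approval rules above amount to a small state machine: nothing runs while the request is pending or declined, and a question does not count as approval. The following is an illustrative sketch only; `ApprovalGate` and the accepted reply words are assumptions, not the agent's real interface.

```python
from dataclasses import dataclass
from typing import Callable

APPROVE_WORDS = {"yes", "approve", "approved"}
DECLINE_WORDS = {"no", "decline", "declined"}


@dataclass
class ApprovalGate:
    """Blocks execution until the user has explicitly approved the plan."""

    plan: str
    state: str = "pending"  # one of: pending, approved, declined

    def record_reply(self, reply: str) -> str:
        """Update the state from a user reply; anything else leaves it unchanged."""
        normalized = reply.strip().lower()
        if normalized in APPROVE_WORDS:
            self.state = "approved"
        elif normalized in DECLINE_WORDS:
            self.state = "declined"
        return self.state

    def execute(self, action: Callable[[], str]) -> str:
        """Run the action only after explicit approval."""
        if self.state != "approved":
            raise PermissionError(f"plan not approved (state: {self.state})")
        return action()
```

A test case for "user asks a question first" would record the question, assert that `execute` raises, then record an approval and assert that the action runs.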
Error Handling Testing
File Operation Errors
- Test: File is read-only (permission error)
- Test: File is locked by another process
- Test: Invalid file path
- Test: File encoding issues
Git Operation Errors
- Test: Push fails (network issue)
- Test: Branch is protected
- Test: Conflicts remain after resolution attempt
- Test: Working directory not clean
- Test: Authentication fails
Recovery Testing
- Agent reports errors clearly
- Agent suggests next steps
- Agent can retry after fixing issue
- User can manually complete if agent fails
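For the retry behaviour, it may help to separate failures worth retrying (a network issue) from those that need user action first (a protected branch, failed authentication). The sketch below assumes the test harness wraps the push in a callable and raises one of two hypothetical exception types; neither the exceptions nor `push_with_retry` come from the agent itself.

```python
import time
from typing import Callable


class TransientPushError(Exception):
    """A failure that may succeed on retry, such as a network issue."""


class PermanentPushError(Exception):
    """A failure that needs user action, such as a protected branch."""


def push_with_retry(
    push: Callable[[], str],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call push() until it succeeds, retrying only transient failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return push()
        except TransientPushError:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the user
            sleep(delay_seconds)
    raise AssertionError("unreachable")
```

A `PermanentPushError` is deliberately not caught, so it reaches the user on the first attempt and the agent can report it with suggested next steps.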
Safety Testing
Branch Protection
- Agent respects protected branch rules
- Agent cannot force push
- Agent respects push rules
- Agent respects required approvals
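The "cannot force push" rule could be tested by inspecting every git command the agent issues. This sketch assumes the harness can capture commands as argument lists; `is_force_push` is a hypothetical helper and is deliberately conservative, so it may flag some harmless commands (for example a `-o` push option whose value contains the letter f).

```python
from typing import List


def is_force_push(argv: List[str]) -> bool:
    """Return True if a git command line looks like a forced push."""
    if "push" not in argv:
        return False
    for arg in argv[argv.index("push") + 1:]:
        # --force, --force-with-lease[=...], --force-if-includes
        if arg.startswith("--force"):
            return True
        # Short options, possibly combined, e.g. -f or -fu
        if arg.startswith("-") and not arg.startswith("--") and "f" in arg[1:]:
            return True
        # A refspec with a leading '+' forces the update of that ref
        if arg.startswith("+"):
            return True
    return False
```

The zero-force-push criterion then becomes an assertion that no captured command returns True.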
Rollback Capability
- Agent can revert its commit if user requests
- Agent provides clear rollback instructions
- Rollback doesn't break MR state
Audit Trail
- All agent file edits are logged
- All git operations are logged
- User approvals are logged
- Errors are logged with context
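To make the audit trail checkable, each logged action could be written as one JSON object per line so tests can parse and assert on it. The field names below (`timestamp`, `event`, and the free-form context keys) are assumptions for illustration, not the agent's actual log schema.

```python
import json
from datetime import datetime, timezone


def audit_record(event: str, **context: object) -> str:
    """Serialize one audit entry as a single JSON line with a UTC timestamp."""
    record = {
        **context,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
    }
    return json.dumps(record, sort_keys=True)
```

For example, `audit_record("file_edit", path="src/app.py", tool="edit_file")` yields a line that a test can load with `json.loads` and check for the expected event and path.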
Conflict Resolution Quality
Resolution Accuracy
- Simple conflicts resolved correctly
- Multi-file conflicts resolved consistently
- Logic preserved after resolution
- No unintended side effects
Code Quality
- No syntax errors introduced
- Formatting preserved
- Imports/dependencies intact
- Tests still pass after resolution
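The syntax check can be scripted per language. As a sketch for the case where the resolved file is Python source (an assumption; other languages would need their own parser or compiler), the standard `ast` module is enough. The function name is hypothetical.

```python
import ast
from typing import Optional


def python_syntax_error(source: str) -> Optional[str]:
    """Return a short description of the first syntax error, or None if it parses."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None
```

A leftover conflict marker also fails this check, since a line such as `<<<<<<< HEAD` is not valid Python.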
Performance Testing
- Measure file edit operation time
- Measure commit creation time
- Measure push operation time
- Test with large files (1000+ lines)
- Test with many files (20+ conflicts)
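The per-step measurements can be collected with a small timing context manager and compared against the 30-second target for simple conflicts stated in the acceptance criteria. The helper names here are hypothetical, and the step labels are whatever the test chooses.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(step: str, timings: Dict[str, float]) -> Iterator[None]:
    """Record the wall-clock duration of the enclosed block under `step`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] = time.perf_counter() - start


def within_budget(timings: Dict[str, float], budget_seconds: float = 30.0) -> bool:
    """Check that the total of all recorded steps is under the budget."""
    return sum(timings.values()) < budget_seconds
```

A test would wrap the edit, commit, and push phases in `with timed("edit", timings):` and so on, then assert `within_budget(timings)`.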
Integration Testing
Full Workflow
- User clicks "Resolve with AI"
- Chat opens with agent
- Agent analyzes conflicts
- Agent presents plan
- User approves
- Agent executes (edit, commit, push)
- MR shows new commit
- MR is mergeable
- User can review commit
CI/CD Integration
- Pipeline triggers after agent push
- Tests run on agent's commit
- Agent reports CI status
- Agent suggests fixes if CI fails
Edge Cases
Complex Scenarios
- Merge conflicts + failing tests
- Conflicts in multiple branches
- Conflicts with stale branches
- Very old conflicts (100+ commits behind)
Boundary Conditions
- Empty file conflicts
- Single line conflicts
- Entire file conflicts
- Whitespace-only conflicts
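For the whitespace-only case, a test fixture generator or checker needs to tell whether the two sides of a conflict differ in anything other than whitespace. The sketch below parses a single conflict hunk and compares the sides with all whitespace removed; the function names are hypothetical, and the diff3 base section is skipped rather than compared.

```python
from typing import List, Optional, Tuple


def split_conflict(hunk: str) -> Tuple[List[str], List[str]]:
    """Split one conflict hunk into its 'ours' and 'theirs' lines."""
    ours: List[str] = []
    theirs: List[str] = []
    target: Optional[List[str]] = None
    for line in hunk.splitlines():
        if line.startswith("<<<<<<<"):
            target = ours
        elif line.startswith("|||||||"):
            target = None  # diff3 base section: ignored in this sketch
        elif line.startswith("======="):
            target = theirs
        elif line.startswith(">>>>>>>"):
            target = None
        elif target is not None:
            target.append(line)
    return ours, theirs


def is_whitespace_only(hunk: str) -> bool:
    """True if the two sides are identical once all whitespace is removed."""
    ours, theirs = split_conflict(hunk)
    strip_all = lambda lines: "".join("\n".join(lines).split())
    return strip_all(ours) == strip_all(theirs)
```

Removing every whitespace character is a strict definition; a looser variant could compare only indentation and trailing spaces.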
Acceptance Criteria
- Agent successfully resolves conflicts autonomously in >70% of test cases
- All file edits are correct and complete
- All commits are valid and pushable
- All pushes succeed (when permissions allow)
- Resolved MRs are actually mergeable
- No security vulnerabilities introduced
- Performance meets targets (<30s total for simple conflicts)
- Error handling is graceful for all failure modes
- Approval workflow works correctly
- Safety mechanisms prevent destructive actions
Success Criteria
Must achieve:
- ✅ 70%+ autonomous resolution success rate
- ✅ <5% resolutions need correction
- ✅ 0 force pushes or destructive actions
- ✅ 100% approval requests before execution
- ✅ Clear error messages for all failures
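The numeric thresholds above can be evaluated mechanically from recorded test runs. This is a sketch under stated assumptions: the `RunResult` fields are hypothetical, the correction rate is computed over resolved runs only, and the error-message criterion is left out because it needs human review.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RunResult:
    conflict_type: str
    resolved: bool
    needed_correction: bool
    force_pushed: bool
    approval_requested: bool


def check_success_criteria(results: List[RunResult]) -> Dict[str, bool]:
    """Evaluate the measurable success criteria over a list of test runs."""
    if not results:
        raise ValueError("no test runs recorded")
    resolved = [r for r in results if r.resolved]
    success_rate = len(resolved) / len(results)
    # Correction rate is taken over resolved runs; zero resolved means zero corrections.
    correction_rate = (
        sum(r.needed_correction for r in resolved) / len(resolved) if resolved else 0.0
    )
    return {
        "success_rate_at_least_70pct": success_rate >= 0.70,
        "corrections_under_5pct": correction_rate < 0.05,
        "no_force_pushes": not any(r.force_pushed for r in results),
        "approval_always_requested": all(r.approval_requested for r in results),
    }
```

Grouping the same results by `conflict_type` before calling the function gives the per-type success rates that the test results documentation calls for.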
Test Results Documentation
- Document success rate by conflict type
- Document common failure modes
- Document average execution time
- Document user feedback on autonomous behavior
- Document safety mechanism effectiveness
Prompt Refinement Based on Testing
- Adjust confidence thresholds
- Improve resolution strategies
- Enhance error recovery
- Optimize approval request clarity
- Improve commit message generation
Files Changed
- Agent configuration in AI Catalog (system prompt updates)
- Test fixtures/data as needed
- Test scripts for automation
Timeline
3-4 days (additional time for autonomous execution testing)
Related to epic &20688
Edited by Kai Armstrong