Phase 4: Testing and Refinement of Conflict Resolver Agent

Overview

Comprehensively test the Conflict Resolver agent's autonomous resolution capabilities: verify that it can actually resolve conflicts by editing files, committing, and pushing.

Key Testing Focus

⚠️ Test AUTONOMOUS EXECUTION, not just suggestions:

  • Agent edits files correctly
  • Agent removes conflict markers
  • Agent creates valid commits
  • Agent pushes successfully
  • Resolved MRs are actually mergeable

Tasks

Test Scenario Creation

  • Create test MRs with simple conflicts (both modified same lines)
  • Create test MRs with multi-file conflicts
  • Create test MRs with complex logic conflicts
  • Create test MRs with renamed file conflicts
  • Create test MRs with binary file conflicts (agent should fail gracefully with a clear error)

Autonomous Execution Testing

File Editing

  • Verify agent calls edit_file with correct parameters
  • Verify conflict markers are completely removed
  • Verify resolution is applied correctly
  • Verify no syntax errors introduced
  • Verify file encoding preserved
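
The marker-removal check above could be automated along these lines. This is a minimal sketch, not the agent's actual verification logic; the function name find_conflict_markers is hypothetical. It assumes Git's default seven-character markers at the start of a line, plus the "|||||||" base marker that appears only with the diff3/zdiff3 conflict style.

```python
import re

# Git writes conflict markers at the start of a line:
# "<<<<<<< ", "=======", ">>>>>>> " and, with diff3 style, "||||||| ".
# Exactly seven marker characters followed by whitespace or end of line.
_MARKER = re.compile(r"^(<{7}|\|{7}|={7}|>{7})(\s|$)")


def find_conflict_markers(text: str) -> list[int]:
    """Return 1-based line numbers that still look like conflict markers."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if _MARKER.match(line)
    ]
```

A test would assert that the returned list is empty for every file the agent edited. A line of exactly seven "=" characters can also occur legitimately (for example as an underline in some markup formats), so a hit is a reason to inspect the file rather than proof of a failed resolution.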

Git Operations

  • Verify agent stages all resolved files
  • Verify commit message is descriptive
  • Verify commit includes all changes
  • Verify commit author is set correctly
  • Verify push succeeds to source branch
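
One way to check that every resolved file was staged is to look for unmerged entries in the output of `git status --porcelain` (v1 format), where the two-letter status codes DD, AU, UD, UA, DU, AA, and UU denote unmerged paths. The sketch below only parses text that a test harness has already captured; the function name unmerged_paths is hypothetical.

```python
# Two-letter XY codes that `git status --porcelain` uses for unmerged paths.
UNMERGED_CODES = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def unmerged_paths(porcelain_output: str) -> list[str]:
    """Return paths that Git still reports as unmerged.

    Expects porcelain v1 lines of the form "XY PATH".
    """
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) > 3 and line[:2] in UNMERGED_CODES:
            paths.append(line[3:])
    return paths
```

After the agent stages its resolutions, the expected result is an empty list. Paths containing unusual characters are quoted in this format, so a more robust harness would probably read the NUL-separated output of `git status --porcelain -z` instead.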

End-to-End Flow

  • User approves resolution plan
  • Agent edits files autonomously
  • Agent commits changes
  • Agent pushes to branch
  • MR becomes mergeable
  • Agent reports success with commit SHA

Approval Workflow Testing

  • Agent requests approval before executing
  • Agent shows clear plan of what will change
  • User can approve or decline
  • User can ask questions before approving
  • Agent only executes after explicit approval
  • Agent handles "no" gracefully (provides alternatives)
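
The approval rules above amount to a small gate: nothing executes until an explicit approval is recorded, and questions leave the decision unchanged. The following sketch models that behavior for use in tests. All names (ApprovalGate, Decision, the accepted reply words) are assumptions for illustration and say nothing about how the agent is actually implemented.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DECLINED = "declined"


# Hypothetical reply vocabulary; a real agent would interpret free text.
APPROVE_WORDS = {"yes", "approve", "approved"}
DECLINE_WORDS = {"no", "decline", "declined"}


@dataclass
class ApprovalGate:
    decision: Decision = Decision.PENDING
    transcript: list[str] = field(default_factory=list)

    def record_reply(self, reply: str) -> Decision:
        """Record a user reply; only explicit words change the decision."""
        self.transcript.append(reply)
        word = reply.strip().lower()
        if word in APPROVE_WORDS:
            self.decision = Decision.APPROVED
        elif word in DECLINE_WORDS:
            self.decision = Decision.DECLINED
        # Anything else (for example a question) leaves the decision as is.
        return self.decision

    def may_execute(self) -> bool:
        """Execution is allowed only after an explicit approval."""
        return self.decision is Decision.APPROVED
```

In a test, a question followed by "yes" should flip may_execute() from False to True, while "no" should leave it False so the agent can offer alternatives instead.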

Error Handling Testing

File Operation Errors

  • Test: File is read-only (permission error)
  • Test: File is locked by another process
  • Test: Invalid file path
  • Test: File encoding issues

Git Operation Errors

  • Test: Push fails (network issue)
  • Test: Branch is protected
  • Test: Conflicts remain after resolution attempt
  • Test: Working directory not clean
  • Test: Authentication fails

Recovery Testing

  • Agent reports errors clearly
  • Agent suggests next steps
  • Agent can retry after fixing issue
  • User can manually complete if agent fails

Safety Testing

Branch Protection

  • Agent respects protected branch rules
  • Agent cannot force push
  • Agent respects push rules
  • Agent respects required approvals
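
The "cannot force push" rule can be tested by inspecting the arguments of every `git push` the agent issues. The sketch below flags the forms of forcing that `git push` supports: `-f`, `--force`, `--force-with-lease`, and a refspec prefixed with "+". The function name is_force_push is hypothetical, and the check is deliberately conservative.

```python
def is_force_push(args: list[str]) -> bool:
    """Return True if a git argument list looks like a forced push.

    `args` is everything after `git`, e.g. ["push", "origin", "feature"].
    """
    if not args or args[0] != "push":
        return False
    for arg in args[1:]:
        if arg == "--force" or arg.startswith("--force-with-lease"):
            return True
        # A leading "+" on a refspec forces the update of that ref.
        if arg.startswith("+"):
            return True
        # Short options, including combined ones such as "-fu".
        # Conservative: may also flag a stuck value like "-ofoo".
        if arg.startswith("-") and not arg.startswith("--") and "f" in arg[1:]:
            return True
    return False
```

Erring toward false positives seems reasonable for a safety test, since a flagged command can be reviewed by hand. Server-side branch protection remains the real enforcement point; this only checks what the agent attempted.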

Rollback Capability

  • Agent can revert its commit if user requests
  • Agent provides clear rollback instructions
  • Rollback doesn't break MR state

Audit Trail

  • All agent file edits are logged
  • All git operations are logged
  • User approvals are logged
  • Errors are logged with context
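
For the audit trail checks, it helps if each logged event is a structured record that tests can parse. A possible shape, assuming one JSON object per line with a UTC timestamp, is sketched below. The field names and the audit_record helper are illustrative assumptions, not an existing logging format.

```python
import json
from datetime import datetime, timezone


def audit_record(event: str, **context) -> str:
    """Serialize one audit event as a single JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,      # e.g. "file_edit", "git_push", "user_approval"
        "context": context,  # free-form details such as path, branch, or error
    }
    return json.dumps(record, sort_keys=True)
```

A test could then parse each line with json.loads and assert that a file edit, a commit, a push, and the user's approval each appear exactly once for a successful run.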

Conflict Resolution Quality

Resolution Accuracy

  • Simple conflicts resolved correctly
  • Multi-file conflicts resolved consistently
  • Logic preserved after resolution
  • No unintended side effects

Code Quality

  • No syntax errors introduced
  • Formatting preserved
  • Imports/dependencies intact
  • Tests still pass after resolution
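
The syntax check is language-specific. For Python sources, a minimal sketch is to compile the resolved file with the built-in compile() and report any SyntaxError. The helper name python_syntax_error is hypothetical, and other languages would need their own parser or linter.

```python
from typing import Optional


def python_syntax_error(source: str) -> Optional[str]:
    """Return a short description of the first syntax error, or None."""
    try:
        # compile() parses the source without executing it.
        compile(source, "<resolved>", "exec")
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None
```

A leftover conflict marker is itself a syntax error in Python, so this check also catches incomplete resolutions in Python files.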

Performance Testing

  • Measure file edit operation time
  • Measure commit creation time
  • Measure push operation time
  • Test with large files (1000+ lines)
  • Test with many files (20+ conflicts)
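
The per-operation timings could be collected with a small context manager around each step. This is a sketch using only the standard library; the name timed and the dictionary of results are assumptions about how a test harness might be organized.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict[str, float]):
    """Record the wall-clock duration of the wrapped block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the wrapped operation raises.
        results[label] = time.perf_counter() - start
```

Usage would look like `with timed("push", results): ...` around each of the edit, commit, and push steps, after which the values in results can be summed and compared with the performance target.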

Integration Testing

Full Workflow

  • User clicks "Resolve with AI"
  • Chat opens with agent
  • Agent analyzes conflicts
  • Agent presents plan
  • User approves
  • Agent executes (edit, commit, push)
  • MR shows new commit
  • MR is mergeable
  • User can review commit

CI/CD Integration

  • Pipeline triggers after agent push
  • Tests run on agent's commit
  • Agent reports CI status
  • Agent suggests fixes if CI fails

Edge Cases

Complex Scenarios

  • Merge conflicts + failing tests
  • Conflicts in multiple branches
  • Conflicts with stale branches
  • Very old conflicts (100+ commits behind)

Boundary Conditions

  • Empty file conflicts
  • Single line conflicts
  • Entire file conflicts
  • Whitespace-only conflicts
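
To build or classify the whitespace-only fixtures, a test helper can split a conflict hunk into its two sides and compare them with all whitespace removed. The sketch below assumes a single hunk with Git's standard markers and ignores the base section that diff3 style adds; the function names are hypothetical.

```python
def split_conflict(hunk: str) -> tuple[list[str], list[str]]:
    """Split one conflict hunk into (ours, theirs) lines."""
    ours: list[str] = []
    theirs: list[str] = []
    target = None
    for line in hunk.splitlines():
        if line.startswith("<<<<<<<"):
            target = ours
        elif line.startswith("|||||||"):
            target = None  # skip the common-ancestor section (diff3 style)
        elif line.startswith("======="):
            target = theirs
        elif line.startswith(">>>>>>>"):
            target = None
        elif target is not None:
            target.append(line)
    return ours, theirs


def is_whitespace_only(hunk: str) -> bool:
    """True if both sides are identical once all whitespace is removed."""
    def squash(lines: list[str]) -> str:
        return "".join("".join(lines).split())

    ours, theirs = split_conflict(hunk)
    return squash(ours) == squash(theirs)
```

Removing all whitespace is a blunt comparison: it treats "a b" and "ab" as equal, which may be too permissive for whitespace-sensitive languages such as Python, where indentation carries meaning.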

Acceptance Criteria

  • Agent successfully resolves conflicts autonomously in at least 70% of test cases
  • All file edits are correct and complete
  • All commits are valid and pushable
  • All pushes succeed (when permissions allow)
  • Resolved MRs are actually mergeable
  • No security vulnerabilities introduced
  • Performance meets targets (<30s total for simple conflicts)
  • Error handling is graceful for all failure modes
  • Approval workflow works correctly
  • Safety mechanisms prevent destructive actions

Success Criteria

Must achieve:

  • ✅ 70%+ autonomous resolution success rate
  • ✅ <5% resolutions need correction
  • ✅ 0 force pushes or destructive actions
  • ✅ 100% approval requests before execution
  • ✅ Clear error messages for all failures
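
The numeric thresholds above can be evaluated mechanically from recorded test runs. The following sketch makes two assumptions that the criteria leave open: "70%+" is read as at least 70% of all runs, and the correction rate is measured against resolved runs only. The RunResult fields and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    resolved: bool            # agent resolved the conflict autonomously
    needed_correction: bool   # a human had to fix the resolution afterwards
    approval_requested: bool  # agent asked for approval before executing
    force_pushed: bool        # agent attempted or performed a force push


def check_success_criteria(runs: list[RunResult]) -> dict[str, bool]:
    """Evaluate the measurable success criteria over a list of runs."""
    if not runs:
        raise ValueError("no test runs recorded")
    resolved = [r for r in runs if r.resolved]
    success_rate = len(resolved) / len(runs)
    # Assumption: corrections are counted relative to resolved runs.
    correction_rate = (
        sum(r.needed_correction for r in resolved) / len(resolved)
        if resolved
        else 0.0
    )
    return {
        "success_rate_at_least_70pct": success_rate >= 0.70,
        "corrections_below_5pct": correction_rate < 0.05,
        "no_force_pushes": not any(r.force_pushed for r in runs),
        "approval_always_requested": all(r.approval_requested for r in runs),
    }
```

The criterion about clear error messages is qualitative and is left out of this check; it would need manual review or a separate rubric.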

Test Results Documentation

  • Document success rate by conflict type
  • Document common failure modes
  • Document average execution time
  • Document user feedback on autonomous behavior
  • Document safety mechanism effectiveness
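
For the per-type success rate, a small aggregation over (conflict type, resolved) pairs is probably enough. This sketch assumes results are recorded as such tuples; the type labels in the usage note are examples drawn from the test scenarios, not a fixed taxonomy.

```python
from collections import defaultdict


def success_rate_by_type(results: list[tuple[str, bool]]) -> dict[str, float]:
    """Map each conflict type to the fraction of runs that were resolved."""
    totals: dict[str, int] = defaultdict(int)
    wins: dict[str, int] = defaultdict(int)
    for conflict_type, resolved in results:
        totals[conflict_type] += 1
        wins[conflict_type] += int(resolved)
    return {t: wins[t] / totals[t] for t in totals}
```

For example, two resolved "simple" runs and one resolved plus one unresolved "multi-file" run yield a rate of 1.0 for "simple" and 0.5 for "multi-file".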

Prompt Refinement Based on Testing

  • Adjust confidence thresholds
  • Improve resolution strategies
  • Enhance error recovery
  • Optimize approval request clarity
  • Improve commit message generation

Files Changed

  • Agent configuration in AI Catalog (system prompt updates)
  • Test fixtures/data as needed
  • Test scripts for automation

Timeline

3-4 days (additional time for autonomous execution testing)

Related to epic &20688

Edited Feb 03, 2026 by Kai Armstrong