Phase 4: Testing and Refinement of Conflict Resolver Agent

Overview

Comprehensively test the Conflict Resolver agent's autonomous resolution capabilities, verifying that it can actually resolve conflicts by editing files, committing, and pushing.

Key Testing Focus

⚠️ Test AUTONOMOUS EXECUTION, not just suggestions:

Agent edits files correctly
Agent removes conflict markers
Agent creates valid commits
Agent pushes successfully
Resolved MRs are actually mergeable

Tasks

Test Scenario Creation

Create test MRs with simple conflicts (both branches modified the same lines)
Create test MRs with multi-file conflicts
Create test MRs with complex logic conflicts
Create test MRs with renamed file conflicts
Create test MRs with binary file conflicts (agent should fail gracefully with a clear error)
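
Real test MRs need genuinely diverging branches, but unit-level checks can work from text that merely has the shape of a Git conflict hunk. A minimal sketch of such a fixture helper follows; `make_conflict_text` and its labels are hypothetical names chosen for illustration and are not part of any existing test suite.

```python
def make_conflict_text(ours, theirs, ours_label="HEAD", theirs_label="feature"):
    """Return text shaped like a default-style Git conflict hunk.

    `ours` and `theirs` are lists of lines without trailing newlines.
    """
    lines = [
        f"<<<<<<< {ours_label}",
        *ours,
        "=======",
        *theirs,
        f">>>>>>> {theirs_label}",
    ]
    return "\n".join(lines) + "\n"


# Example: both sides changed the same line.
fixture = make_conflict_text(["timeout = 30"], ["timeout = 60"])
```

This only covers the "simple conflict" shape; renamed-file and binary-file conflicts do not produce in-file markers and would still need real branches.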

Autonomous Execution Testing

File Editing

Verify agent calls edit_file with correct parameters
Verify conflict markers are completely removed
Verify resolution is applied correctly
Verify no syntax errors introduced
Verify file encoding preserved
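
One way to automate the "conflict markers are completely removed" check is to scan each edited file for lines that look like Git's markers. The sketch below assumes the default seven-character markers; `find_conflict_markers` is an illustrative name, not an existing helper.

```python
import re

# Git's default markers are exactly seven characters at the start of a line,
# optionally followed by a space and a label. "|||||||" only appears when the
# diff3 or zdiff3 conflict style is enabled.
MARKER_RE = re.compile(r"(<{7}|={7}|>{7}|\|{7})( .*)?")


def find_conflict_markers(text):
    """Return the 1-based line numbers that look like conflict markers."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MARKER_RE.fullmatch(line)
    ]


conflicted = "a = 1\n<<<<<<< HEAD\nb = 2\n=======\nb = 3\n>>>>>>> feature\n"
resolved = "a = 1\nb = 3\n"
```

A bare `=======` line can also occur legitimately (for example as a heading underline in some text formats), so a hit from this check is a prompt for review rather than proof of a leftover conflict.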

Git Operations

Verify agent stages all resolved files
Verify commit message is descriptive
Verify commit includes all changes
Verify commit author is set correctly
Verify push succeeds to source branch
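
The commit checks above could be scripted by reading the head commit with `git log` and comparing it against expectations. This is a sketch under assumptions: the expected author email and the minimum subject length are placeholder criteria, since the issue does not define what counts as "descriptive" or which author identity the agent should use.

```python
import subprocess
from dataclasses import dataclass

# %H = full SHA, %an = author name, %ae = author email, %s = subject,
# %x00 = NUL byte used as a field separator.
LOG_FORMAT = "%H%x00%an%x00%ae%x00%s"


@dataclass
class CommitInfo:
    sha: str
    author_name: str
    author_email: str
    subject: str


def parse_commit(raw):
    """Parse one line of `git log --format=<LOG_FORMAT>` output."""
    sha, name, email, subject = raw.rstrip("\n").split("\x00", 3)
    return CommitInfo(sha, name, email, subject)


def head_commit(repo_dir):
    """Read the most recent commit of the repository at repo_dir."""
    result = subprocess.run(
        ["git", "log", "-1", f"--format={LOG_FORMAT}"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return parse_commit(result.stdout)


def commit_problems(info, expected_email, min_subject_len=15):
    """Return a list of human-readable problems; empty means the checks pass."""
    problems = []
    if info.author_email != expected_email:
        problems.append("unexpected author email")
    if len(info.subject.strip()) < min_subject_len:
        problems.append("subject too short to be descriptive")
    return problems
```

Whether all resolved files were staged can be checked separately, for example by confirming that `git diff --name-only --diff-filter=U` prints nothing after the agent's commit.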

End-to-End Flow

User approves resolution plan
Agent edits files autonomously
Agent commits changes
Agent pushes to branch
MR becomes mergeable
Agent reports success with commit SHA

Approval Workflow Testing

Agent requests approval before executing
Agent shows clear plan of what will change
User can approve or decline
User can ask questions before approving
Agent only executes after explicit approval
Agent handles "no" gracefully (provides alternatives)
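
The "only executes after explicit approval" requirement can be expressed as a small gate that the tests exercise directly. The following is a sketch, assuming the user's reply arrives as plain text; the accepted words and the names `classify_reply` and `run_if_approved` are illustrative, not the agent's actual interface.

```python
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    DECLINED = "declined"
    UNCLEAR = "unclear"


APPROVE_WORDS = {"yes", "approve", "approved"}
DECLINE_WORDS = {"no", "decline", "declined"}


def classify_reply(reply):
    """Map a user reply to a decision; anything ambiguous is UNCLEAR."""
    word = reply.strip().lower().rstrip(".!")
    if word in APPROVE_WORDS:
        return Decision.APPROVED
    if word in DECLINE_WORDS:
        return Decision.DECLINED
    return Decision.UNCLEAR


def run_if_approved(reply, execute):
    """Call execute() only on explicit approval; return (decision, result)."""
    decision = classify_reply(reply)
    if decision is Decision.APPROVED:
        return decision, execute()
    return decision, None


executed = []


def fake_execute():
    # Stand-in for the real edit, commit, and push steps.
    executed.append("edit, commit, push")
    return "done"


question = run_if_approved("What will change in app.py?", fake_execute)
declined = run_if_approved("No.", fake_execute)
approved = run_if_approved("Yes", fake_execute)
```

Treating every unrecognized reply as `UNCLEAR` keeps questions and partial answers from being mistaken for approval, which is the conservative default for an agent that pushes commits.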

Error Handling Testing

File Operation Errors

Test: File is read-only (permission error)
Test: File is locked by another process
Test: Invalid file path
Test: File encoding issues

Git Operation Errors

Test: Push fails (network issue)
Test: Branch is protected
Test: Conflicts remain after resolution attempt
Test: Working directory not clean
Test: Authentication fails

Recovery Testing

Agent reports errors clearly
Agent suggests next steps
Agent can retry after fixing issue
User can manually complete if agent fails

Safety Testing

Branch Protection

Agent respects protected branch rules
Agent cannot force push
Agent respects push rules
Agent respects required approvals
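
To test the "cannot force push" rule independently of server-side protection, a harness could inspect the arguments the agent passes to `git push`. This sketch assumes the agent's push is observable as an argument list; `is_force_push` is a hypothetical helper and is deliberately conservative rather than exhaustive.

```python
def is_force_push(push_args):
    """Return True if the arguments to `git push` request a forced update.

    push_args is everything after "git push", e.g. ["origin", "feature"].
    """
    for arg in push_args:
        # Covers --force, --force-with-lease[=...] and --force-if-includes.
        if arg.startswith("--force"):
            return True
        # A leading "+" on a refspec forces the update of that ref.
        if arg.startswith("+"):
            return True
        # Short options may be combined, e.g. "-fu". This can also flag an
        # attached "-o" value containing "f"; a false positive is acceptable
        # for a safety check.
        if arg.startswith("-") and not arg.startswith("--") and "f" in arg[1:]:
            return True
    return False
```

Other destructive options such as `--mirror` or `--delete` are not covered here and would need their own checks.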

Rollback Capability

Agent can revert its commit if user requests
Agent provides clear rollback instructions
Rollback doesn't break MR state
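
A rollback that avoids rewriting history can be built from `git revert` followed by a normal push. The sketch below only constructs the commands so that tests can assert on them; the function name and the remote and branch parameters are assumptions, not the agent's real interface.

```python
import re

# Abbreviated or full hex object names (SHA-1 or SHA-256 repositories).
SHA_RE = re.compile(r"[0-9a-f]{7,64}")


def rollback_commands(sha, remote, branch):
    """Return the git commands that undo commit `sha` without force pushing."""
    if not SHA_RE.fullmatch(sha):
        # Reject anything that is not a plain hex SHA, including option-like input.
        raise ValueError(f"not a commit SHA: {sha!r}")
    return [
        ["git", "revert", "--no-edit", sha],  # adds a new commit undoing `sha`
        ["git", "push", remote, branch],      # ordinary, non-forced push
    ]
```

Because the revert is itself a new commit, the conflict it undoes will typically reappear on the MR, so "doesn't break MR state" here means the MR returns to its pre-resolution state rather than staying mergeable.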

Audit Trail

All agent file edits are logged
All git operations are logged
User approvals are logged
Errors are logged with context
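
Audit checks are easier to automate when every event is written as one structured record. A minimal sketch using JSON lines follows; the field names (`ts`, `event`, `context`) and the event names in the example are assumptions, not an existing log schema.

```python
import json
from datetime import datetime, timezone


def audit_record(event, **context):
    """Serialize one audit event as a single JSON line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "event": event,
        "context": context,  # kept nested so it cannot overwrite ts or event
    }
    return json.dumps(record, sort_keys=True)


lines = [
    audit_record("user_approval", decision="approved"),
    audit_record("file_edit", path="app.py"),
    audit_record("git_push", branch="feature", ok=False, error="authentication failed"),
]
```

A test can then parse the log and assert that each file edit, git operation, approval, and error produced exactly one record with the expected context.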

Conflict Resolution Quality

Resolution Accuracy

Simple conflicts resolved correctly
Multi-file conflicts resolved consistently
Logic preserved after resolution
No unintended side effects

Code Quality

No syntax errors introduced
Formatting preserved
Imports/dependencies intact
Tests still pass after resolution

Performance Testing

Measure file edit operation time
Measure commit creation time
Measure push operation time
Test with large files (1000+ lines)
Test with many files (20+ conflicts)
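
The per-step measurements can be collected with a small timing wrapper and compared against a budget. This sketch uses the 30-second total that the acceptance criteria give for simple conflicts; `measure` and `within_budget` are illustrative names, and the step callables stand in for the real edit, commit, and push operations.

```python
import time

# Target from the acceptance criteria: under 30 s total for simple conflicts.
SIMPLE_CONFLICT_BUDGET_S = 30.0


def measure(step):
    """Run step() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = step()
    return result, time.perf_counter() - start


def within_budget(durations, budget_s=SIMPLE_CONFLICT_BUDGET_S):
    """Return True if the summed step durations are strictly under the budget."""
    return sum(durations.values()) < budget_s
```

For example, `within_budget({"edit": 1.5, "commit": 0.5, "push": 4.0})` passes, while three steps that add up to exactly 30 seconds do not.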

Integration Testing

Full Workflow

User clicks "Resolve with AI"
Chat opens with agent
Agent analyzes conflicts
Agent presents plan
User approves
Agent executes (edit, commit, push)
MR shows new commit
MR is mergeable
User can review commit

CI/CD Integration

Pipeline triggers after agent push
Tests run on agent's commit
Agent reports CI status
Agent suggests fixes if CI fails

Edge Cases

Complex Scenarios

Merge conflicts + failing tests
Conflicts in multiple branches
Conflicts with stale branches
Very old conflicts (100+ commits behind)

Boundary Conditions

Empty file conflicts
Single line conflicts
Entire file conflicts
Whitespace-only conflicts
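
Whitespace-only conflicts can be recognized mechanically, which is useful both for building fixtures and for checking that the agent treats them as trivial. The sketch below assumes the default two-sided marker style (no `|||||||` base section); the function names are hypothetical.

```python
def split_hunk(hunk):
    """Split one conflict hunk into (ours_lines, theirs_lines)."""
    ours, theirs = [], []
    side = None
    for line in hunk.splitlines():
        if line.startswith("<<<<<<<"):
            side = ours
        elif line.startswith("=======") and side is ours:
            side = theirs
        elif line.startswith(">>>>>>>"):
            side = None
        elif side is not None:
            side.append(line)
    return ours, theirs


def is_whitespace_only(hunk):
    """True if the two sides differ only in spaces, tabs, or blank lines."""
    ours, theirs = split_hunk(hunk)

    def squash(lines):
        # Drop every whitespace character, including line boundaries.
        return "".join("".join(lines).split())

    return ours != theirs and squash(ours) == squash(theirs)


spacing = "<<<<<<< HEAD\nx = 1\n=======\nx  =  1\n>>>>>>> feature\n"
value = "<<<<<<< HEAD\nx = 1\n=======\nx = 2\n>>>>>>> feature\n"
```

Note that in whitespace-sensitive languages such as Python or YAML a whitespace-only difference can still change behavior, so this classification should not by itself decide the resolution.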

Acceptance Criteria

Agent successfully resolves conflicts autonomously in at least 70% of test cases
All file edits are correct and complete
All commits are valid and pushable
All pushes succeed (when permissions allow)
Resolved MRs are actually mergeable
No security vulnerabilities introduced
Performance meets targets (<30s total for simple conflicts)
Error handling is graceful for all failure modes
Approval workflow works correctly
Safety mechanisms prevent destructive actions

Success Criteria

Must achieve:

✅ 70%+ autonomous resolution success rate
✅ <5% of resolutions need correction
✅ 0 force pushes or destructive actions
✅ 100% approval requests before execution
✅ Clear error messages for all failures
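
The numeric thresholds above can be evaluated directly from recorded test runs. This is a sketch under assumptions: the `RunResult` fields are hypothetical, and the correction rate is computed over resolved runs only, which is one reading of "resolutions". The error-message criterion is qualitative and is not covered.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    resolved: bool            # agent resolved the conflict autonomously
    needed_correction: bool   # a human had to fix the resolution afterwards
    force_pushed: bool        # any force push or destructive action occurred
    approval_requested: bool  # agent asked for approval before executing


def check_success_criteria(runs):
    """Return a pass/fail flag for each numeric success criterion."""
    if not runs:
        raise ValueError("no runs to evaluate")
    total = len(runs)
    resolved = sum(r.resolved for r in runs)
    corrected = sum(r.resolved and r.needed_correction for r in runs)
    return {
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        "success_rate_at_least_70pct": resolved * 100 >= 70 * total,
        "corrections_under_5pct": corrected * 100 < 5 * resolved if resolved else True,
        "no_force_pushes": not any(r.force_pushed for r in runs),
        "approval_always_requested": all(r.approval_requested for r in runs),
    }


good_runs = [RunResult(True, False, False, True)] * 8 + [RunResult(False, False, False, True)] * 2
report = check_success_criteria(good_runs)
```

With eight of ten runs resolved and no corrections, every flag in `report` is true; a single run that force pushes or skips the approval request flips the corresponding flag.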

Test Results Documentation

Document success rate by conflict type
Document common failure modes
Document average execution time
Document user feedback on autonomous behavior
Document safety mechanism effectiveness
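
The success rate by conflict type can be derived from the raw run records with a simple grouping step. The sketch below assumes each result is recorded as a `(conflict_type, succeeded)` pair; the type labels are examples taken from the test scenarios in this issue, and the outcomes are made up for illustration.

```python
from collections import defaultdict


def success_rate_by_type(results):
    """Map each conflict type to its fraction of successful runs."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for conflict_type, succeeded in results:
        totals[conflict_type] += 1
        wins[conflict_type] += int(succeeded)
    return {t: wins[t] / totals[t] for t in totals}


# Illustrative data only, not real test results.
sample = [
    ("simple", True),
    ("simple", True),
    ("multi-file", True),
    ("multi-file", False),
    ("binary", False),
]
rates = success_rate_by_type(sample)
```

Reporting the number of runs per type alongside each rate helps readers judge how much weight a given percentage deserves.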

Prompt Refinement Based on Testing

Adjust confidence thresholds
Improve resolution strategies
Enhance error recovery
Optimize approval request clarity
Improve commit message generation

Files Changed

Agent configuration in AI Catalog (system prompt updates)
Test fixtures/data as needed
Test scripts for automation

Timeline

3-4 days (additional time for autonomous execution testing)

Related to epic &20688

Edited Feb 03, 2026 by Kai Armstrong