Spike: Refactoring Merge Request Approval Policies with Strategy Pattern

Background

Our current implementation of Merge Request Approval Policies is monolithic and handles violation detection in a complex way that makes debugging difficult, testing challenging, and extending functionality with new features problematic. The current code mixes multiple responsibilities (finding violations, logging, rule evaluation) in a single service, making maintenance increasingly difficult.

Current Implementation Analysis

Looking at the provided UpdateApprovalsService, we can see:

  • Complex rule evaluation logic mixed with data gathering and memoization
  • Limited separation between different rule types (scan_finding, license_finding, any_merge_request)
  • Challenging violation reporting with multiple intertwined methods and nested conditionals
  • Difficult error handling with multiple exit points and conditions

Problem Statement

We need to refactor our Merge Request Approval Policies implementation to:

  1. Clearly separate different rule types and their evaluation logic
  2. Provide consistent, detailed violation reporting
  3. Make it easier to add new rule types or modify existing ones
  4. Improve testability by allowing unit testing of individual rule strategies
  5. Maintain backward compatibility and performance
  6. Enable future expansion for advanced policy types, including Rego language policies

Expected Outcomes

Upon completion of this spike, we expect to deliver:

  1. Architecture Document: Detailed design of the new system, including class diagrams, interaction patterns, and data flow
  2. Proof of Concept: Implementation of at least one rule strategy (likely scan_finding) that demonstrates the viability of the approach
  3. Performance Analysis: Benchmarks comparing the current vs. proposed implementation
  4. Test Strategy: Documentation of how unit tests would be structured with examples
  5. Feature Flag Plan: Clear approach for safely deploying the changes behind a feature flag
  6. Migration Path: Step-by-step plan for transitioning from the current implementation to the new one
  7. Rego Integration Design: Initial design for future OPA/Rego integration showing how it would leverage the new architecture
  8. Risk Assessment: Identification of potential risks and mitigation strategies

Success criteria for the spike:

  1. Functional Equivalence: Demonstrated identical behavior between current and new implementations for key scenarios
  2. Performance Acceptance: Evidence that the new approach maintains or improves performance
  3. Simplicity Improvement: Quantifiable reduction in code complexity (measured by metrics like cyclomatic complexity)
  4. Extensibility Demonstration: Clear example of how a new rule type could be added with minimal changes
  5. Testing Improvement: Evidence that unit test coverage can be significantly improved

These deliverables will enable us to make an informed go/no-go decision on proceeding with the full implementation.

Proposed Architecture

Core Service

Create a new centralized service that orchestrates policy evaluation:

module Security
  module MergeRequestApprovalPolicies
    class EvaluationService
      def initialize(merge_request:, pipeline:)
        @merge_request = merge_request
        @pipeline = pipeline
        @policies = load_applicable_policies
      end
      
      def execute
        return unless should_evaluate?
        
        violations = collect_violations
        update_approval_rules(violations)
        
        # Return structured violations information for bot comments
        violations
      end
      
      private
      
      def should_evaluate?
        @pipeline.complete_or_manual? && @pipeline.can_store_security_reports?
      end
      
      def collect_violations
        @policies.map do |policy|
          policy_evaluator = policy_evaluator_for(policy)
          {
            policy: policy,
            violated: policy_evaluator.violated?,
            violations: policy_evaluator.violations,
            error: policy_evaluator.error
          }
        end.compact
      end
      
      # Other supporting methods...
    end
  end
end

Strategy Pattern Implementation

Create a base strategy class and specific implementations for each rule type:

module Security
  module MergeRequestApprovalPolicies
    module RuleStrategies
      class Base
        attr_reader :rule, :merge_request, :pipeline, :context
        
        def initialize(rule:, merge_request:, pipeline:, context: {})
          @rule = rule
          @merge_request = merge_request
          @pipeline = pipeline
          @context = context
          @error = nil
        end
        
        def violated?
          # Quick check if rule is violated
          raise NotImplementedError
        end
        
        def violations
          # Return detailed violations data
          raise NotImplementedError
        end
        
        def error
          @error
        end
        
        protected
        
        def set_error(error_type, details = {})
          @error = { type: error_type, details: details }
        end
      end
      
      class ScanFinding < Base
        def violated?
          return false if error_condition?
          
          # Implementation for security scan findings
          missing_scanners? || findings_count_exceeded?
        end
        
        def violations
          return [] unless violated?
          
          if missing_scanners?
            build_missing_scanners_violation
          else
            build_findings_violations
          end
        end
        
        private
        
        def error_condition?
          # Check for conditions that would prevent evaluation
          # Set error if needed
          false
        end
        
        def missing_scanners?
          missing_scanners.any?
        end
        
        def missing_scanners
          # Logic to detect missing scanners
        end
        
        def findings_count_exceeded?
          # Logic to check if findings exceed allowed count
        end
        
        # Other supporting methods...
      end
      
      class LicenseFinding < Base
        # Similar implementation for license findings
      end
      
      class AnyMergeRequest < Base
        # Implementation for any_merge_request rule type
      end
      
      class Rego < Base
        # Future implementation for Rego/OPA based policies
        def violated?
          evaluate_rego_policy
        end
        
        def violations
          return [] unless violated?
          
          build_rego_violations
        end
        
        private
        
        def evaluate_rego_policy
          # Logic to evaluate Rego policy against MR data
        end
        
        def build_rego_violations
          # Structure violations based on Rego evaluation result
        end
      end
    end
  end
end

Factory for Strategy Creation

module Security
  module MergeRequestApprovalPolicies
    class StrategyFactory
      STRATEGIES = {
        'scan_finding' => RuleStrategies::ScanFinding,
        'license_finding' => RuleStrategies::LicenseFinding,
        'any_merge_request' => RuleStrategies::AnyMergeRequest,
        'rego' => RuleStrategies::Rego
      }.freeze
      
      def self.for(rule, merge_request, pipeline, context = {})
        strategy_class = STRATEGIES[rule.type] || RuleStrategies::Base
        strategy_class.new(
          rule: rule,
          merge_request: merge_request,
          pipeline: pipeline,
          context: context
        )
      end
    end
  end
end

Policy Evaluator

An object to evaluate all rules within a policy:

module Security
  module MergeRequestApprovalPolicies
    class PolicyEvaluator
      attr_reader :policy, :merge_request, :pipeline
      
      def initialize(policy:, merge_request:, pipeline:)
        @policy = policy
        @merge_request = merge_request
        @pipeline = pipeline
        @rule_strategies = build_rule_strategies
      end
      
      def violated?
        @rule_strategies.any?(&:violated?)
      end
      
      def violations
        @rule_strategies.select(&:violated?).flat_map(&:violations)
      end
      
      def error
        errors = @rule_strategies.map(&:error).compact
        return nil if errors.empty?
        
        errors.first # Or combine errors if needed
      end
      
      private
      
      def build_rule_strategies
        context = build_context
        
        policy.rules.map do |rule|
          StrategyFactory.for(rule, merge_request, pipeline, context)
        end
      end
      
      def build_context
        # Build context for strategies
        {
          related_pipeline_ids: find_related_pipeline_ids,
          target_pipeline_ids: find_target_pipeline_ids,
          # Other context data...
        }
      end
      
      # Other supporting methods...
    end
  end
end

Rego/OPA Integration Benefits

A significant benefit of this architectural approach is the clean path it creates for implementing support for Open Policy Agent (OPA) with Rego policies. This planned future enhancement would:

  1. Enable Policy as Code: Allow customers to define complex approval policies using the Rego language
  2. Standardize Policy Definitions: Use an industry-standard policy language rather than proprietary formats
  3. Increase Policy Flexibility: Support complex conditional logic that may be difficult to express in current rule types
  4. Enable Policy Reuse: Allow policies to be shared across different platforms and systems

The strategy pattern makes this integration straightforward by:

  1. Creating a new Rego strategy class that handles policy evaluation
  2. Defining a standard interface for providing merge request data to the Rego engine
  3. Mapping Rego evaluation results back to our standard violation format
  4. Enabling customers to use both built-in rule types and Rego policies side-by-side

Without this refactoring, integrating Rego policies would require significant changes to our monolithic service, increasing the risk of regressions.

Structured Violation Data Format

To ensure consistency in violation reporting, define a standard format:

# Example violation structure
{
  rule_id: 123,
  rule_type: 'scan_finding',
  reason: 'findings_exceeded',
  details: {
    allowed: 0,
    found: 2,
    findings: [
      {
        uuid: '123abc',
        name: 'Cross-site scripting',
        severity: 'high',
        scanner: 'sast',
        newly_detected: true
      },
      # More findings...
    ]
  }
}

Migration Strategy

  1. Feature Flag Protection: Implement the new system behind a feature flag
  2. Parallel Implementation: Keep both systems running in parallel initially
  3. Comparison Testing: Validate that both systems produce identical results
  4. Gradual Rollout: Enable for increasing percentages of MRs
  5. Monitoring: Add detailed logging to compare old vs. new behavior

Key Technical Challenges to Address

  1. Data Access Optimization: Ensure we're not making redundant database queries
  2. Performance Parity: Benchmark against current implementation
  3. Context Sharing: Efficiently share contextual data between strategies
  4. Error Handling: Consistent approach to error conditions
  5. Test Coverage: Comprehensive test suite for each strategy
  6. OPA Integration: Design for future OPA integration without current implementation complexity

Success Criteria

  1. Functional Equivalence: New implementation produces identical results to current one
  2. Improved Testability: Unit tests for individual rule strategies
  3. Extensibility: Demonstrate adding a new rule type or exception case
  4. Performance: Equal or better performance metrics
  5. Clear Violations: Structured, detailed violation reporting
  6. Path to OPA: Clear design for future Rego policy integration

Timeline and Resources

  • Investigation and design: 1 week
  • Implementation of core structure: 1 week
  • Implementation of individual strategies: 1 week
  • Testing and validation: 1 week
  • Documentation and knowledge sharing: 0.5 week
  • (Future) OPA/Rego integration design: 1 week
  • (Future) OPA/Rego implementation: 2 weeks

Total for current refactoring: 4.5 weeks with 1-2 developers Future OPA integration: Additional 3 weeks

Next Steps

  1. Create detailed class diagrams for the proposed architecture
  2. Develop a proof of concept with one rule type
  3. Setup testing infrastructure for comparing old vs. new implementation
  4. Define feature flag strategy for safe deployment
  5. Create detailed migration plan with rollback options
  6. Draft initial design for Rego policy integration
Edited by Alan (Maciej) Paruszewski