GLAS | Optimize multi-core scanning for balanced execution time

Current Situation

The multi-core feature currently splits the list of rules used by the engine into one group per available core (e.g., for 5 cores, rules are split into 5 groups). However, the split is based solely on the order of the rules on disk and does not account for how long each rule takes to run.

Problem

This naive distribution can lead to significant imbalances in scan duration across cores, as highlighted in customer feedback (see: #514156 (comment 2327291172)).

Proposed Solution

Implement a cached artifact-based approach to store and utilize rule execution times for optimized distribution:

First Scan ("Alignment Scan"): Create timing stats artifact
- Generate a JSON file mapping rule IDs to execution times (rule ID -> milliseconds)
- Note: This initial scan is not optimized, because no timing data exists yet
Subsequent Scans: Optimize Distribution
- Load cached timings during Lightz-AIO init
- Distribute rules based on execution times to balance load
- Share cache across scans via artifact publishing
- Expected result: improved performance and balanced execution times across cores
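
A minimal sketch of the cache round trip described above, assuming the artifact is a flat JSON object of rule ID -> milliseconds. The function names `save_timings` and `load_timings` and the file layout are hypothetical, not existing GLAS code. `load_timings` returns `None` on any problem so that the caller can fall back to the current splitting method.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict, Optional


def save_timings(path: Path, timings_ms: Dict[str, float]) -> None:
    """Write rule ID -> milliseconds as JSON (hypothetical cache format)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file first, then rename, so readers never see a partial file.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(timings_ms, tmp, indent=2, sort_keys=True)
    os.replace(tmp_name, str(path))


def load_timings(path: Path) -> Optional[Dict[str, float]]:
    """Return cached timings, or None if the cache is missing or unusable."""
    try:
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):  # missing file, unreadable file, or invalid JSON
        return None
    if not isinstance(data, dict):
        return None
    timings: Dict[str, float] = {}
    for rule_id, ms in data.items():
        # Reject booleans, non-numbers, and negative durations.
        if isinstance(ms, bool) or not isinstance(ms, (int, float)) or ms < 0:
            return None
        timings[rule_id] = float(ms)
    return timings
```

Treating a malformed cache the same as a missing one keeps the first ("alignment") scan and a corrupted-cache scan on the same code path.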

Implementation Plan

1. Timing Collection

Use the stats the engine publishes when run with the -testing flag
Collect per-rule scan timing data
Generate initial cache file

Note: Stats collection is currently not straightforward because it requires parsing the engine logs. We should implement a better way to collect these stats.

2. Cache Distribution

Research how exactly we can share a cache artifact between different GLAS scans
- We would like to keep it as a "hidden" artifact like the report artifact.

3. Distribution Algorithm

Load cache at startup
Sort rules by execution time
Use greedy distribution:
- Start with longest-running rules
- Assign to core with lowest total time
Fall back to the current method if no cache is available
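
The steps above can be sketched as a longest-first greedy assignment using a min-heap of per-core totals. This is an illustration under assumptions, not the engine's actual code: `distribute_rules` and `split_in_order` are hypothetical names, `split_in_order` is only a stand-in for the current method (assumed here to be contiguous chunks in on-disk order), and rules without a cached timing are assumed to cost the mean of the known timings.

```python
import heapq
from statistics import mean
from typing import Dict, List, Optional, Sequence


def split_in_order(rule_ids: Sequence[str], num_cores: int) -> List[List[str]]:
    """Stand-in for the current method: contiguous chunks in on-disk order."""
    size, extra = divmod(len(rule_ids), num_cores)
    groups, start = [], 0
    for i in range(num_cores):
        end = start + size + (1 if i < extra else 0)
        groups.append(list(rule_ids[start:end]))
        start = end
    return groups


def distribute_rules(
    rule_ids: Sequence[str],
    num_cores: int,
    timings_ms: Optional[Dict[str, float]] = None,
) -> List[List[str]]:
    """Assign rules to cores, longest first, always to the least-loaded core."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    if not timings_ms:
        # No cache (or an empty one): fall back to the current method.
        return split_in_order(rule_ids, num_cores)

    # Assumption: a rule with no cached timing costs the mean of known timings.
    default_ms = mean(timings_ms.values())
    cost = {r: timings_ms.get(r, default_ms) for r in rule_ids}

    # Longest-running rules first; the sort is stable, so ties keep on-disk order.
    ordered = sorted(rule_ids, key=lambda r: cost[r], reverse=True)

    # Heap of (total time so far, core index); the smallest total is popped first.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    groups: List[List[str]] = [[] for _ in range(num_cores)]
    for rule in ordered:
        total, core = heapq.heappop(heap)
        groups[core].append(rule)
        heapq.heappush(heap, (total + cost[rule], core))
    return groups
```

For example, with timings `{"a": 90, "b": 50, "c": 40, "d": 10, "e": 10}` and 2 cores, this assigns `a` and `d` to one core and `b`, `c`, `e` to the other, for 100 ms each. The greedy approach does not guarantee an optimal split in general, but it is simple and runs in O(n log n).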

4. Testing

Unit test cache operations
Performance benchmarking
Validate proper rule splitting based on the cached timing information
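
One possible way to validate a split in tests is to compute a per-core total from the cached timings and compare the slowest core against the average. The helper names `core_totals` and `imbalance_ratio` below are hypothetical, and treating rules without a cached timing as 0 ms is an assumption made only for this sketch.

```python
from typing import Dict, List, Sequence


def core_totals(groups: Sequence[Sequence[str]], timings_ms: Dict[str, float]) -> List[float]:
    """Sum the cached execution time of the rules assigned to each core."""
    # Assumption: rules with no cached timing count as 0 ms.
    return [sum(timings_ms.get(rule, 0.0) for rule in group) for group in groups]


def imbalance_ratio(groups: Sequence[Sequence[str]], timings_ms: Dict[str, float]) -> float:
    """Slowest core's total divided by the average total; 1.0 is perfectly balanced."""
    totals = core_totals(groups, timings_ms)
    if not totals:
        return 1.0
    average = sum(totals) / len(totals)
    return max(totals) / average if average > 0 else 1.0


timings = {"a": 90, "b": 50, "c": 40, "d": 10, "e": 10}
in_order = [["a", "b", "c"], ["d", "e"]]   # split by on-disk order
balanced = [["a", "d"], ["b", "c", "e"]]   # split by cached timings

# A test could assert that the timing-based split is no worse than the in-order one.
assert imbalance_ratio(balanced, timings) <= imbalance_ratio(in_order, timings)
```

A ratio close to 1.0 means the cores are expected to finish at about the same time. The same metric could be reported in performance benchmarks using measured rather than cached timings.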

Edited Feb 18, 2025 by Mher Tolpin