Optimize HandleMalformedStrings middleware for CPU and memory
What does this MR do and why?
Previously, HandleMalformedStrings always duplicated each string before checking whether it was valid UTF-8, which led to unnecessary memory allocations and CPU usage.
We can significantly reduce both by duplicating the string only when we need to force its encoding to UTF-8. In the benchmarks below, the new check runs roughly 1.5x to 2.8x faster.
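A minimal sketch of the idea, not the middleware's actual code. The method names and the `NULL_BYTE` constant are hypothetical. The definition of "malformed" (invalid UTF-8 or an embedded null byte) is an assumption inferred from the benchmark labels in this description.

```ruby
# frozen_string_literal: true

NULL_BYTE = "\0" # hypothetical constant name

# Old behaviour as described: every call allocates a copy of the string.
def malformed_always_dup?(string)
  copy = string.dup.force_encoding(Encoding::UTF_8)
  !copy.valid_encoding? || copy.include?(NULL_BYTE)
end

# New behaviour as described: copy only when the encoding has to be forced.
def malformed_dup_if_needed?(string)
  utf8 = if string.encoding == Encoding::UTF_8
           string # already UTF-8, so check it in place without allocating
         else
           string.dup.force_encoding(Encoding::UTF_8)
         end
  !utf8.valid_encoding? || utf8.include?(NULL_BYTE)
end

p malformed_dup_if_needed?("clean string") # => false
p malformed_dup_if_needed?("null\0byte")   # => true
p malformed_dup_if_needed?("\xE2\x82".b)   # => true (truncated multibyte sequence)
```

Both variants leave the caller's string untouched. The second one skips the copy in what is presumably the common case of input that is already tagged as UTF-8.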
How to set up and validate locally
The existing tests all pass. The benchmark output below shows the performance improvements:
### String Checks (Higher is better) ###
ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
Warming up --------------------------------------
Old - clean string 818.455k i/100ms
New - clean string 1.257M i/100ms
Calculating -------------------------------------
Old - clean string 8.203M (± 1.2%) i/s (121.90 ns/i) - 41.741M in 5.088950s
New - clean string 12.234M (±10.1%) i/s (81.74 ns/i) - 60.337M in 5.008126s
Comparison:
New - clean string: 12234428.9 i/s
Old - clean string: 8203465.1 i/s - 1.49x slower
### Long String (10k chars) ###
ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
Warming up --------------------------------------
Old - long clean 26.319k i/100ms
New - long clean 72.311k i/100ms
Calculating -------------------------------------
Old - long clean 254.537k (±13.4%) i/s (3.93 μs/i) - 1.237M in 5.033226s
New - long clean 723.368k (±20.7%) i/s (1.38 μs/i) - 3.471M in 5.042592s
Comparison:
New - long clean: 723368.0 i/s
Old - long clean: 254536.6 i/s - 2.84x slower
### Null Byte Detection ###
ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
Warming up --------------------------------------
Old - null in middle 820.819k i/100ms
New - null in middle 1.846M i/100ms
Old - null at end 19.003k i/100ms
New - null at end 41.054k i/100ms
Calculating -------------------------------------
Old - null in middle 8.190M (± 1.1%) i/s (122.11 ns/i) - 41.041M in 5.011969s
New - null in middle 18.062M (±10.1%) i/s (55.37 ns/i) - 88.602M in 5.000721s
Old - null at end 196.989k (± 6.7%) i/s (5.08 μs/i) - 988.156k in 5.042065s
New - null at end 412.956k (±24.3%) i/s (2.42 μs/i) - 1.888M in 5.076560s
Comparison:
New - null in middle: 18061641.2 i/s
Old - null in middle: 8189596.6 i/s - 2.21x slower
New - null at end: 412955.6 i/s - 43.74x slower
Old - null at end: 196989.4 i/s - 91.69x slower
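The script that produced the figures above is not included in this description; the output format matches the benchmark-ips gem. As a rough, dependency-free way to observe the same effect, the sketch below times both variants and counts allocated objects with `GC.stat`. The check methods and the `measure` helper are hypothetical stand-ins, not the middleware's code, and absolute numbers will vary by machine and Ruby version.

```ruby
# frozen_string_literal: true

NULL_BYTE = "\0" # hypothetical constant name

# Hypothetical stand-in for the old behaviour: always duplicate first.
def check_with_dup(string)
  copy = string.dup.force_encoding(Encoding::UTF_8)
  !copy.valid_encoding? || copy.include?(NULL_BYTE)
end

# Hypothetical stand-in for the new behaviour: duplicate only non-UTF-8 input.
def check_without_dup(string)
  utf8 = string.encoding == Encoding::UTF_8 ? string : string.dup.force_encoding(Encoding::UTF_8)
  !utf8.valid_encoding? || utf8.include?(NULL_BYTE)
end

# Returns [seconds elapsed, objects allocated] for `iterations` calls of the block.
def measure(iterations)
  allocated_before = GC.stat(:total_allocated_objects)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [elapsed, GC.stat(:total_allocated_objects) - allocated_before]
end

long_clean = "a" * 10_000 # mirrors the "Long String (10k chars)" case

old_time, old_allocs = measure(50_000) { check_with_dup(long_clean) }
new_time, new_allocs = measure(50_000) { check_without_dup(long_clean) }

puts format("always dup:    %.3fs, %d objects allocated", old_time, old_allocs)
puts format("dup if needed: %.3fs, %d objects allocated", new_time, new_allocs)
```

For UTF-8 input, the "always dup" variant should report at least one allocation per iteration, while the "dup if needed" variant should report close to none.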
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Edited by Stan Hu