Optimize HandleMalformedStrings middleware for CPU and memory

What does this MR do and why?

Previously, HandleMalformedStrings always duplicated each string before checking whether it was valid UTF-8. This caused unnecessary memory allocations and CPU usage.

We can significantly optimize this by only duplicating the string when we need to force the encoding to UTF-8.
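As a rough illustration of the idea (not the middleware's actual code), the sketch below contrasts an always-duplicate check with one that duplicates only when the string is not already tagged as UTF-8. The method names `malformed_old?` and `malformed_new?` and the `NULL_BYTE` constant are hypothetical, chosen for this example only.

```ruby
# Hypothetical helpers for illustration; names are not taken from the middleware.
NULL_BYTE = "\0"

# Always-duplicate approach: allocates a copy on every call, even when the
# input is already UTF-8.
def malformed_old?(string)
  copy = string.dup.force_encoding(Encoding::UTF_8)
  !copy.valid_encoding? || copy.include?(NULL_BYTE)
end

# Duplicate-on-demand approach: only allocates a copy when the encoding has
# to be forced to UTF-8. The caller's string is never mutated.
def malformed_new?(string)
  unless string.encoding == Encoding::UTF_8
    string = string.dup.force_encoding(Encoding::UTF_8)
  end

  !string.valid_encoding? || string.include?(NULL_BYTE)
end
```

For input that is already UTF-8, the second version performs no allocation at all, which is the case the clean-string benchmarks exercise.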

How to set up and validate locally

The existing tests all pass, but you can see the performance improvements with:

benchmark.rb

### String Checks (Higher is better) ###

ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
Warming up --------------------------------------
  Old - clean string   818.455k i/100ms
  New - clean string     1.257M i/100ms
Calculating -------------------------------------
  Old - clean string      8.203M (± 1.2%) i/s  (121.90 ns/i) -     41.741M in   5.088950s
  New - clean string     12.234M (±10.1%) i/s   (81.74 ns/i) -     60.337M in   5.008126s

Comparison:
  New - clean string: 12234428.9 i/s
  Old - clean string:  8203465.1 i/s - 1.49x  slower


### Long String (10k chars) ###

ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
Warming up --------------------------------------
    Old - long clean    26.319k i/100ms
    New - long clean    72.311k i/100ms
Calculating -------------------------------------
    Old - long clean    254.537k (±13.4%) i/s    (3.93 μs/i) -      1.237M in   5.033226s
    New - long clean    723.368k (±20.7%) i/s    (1.38 μs/i) -      3.471M in   5.042592s

Comparison:
    New - long clean:   723368.0 i/s
    Old - long clean:   254536.6 i/s - 2.84x  slower


### Null Byte Detection ###

ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
Warming up --------------------------------------
Old - null in middle   820.819k i/100ms
New - null in middle     1.846M i/100ms
   Old - null at end    19.003k i/100ms
   New - null at end    41.054k i/100ms
Calculating -------------------------------------
Old - null in middle      8.190M (± 1.1%) i/s  (122.11 ns/i) -     41.041M in   5.011969s
New - null in middle     18.062M (±10.1%) i/s   (55.37 ns/i) -     88.602M in   5.000721s
   Old - null at end    196.989k (± 6.7%) i/s    (5.08 μs/i) -    988.156k in   5.042065s
   New - null at end    412.956k (±24.3%) i/s    (2.42 μs/i) -      1.888M in   5.076560s

Comparison:
New - null in middle: 18061641.2 i/s
Old - null in middle:  8189596.6 i/s - 2.21x  slower
   New - null at end:   412955.6 i/s - 43.74x  slower
   Old - null at end:   196989.4 i/s - 91.69x  slower

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Stan Hu
