Refactor/Replace bleach with nh3
What does this MR do and why?
This MR replaces bleach with nh3. Bleach has been/is being deprecated. While the maintainer is still providing security updates, I thought I'd give the option to switch to something else. Nh3 is not a full re-implementation of bleach; no such re-implementation exists at this time. Nh3 does implement clean() from the bleach library, which is the only function crafty uses from bleach. Nh3 is not pure Python; it is built on Rust and is faster than bleach.
Draft because I have not fully tested. I will try it on several operating systems. I also plan to fuzz the clean() function and take a peek at nh3's code base to see how its security compares to bleach's.
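As a rough illustration of the swap, here is a minimal sketch of a call site, assuming crafty only ever needs a single "clean this string" operation. The name `sanitize_html` and the `cleaner` parameter are hypothetical and are not taken from crafty's code base; only `nh3.clean` is a real library call. The allowed tags and the handling of disallowed markup may differ between the two libraries' defaults, so this is not a claim that the outputs match.

```python
from typing import Callable, Optional


def sanitize_html(text: str, cleaner: Optional[Callable[[str], str]] = None) -> str:
    """Hypothetical wrapper (not crafty's actual code) around an HTML sanitizer.

    Before this MR the call would have been bleach.clean(text);
    after it, the equivalent call is nh3.clean(text).
    """
    if cleaner is None:
        # Imported lazily so this module still loads where nh3 is not installed.
        import nh3

        cleaner = nh3.clean
    return cleaner(text)
```

Keeping the sanitizer behind one function like this would also make it easy to switch back, or to pass in a stub during tests.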
How to set up and validate locally
Please take a look at this repo. Nh3 does evaluate some attacks differently. Of the 539526 strings tested, nh3 handled 1593 differently from bleach. Differences.txt shows, for each of them:
- The original string
- The bleached string
- The nh3ed string.
I'll need to spend more time evaluating, but it appears nh3 strictly filters only harmful HTML syntax, while bleach goes for everything. More research is needed to determine whether one of these is actually safer or whether they are equivalently safe in the context of crafty.
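The comparison described above can be sketched roughly as follows. This is not the script from the linked repo; `find_differences` and `Difference` are hypothetical names, and the two sanitizers are passed in as plain callables (for a real run, `bleach.clean` and `nh3.clean`), so the sketch itself has no third-party dependencies.

```python
from typing import Callable, Iterable, List, NamedTuple


class Difference(NamedTuple):
    """One record, mirroring the three fields listed for Differences.txt."""

    original: str
    bleached: str
    nh3ed: str


def find_differences(
    payloads: Iterable[str],
    clean_bleach: Callable[[str], str],
    clean_nh3: Callable[[str], str],
) -> List[Difference]:
    """Return every payload for which the two sanitizers disagree."""
    differences = []
    for payload in payloads:
        bleached = clean_bleach(payload)
        nh3ed = clean_nh3(payload)
        # Only keep cases where the sanitized outputs are not identical.
        if bleached != nh3ed:
            differences.append(Difference(payload, bleached, nh3ed))
    return differences
```

A differing output does not by itself say which library is safer; each record still has to be reviewed by hand.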
nh3 is significantly faster, by roughly 28x:
Bleach: 40.64 s
Nh3: 1.43 s
The speed difference does not affect crafty much, but it will save you a few milliseconds if you're interested.
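For anyone who wants to reproduce a timing like the one above, a minimal sketch could look like this. It assumes the measurement is simply wall-clock time to run one sanitizer over a whole list of payloads; `time_cleaner` is a hypothetical helper and not the benchmark actually used for the numbers in this MR.

```python
import time
from typing import Callable, Iterable


def time_cleaner(cleaner: Callable[[str], object], payloads: Iterable[str]) -> float:
    """Return the wall-clock seconds taken to run `cleaner` over every payload."""
    start = time.perf_counter()
    for payload in payloads:
        # The sanitized result is discarded; only the elapsed time matters here.
        cleaner(payload)
    return time.perf_counter() - start
```

It would be called once per library, for example `time_cleaner(bleach.clean, payloads)` and `time_cleaner(nh3.clean, payloads)`, with the same payload list for both.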
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- Have you checked this doesn't interfere/conflict/duplicate someone else's work?
- Have you fully tested your changes?
- Have you resolved any lint issues?
- Have you assigned a reviewer?
- Have you applied correct labels?