Investigate performance problems with rule html_mailto
Summary
While setting up DAST full scans for https://gitlab.com/gitlab-org/gitlab/ we noticed that the passive scan rule html_mailto takes a long time to execute, which causes the DAST job to time out. Unfortunately, there is no apparent way to disable this rule (e.g. via DAST_EXCLUDE_RULES).
It should be possible to disable this rule or to tweak its performance so that the DAST job does not time out.
Some notes from an initial investigation:
- This rule is implemented in RegexAutoTagScanner: https://github.com/zaproxy/zaproxy/blob/65afe6af651cc789e96a3079edd86dd025ca8da1/zap/src/main/java/org/zaproxy/zap/extension/pscan/scanner/RegexAutoTagScanner.java
- This scanner loads the patterns to scan from a config file.
- The problem could lie in the regex that is applied to response bodies, which does not seem very performant.
- There is a property called enabled which might allow us to turn off the rule.
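To test the regex hypothesis from the notes above, a small timing harness can help compare candidate patterns against a large body. The sketch below is only an illustration: `PLACEHOLDER_PATTERN` is a hypothetical stand-in, not the actual html_mailto pattern from the ZAP config file, and the synthetic body only approximates a large JavaScript response. Python's `re` engine also differs from Java's `java.util.regex`, so the timings are indicative at best.

```python
import re
import time
from typing import Tuple


def time_regex(pattern: str, body: str) -> Tuple[int, float]:
    """Return (number of matches, elapsed seconds) for pattern applied to body."""
    compiled = re.compile(pattern)
    start = time.perf_counter()
    matches = sum(1 for _ in compiled.finditer(body))
    elapsed = time.perf_counter() - start
    return matches, elapsed


# Hypothetical stand-in; replace with the pattern taken from the scanner's config file.
PLACEHOLDER_PATTERN = r"mailto:[^\s\"'<>]+"

# Synthetic body of roughly 2 MB with a single mailto link at the end.
body = ("var x = 1;\n" * 200_000) + '<a href="mailto:dev@example.com">mail</a>'

count, elapsed = time_regex(PLACEHOLDER_PATTERN, body)
print(f"{count} match(es) in {elapsed:.3f}s over {len(body)} characters")
```

Running the same harness with the real pattern and a saved copy of a slow response (such as the monaco chunk from the example job) would show whether the regex alone accounts for the delay.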
Steps to reproduce
Configure a full scan against GitLab or any other application that returns large responses.
Example Project
https://gitlab.com/gitlab-org/gitlab/-/jobs/643427292#L4253
Look for log messages like this:
Passive Scan rule html_mailto took 86 seconds to scan https://gitlab-review-ng-dast-fu-m4ht1m.gitlab-review.app/assets/webpack/monaco.9ae6ac28.chunk.js application/javascript 2549011
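To find the slowest rule and response combinations in a job log, lines like the one above can be extracted with a short script. This is a minimal sketch that assumes every such line follows the exact shape of the sample, that the content type is a single token, and that the trailing number is the response size in bytes; `SlowRuleEntry` and `parse_slow_rule` are names made up for this example.

```python
import re
from typing import NamedTuple, Optional


class SlowRuleEntry(NamedTuple):
    rule: str
    seconds: int
    url: str
    content_type: str
    size_bytes: int  # assumption: the last field is the response size in bytes


# Mirrors the shape of the sample log line; other log formats will not match.
LOG_PATTERN = re.compile(
    r"Passive Scan rule (?P<rule>\S+) took (?P<seconds>\d+) seconds to scan "
    r"(?P<url>\S+) (?P<content_type>\S+) (?P<size>\d+)"
)


def parse_slow_rule(line: str) -> Optional[SlowRuleEntry]:
    """Parse one log line, returning None when the line has a different shape."""
    m = LOG_PATTERN.search(line)
    if m is None:
        return None
    return SlowRuleEntry(
        rule=m["rule"],
        seconds=int(m["seconds"]),
        url=m["url"],
        content_type=m["content_type"],
        size_bytes=int(m["size"]),
    )


sample = (
    "Passive Scan rule html_mailto took 86 seconds to scan "
    "https://gitlab-review-ng-dast-fu-m4ht1m.gitlab-review.app/assets/webpack/"
    "monaco.9ae6ac28.chunk.js application/javascript 2549011"
)
entry = parse_slow_rule(sample)
print(entry.rule, entry.seconds, entry.size_bytes)  # prints: html_mailto 86 2549011
```

Sorting the parsed entries by `seconds` gives a quick view of which responses dominate the passive scan time.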
What is the current bug behavior?
Passive rule html_mailto takes a long time to execute, leading to DAST jobs timing out.
What is the expected correct behavior?
Possible solutions:
- Passive rule html_mailto takes less time to execute
- There is an option to turn off rule html_mailto
cc @ngeorge1 @gitlab-org/secure/dynamic-analysis-be