Parsing comments in <style> content (patch attached)
Hello, parsing of HTML file content currently fails if < symbols occur in <style> content.
Command line to reproduce:
wget2 -m --max-threads=1 --content-disposition --regex-type=pcre --accept-regex="www\.3gpp\.org/DynaReport/23.*?\.htm|portal\.3gpp\.org/desktopmodules/Specifications/SpecificationDetails\.aspx\?specificationId=|portal\.etsi\.org/webapp/workprogram/Report_WorkItem\.asp\?WKI_ID=|www\.etsi\.org/deliver/etsi_ts/.*?\.pdf" --domains="portal.etsi.org" --span-hosts --filter-urls https://www.3gpp.org/ftp/Specs/html-info/23-series.htm
Expected behavior: the <a ... href=23XXX.htm> links are parsed and followed.
A patch with the proposed fix is attached: 0001-Fix-parsing-comments-in-style-content.patch
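To illustrate the expected behavior outside of wget2, here is a minimal Python sketch. It is not the wget2 parser and not the attached patch; the `extract_links` helper, the regular expressions, and the sample file names (`23001.htm`, `23002.htm`, `ignored.htm`) are all assumptions made up for this example. The idea is only that everything between `<style>` and `</style>` is treated as raw text, so a `<` or a comment inside it cannot affect the links that follow.

```python
import re

# Hypothetical helper for illustration only (not wget2 code):
# collect href values from <a> tags while treating <style> content as raw text.
STYLE_OPEN = re.compile(r"<style\b[^>]*>", re.IGNORECASE)
STYLE_CLOSE = re.compile(r"</style\s*>", re.IGNORECASE)
A_HREF = re.compile(r"""<a\b[^>]*?\bhref\s*=\s*["']?([^"'\s>]+)""", re.IGNORECASE)


def extract_links(html: str) -> list[str]:
    links = []
    pos = 0
    while pos < len(html):
        style = STYLE_OPEN.search(html, pos)
        # Scan for links only up to the next <style> opening tag.
        end = style.start() if style else len(html)
        links.extend(A_HREF.findall(html[pos:end]))
        if style is None:
            break
        close = STYLE_CLOSE.search(html, style.end())
        # Skip the style content; an unterminated <style> swallows the rest.
        pos = close.end() if close else len(html)
    return links


# Made-up sample: the style block contains '<', a comment, and a decoy link.
sample = (
    "<html><head><style>\n"
    "<!-- a > b { color: red } -->\n"
    "p:before { content: '<' } /* <a href=ignored.htm> */\n"
    "</style></head>\n"
    "<body><a href=23001.htm>23.001</a> "
    "<a class=x href=\"23002.htm\">23.002</a></body></html>"
)

print(extract_links(sample))
```

With this sample, only the two links in the body should be reported, and the decoy inside the style block should be ignored.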
Edited by Sergei Litvin