HEX regexps too permissive

Issue 1:

The regexps in the code rely on the \d class; however, this class matches every Unicode decimal digit, not only the ASCII ones:

$ raku -e 'say so "0x1๗a" ~~ m:i/^ 0x<[\da..f]>* $/'  # Thai "7" inside
True

Issue 2:

The * quantifier also matches zero digits, so a bare 0x prefix with nothing after it is accepted:

$ raku -e 'say so "0x" ~~ m:i/^ 0x<[\da..f]>* $/'
True

Solution

My suggestion is to completely remove the hand-written character classes from all regexps that validate hex numbers and switch to the built-in <xdigit> class, and to fix the quantifiers so that at least one digit is required. For example:

m:i/^ 0x<[0..9a..f]>* $/ -> m:i/^ 0x<xdigit>+ $/

m:i/^ 0x<[\da..f]>**64 $/ -> m:i/^ 0x<xdigit>**64 $/

and so on.
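
The proposed pattern can be checked against the inputs from both issues. This is a minimal sketch, assuming <xdigit> behaves as described above; is-hex is a hypothetical helper name used only for illustration, not something taken from the code base:

```raku
# Hypothetical helper (not from the code base): validates a 0x-prefixed hex number.
sub is-hex(Str $s --> Bool) {
    # <xdigit>+ requires at least one hex digit after the prefix;
    # :i keeps the 0x / 0X prefix case-insensitive.
    so $s ~~ m:i/^ 0x <xdigit>+ $/
}

say is-hex("0x1a");    # True
say is-hex("0x");      # False: no digits after the prefix
say is-hex("0x1๗a");   # False: Thai digit is not a hex digit
```

The same check with a fixed length would use <xdigit> ** 64 in place of <xdigit>+.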

Edited by Pawel Pabian