Consider homoglyphs and IDNA transformations when preventing localhost access in UrlBlocker
Problem
Different libraries and browsers can interpret Unicode in URLs differently, so the same URL can lead to different DNS lookups depending on which one processes it.
To prevent SSRF, we validate that a URL does not resolve to localhost. If the library or service that actually makes the request looks up a different DNS record than the one that validated the URL, an attacker might be able to bypass this restriction.
For example, http://faß.de is interpreted by some browsers as http://fass.de and by others as https://xn--fa-hia.de/. An attacker might exploit this by registering both missile-attack.de and https://xn--miile-attack-m9a.de/ (the Punycode form of mißile-attack.de), with only one of them pointing to localhost.
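The divergence can be reproduced with a minimal sketch. It is written in Python rather than Ruby only because Python's standard library includes both an IDNA 2003 codec and a raw Punycode codec. The helper names `idna2003` and `punycode_labels` are hypothetical and are not part of UrlBlocker.

```python
def idna2003(host: str) -> str:
    """Encode a host with Python's built-in "idna" codec (IDNA 2003).

    IDNA 2003 applies nameprep mappings first, so "ß" becomes "ss".
    """
    return host.encode("idna").decode("ascii")


def punycode_labels(host: str) -> str:
    """Punycode-encode each non-ASCII label as-is, with no mapping step.

    This is an illustration only: it skips case folding and normalization.
    """
    labels = []
    for label in host.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


for name in ("faß.de", "mißile-attack.de"):
    print(name, "->", idna2003(name), "or", punycode_labels(name))
```

The two helpers return different ASCII host names for the same input, which is exactly the gap between the validating component and the requesting component described above.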
Ideas
We could verify that neither the original URI nor its Punycode equivalent (uri.normalize) resolves to localhost, or we could restrict Unicode in URIs.
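The first idea could look roughly like the following sketch, again in Python for illustration. All names here (`candidate_hosts`, `resolves_to_localhost`, `system_resolver`) are hypothetical, and the resolver is passed in as a parameter so the check can be exercised without real DNS. It only covers two ASCII forms of the host, so it should be read as an outline of the approach, not as a complete defence.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List, Set

Resolver = Callable[[str], Iterable[str]]


def candidate_hosts(host: str) -> Set[str]:
    """Return the ASCII forms a client might look up for this host."""
    mapped = host.encode("idna").decode("ascii")  # IDNA 2003: "ß" -> "ss"
    raw = ".".join(
        label if label.isascii()
        else "xn--" + label.encode("punycode").decode("ascii")
        for label in host.split(".")
    )  # Punycode without any mapping step
    return {mapped, raw}


def system_resolver(host: str) -> List[str]:
    """Resolve a host name to IP address strings via the system resolver."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_local(address: str) -> bool:
    """True for loopback or unspecified addresses (127.0.0.0/8, ::1, 0.0.0.0)."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 scope id
    return ip.is_loopback or ip.is_unspecified


def resolves_to_localhost(host: str, resolve: Resolver = system_resolver) -> bool:
    """Block the host if ANY candidate form resolves to a local address."""
    for candidate in candidate_hosts(host):
        try:
            addresses = list(resolve(candidate))
        except OSError:
            continue  # this form does not resolve; check the others
        if any(is_local(address) for address in addresses):
            return True
    return False
```

Encoding errors from `candidate_hosts` are deliberately left to propagate, on the assumption that the caller would reject a URL whose host cannot be encoded at all.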
Related
- https://gitlab.com/gitlab-org/gitlab-ee/issues/8719
- https://gitlab.slack.com/archives/C248YCNCW/p1543950508066300
Resources
- http://unicode.org/faq/idn.html#20
- http://www.unicode.org/reports/tr46/#Compatibility_Processing
- http://www.unicode.org/reports/tr39/#Restriction_Level_Detection
- https://www.blackhat.com/docs/asia-18/asia-18-Tsai-A-New-Era-Of-SSRF-Exploiting-URL-Parser-In-Trending-Programming-Languages_update_Thursday.pdf
- https://unicode.org/Public/idna/latest/IdnaMappingTable.txt
- https://www.plesk.com/blog/product-technology/what-is-the-problem-with-s/