Consider homoglyphs and IDNA transformations when preventing localhost access in UrlBlocker

Problem

Unicode hostnames in URLs can be interpreted differently by different libraries and browsers (e.g. IDNA2003/UTS 46 mapping vs IDNA2008), so the same URL can result in different DNS lookups depending on which component processes it.

In order to prevent SSRF we validate that a URL does not resolve to localhost. If the library/service that actually makes the request resolves a different DNS record than the one that validated the URL, an attacker might be able to bypass this restriction.

For example, http://faß.de is interpreted by some browsers as http://fass.de and by others as https://xn--fa-hia.de/. An attacker might exploit this by registering both missile-attack.de and https://xn--miile-attack-m9a.de/ (the two possible interpretations of mißile-attack.de), with only one of them pointing to localhost.
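
The divergence can be reproduced with Python's standard library (our UrlBlocker is Ruby; Python is used here only because its stdlib ships both an IDNA2003-style codec and a raw punycode codec):

```python
# The same unicode hostname yields two different ASCII forms depending on
# which IDNA flavour is applied.
host = "faß.de"

# IDNA2003 / nameprep (Python's built-in "idna" codec) case-folds ß to "ss".
print(host.encode("idna"))  # b'fass.de'

# IDNA2008 keeps ß as a distinct character, so the label is punycode-encoded
# instead, giving the ACE form xn--fa-hia.de.
label = "faß"
print(b"xn--" + label.encode("punycode"))  # b'xn--fa-hia'
```

If the validator and the HTTP client disagree on which of these two hostnames to resolve, the localhost check can be bypassed.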

Ideas

We could verify that neither the original URI nor its punycode/uri.normalize equivalent resolves to localhost, or we could restrict unicode in URIs.
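
A minimal sketch of the first idea, again in Python for illustration (`candidate_hosts` and `resolves_to_loopback` are hypothetical names, not the actual UrlBlocker API): collect every plausible ASCII interpretation of the hostname and reject the URL if any of them resolves to a loopback address.

```python
import ipaddress
import socket

def candidate_hosts(host):
    """All ASCII forms a downstream client might resolve for this host."""
    candidates = {host}
    try:
        # IDNA2003-style mapping (Python's built-in codec): faß.de -> fass.de
        candidates.add(host.encode("idna").decode("ascii"))
    except UnicodeError:
        pass
    # An IDNA2008 mapping (e.g. via the third-party `idna` package) could be
    # added here too; it may produce a different xn-- form for the same host.
    return candidates

def resolves_to_loopback(host):
    """True if ANY interpretation of `host` resolves to a loopback address."""
    for candidate in candidate_hosts(host):
        try:
            infos = socket.getaddrinfo(candidate, None)
        except socket.gaierror:
            continue  # unresolvable candidates cannot reach localhost
        for *_rest, sockaddr in infos:
            if ipaddress.ip_address(sockaddr[0]).is_loopback:
                return True
    return False
```

Checking every candidate closes the gap where the validator resolves one interpretation while the HTTP client resolves the other, at the cost of extra DNS lookups per validation.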

Related

  • https://gitlab.com/gitlab-org/gitlab-ee/issues/8719
  • https://gitlab.slack.com/archives/C248YCNCW/p1543950508066300

Resources

  • http://unicode.org/faq/idn.html#20
  • http://www.unicode.org/reports/tr46/#Compatibility_Processing
  • http://www.unicode.org/reports/tr39/#Restriction_Level_Detection
  • https://www.blackhat.com/docs/asia-18/asia-18-Tsai-A-New-Era-Of-SSRF-Exploiting-URL-Parser-In-Trending-Programming-Languages_update_Thursday.pdf
  • https://unicode.org/Public/idna/latest/IdnaMappingTable.txt
  • https://www.plesk.com/blog/product-technology/what-is-the-problem-with-s/

Tools

  • https://cryptii.com/pipes/unicode-lookup
  • https://www.punycoder.com/
  • https://github.com/codebox/homoglyph/blob/master/raw_data/chars.txt
Edited Dec 05, 2018 by James Edwards-Jones