Selectively enable GZIP when HTTP referer matches external URL of GitLab host
gzip is currently disabled for HTTPS as a defence against compression side-channel attacks such as BREACH, but Rails has anti-BREACH measures in place for CSRF tokens. In addition, we can mitigate the risk of this attack further by enabling GZIP only when the HTTP referer matches the GitLab origin.
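A rough sketch of the idea, with hypothetical names (this is not the actual omnibus recipe code): generate an anchored regular expression from the configured external URL in the recipe, then hand it to the NGINX template as `@referer_regex` so gzip is only enabled when the `Referer` header matches our own origin.

```ruby
require 'uri'

# Hypothetical helper (illustrative only, not the actual omnibus recipe code):
# derive an anchored PCRE for NGINX from the configured external URL.
# Escaping the host stops "." matching arbitrary characters, and anchoring on
# scheme/host stops a hostile URL that merely *contains* ours from matching.
def referer_regex(external_url)
  uri = URI.parse(external_url)
  "^#{Regexp.escape("#{uri.scheme}://#{uri.host}")}(:#{uri.port})?(/|$)"
end

puts referer_regex('https://gitlab.example.com')
# => ^https://gitlab\.example\.com(:443)?(/|$)
```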
For more details, see:
Activity
added 103 commits
- 104a2c19...c7e9b8ae - 102 commits from branch master
- e223a668 - Enable GZIP because Rails has anti-BREACH measures in place:
- Resolved by Stan Hu
added 86 commits
- e223a668...d6ddb028 - 84 commits from branch master
- bacb7334 - Enable GZIP because Rails has anti-BREACH measures in place:
- 5dbb862c - Turn on gzip only if the HTTP referer is from our hostname
assigned to @briann
- Resolved by Stan Hu
assigned to @stanhu
added 32 commits
- 8f9d407a...c84f79dc - 28 commits from branch master
- 01ff62b3 - Enable GZIP because Rails has anti-BREACH measures in place:
- 59a95fb0 - Turn on gzip only if the HTTP referer is from our hostname
- 23406c54 - Only turn on referrer check for HTTPS
- c988f879 - Generate the HTTP_REFERER regular expression in recipe
added 1 commit
- 7b90b45b - Generate the HTTP_REFERER regular expression in recipe
assigned to @briann
added 1 commit
- 170bf51c - Simplify and clean up HTTP referer spec and comment
@stanhu I think that should work.
assigned to @stanhu
changed milestone to %10.0
assigned to @marin
```
location <%= path %> {
  proxy_cache off;
  proxy_pass http://gitlab-workhorse;

  <% if @https && @referer_regex %>
  if ($http_referer ~* (<%= @referer_regex %>)) {
```

@stanhu I always prefer to avoid `if` when at all possible.

From the NGINX documentation:

> Directive `if` has problems when used in location context, in some cases it doesn’t do what you expect but something completely different instead. In some cases it even segfaults. It’s generally a good idea to avoid it if possible.

https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/
@andrewn In this case, do you have a suggestion that would work better?
@stanhu unfortunately I don't, which is why I didn't suggest anything.
We could look for plugins which handle this, but that complicates the setup substantially.
However, I'm not totally convinced that we need this referrer protection for gzip.
I understand that an attacker could repeatedly send a victim's browser to URLs which reflect back known input, for example (see the sketch after this list):
- https://gitlab.com/gitlab-org/gitaly/issues/new?issue[title]=112739128712983712893718237
- https://gitlab.com/gitlab-org/gitaly/issues/new?issue[title]=112739128712983712893718237123812837128937
- https://gitlab.com/gitlab-org/gitaly/issues/new?issue[title]=11273912871298371289371823723987239027328734
- etc
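The reason reflected input matters is that DEFLATE compresses repeated substrings: a guess that matches a secret elsewhere in the same response compresses slightly better than one that doesn't, so the secret leaks through the response length. A minimal sketch of the principle (values are purely illustrative, this is not GitLab code):

```ruby
require 'zlib'

SECRET = 'csrf=5f2c9a1b'.freeze

# The attacker controls the reflected value and observes the compressed size.
def compressed_size(reflected)
  page = "<input value=\"#{reflected}\"> ... #{SECRET} ..."
  Zlib::Deflate.deflate(page).bytesize
end

['csrf=5f2c9a1', 'csrf=zzzzzzz'].each do |guess|
  puts "#{guess} -> #{compressed_size(guess)} bytes"
end
# The matching guess typically compresses a byte or two smaller; repeating
# this character by character recovers the secret without breaking TLS.
```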
However, given the length of our tokens etc., the number of URLs that the attacker would need to generate would be huge - at least in the millions (I don't know the actual numbers, but I'm guessing it's very big). Given that the victim's browser would need to navigate each URL in turn (without the victim noticing, closing the attacker's page, or getting off their network), and that we have `X-Frame-Options: DENY` set to prevent the GitLab site being easily embedded within the attacker's page, I very much doubt that anyone could successfully pull this attack off on GitLab.com.

(I'm not sure if I'm missing a component of the attack vector; I may well be.)
For this reason, I think we can probably enable gzip without referrer checking.
From https://blog.qualys.com/ssllabs/2013/08/07/defending-against-the-breach-attack:

> Referer Check Mitigation
>
> A quick, dirty, tricky, and a potentially unreliable mitigation approach you can apply today is to perform Referer header checks on all incoming requests. Because the attacker cannot inject requests from the web site itself (unless he gains access via XSS, in which case he owns the browser and has no need for further attacks), he must do so from some other web site (a malicious web site, or an innocent site hijacked from a MITM location). In that case, the referrer information will show the request originating from that other web site, and we can easily detect that.
cc @briann
@briann How do you feel about going back to the original commit that just enables gzip?
@stanhu Is "scared and confused" an acceptable answer?
I know @nick.thomas has also mentioned the dangers of `if` in nginx configs. Maybe he knows another way?

The nginx documentation is clear that using these directives inside an `if` block is not guaranteed to be safe, so I think we shouldn't use it.
If we can't demonstrate that we're safe against the breach attack in a manner that's amenable to automated testing, I think we should leave gzip disabled.
Requiring a large number of requests to perform the attack is not, in itself, sufficient mitigation IMO.
Incredible to see a scripting language that doesn't handle if statements well. Although:

> It is important to note that the behaviour of if is not inconsistent, given two identical requests it will not randomly fail on one and work on the other, with proper testing and understanding ifs *can* be used.

and there are specific examples of fail cases. Do they apply here? Probably just worth testing the possible outcomes of `if ($http_referer ~* (<%= @referer_regex %>))` and comparing them with the known fail examples? (See the sketch after this comment.)

Also... are we the first set of people attempting to do this? :-)
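One way to compare the outcomes empirically (a rough sketch; the host and paths are hypothetical) is to request the same page with a same-origin `Referer`, a cross-origin `Referer`, and none at all, then check the `Content-Encoding` of each response:

```ruby
require 'net/http'
require 'uri'

# Hypothetical test instance; point this at a real installation to verify.
BASE = URI('https://gitlab.example.com/')

def content_encoding(referer)
  req = Net::HTTP::Get.new(BASE)
  req['Accept-Encoding'] = 'gzip'
  req['Referer'] = referer if referer
  res = Net::HTTP.start(BASE.host, BASE.port, use_ssl: true) { |http| http.request(req) }
  res['Content-Encoding'] || 'identity'
end

puts "same-origin:  #{content_encoding('https://gitlab.example.com/foo')}"
puts "cross-origin: #{content_encoding('https://evil.example/')}"
puts "no referer:   #{content_encoding(nil)}"
```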
> Incredible to see a scripting language that doesn't handle if statements well.

nginx configurations are not a scripting language, they're a configuration file. This is why `if` has never really worked in nginx configs and why they consider it "evil". They really don't want people using it any more.

> Also... are we the first set of people attempting to do this? :-)
It seems that most companies are not using this method of defence against compression side-channel attacks like BREACH.
For example, https://raw.github.com/dionyziz/rupture/develop/etc/Black%20Hat%20Asia%202016/asia-16-Practical-New-Developments-In-The-BREACH-Attack-wp.pdf (2016) claims Gmail and Facebook Chat are both susceptible, as disabling compression "is impractical in real-world systems" (presumably because of the performance cost of doing so).
For what it's worth, I've just tested this method against `gmail.com` from a cross-origin location (i.e. a non-matching referer) and the response from Google was gzipped. In other words, they've decided not to defend against this attack by disabling gzip for non-origin referrers.

Another point: since Rails is protecting the tokens by XORing them (sketched below), the attack can only be carried out against secrets in the page itself (for example, reading the contents of a `README.md` file).

However, it's worth pointing out that the page being attacked needs to reflect user input (as I showed in my example above), and as far as I know the only pages on GitLab that are susceptible to this are pages like 'new issue', which don't have much in the way of good secrets on them.

In other words, the pages that are susceptible are also the ones that don't have worthwhile secrets.
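For reference, the Rails measure mentioned above masks the CSRF token with a fresh random pad on every response, so the bytes emitted in the page differ each time even though the underlying token is constant, leaving the attack nothing stable to guess against. A simplified sketch of the principle (not the actual Rails implementation):

```ruby
require 'securerandom'
require 'base64'

# Simplified illustration of per-response token masking (not Rails's code).
def mask(raw_token)
  pad    = SecureRandom.random_bytes(raw_token.bytesize)
  masked = raw_token.bytes.zip(pad.bytes).map { |t, p| t ^ p }.pack('C*')
  Base64.strict_encode64(pad + masked) # ship the pad alongside the masked token
end

raw = SecureRandom.random_bytes(32)
puts mask(raw) # different output on every call...
puts mask(raw) # ...so compression never sees a repeatable token
```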
I've just tested on some more websites and none of them are disabling gzip for cross-origin referrers.
- https://www.facebook.com
- https://outlook.live.com
- https://github.com
- https://www.citibank.com
- https://news.ycombinator.com
- https://www.amazon.com
We're pretty much on our own here. Every site I've tried has gzip enabled and doesn't disable it for cross-origin.
@andrewn I think to perform a proper check you'd need to find pages that reflect content. I've seen reference to these sites completely removing reflected content in order to provide compression.
> I think to perform a proper check you'd need to find pages that reflect content. I've seen reference to these sites completely removing reflected content in order to provide compression.
Yes, quite right! Let's try this again.
- https://www.google.co.uk/?q=gitlab: content is reflected and response is compressed
- https://www.facebook.com/sharer/sharer.php?u=https://gitlab.com: content is reflected and response is compressed
In both these cases, the referrer header was cross-origin and the rendered pages contain private information (e.g. my identity).

So either Facebook and Google are using a better way of dealing with compression side-channel attacks, or they're choosing to ignore it because of the tradeoff of not using compression vs. the difficulty of actually carrying out these kinds of attacks in the real world.
@andrewn I get a mix of compressed and uncompressed. It's true that `https://google.com?q=` does compress the response. But performing a search on `mail.google.com` does not. In fact, none of the responses containing mail data appear to be compressed.

Static assets returned in the gmail search are compressed, but are served from a separate domain.
Browsing Facebook I see almost no compression for the text content or the results of searches.
@ernstvn I did some searching but couldn't find anything. Searching for anything about SSL, BREACH, and compression returns loads of garbage results, unfortunately.
So this is currently stuck, with no clear next step on how to enable gzip or what to check for? @briann
For reference, superseded by !2481 (merged)
mentioned in merge request !6862 (closed)
mentioned in issue gitlab#411295