Increase bcrypt work factor to match current standards

added backend devopsmanage groupauthentication and authorization [DEPRECATED] maintenancerefactor security labels

added security-sp-label-missing security-triage-appsec labels

@dcouture can I get your thoughts on this? Thanks!

Pinging @gitlab-com/gl-security/appsec as @dcouture is out this week.

Thanks for the ping! @ali-gitlab I can't speak on behalf of Dominic, as Ethan mentioned he is out this week. However, I think this is a great idea and that it is worth trying to make time to do. I honestly don't have much to add, and it looks like James had created an issue a few months ago to address this.

I'm going to set the P/S labels as ~P4 / ~S4 for now, although I'm happy to reconsider that if anyone feels strongly that it should be something different.

It seems devise doesn't automatically recompute password hashes if hashing rounds are increased. There is a PR open for this https://github.com/heartcombo/devise/pull/5145. Until this PR lands, we might need to implement hash recomputation ourselves.

+1 on increasing the work factor! As Dennis said, it'll only apply moving forward (i.e. new self-managed installs, and when people sign up or change their password).

changed the description

added priority4 severity4 labels and removed security-sp-label-missing security-triage-appsec labels

added 1 deleted label

added sectiondev label

added 1 deleted label

added Epic 6340 label

added FY22Q3 label

@rshambhuni is this an enhancement or should it stay in the security backlog?

@mksionek This is a featureenhancement which I think we should work on soon but it doesn't need to be part of the security backlog.

@mksionek Removed S/P labels. Keeping the security label as the issue is security relevant. Added securitybotignore label to keep the SecurityBot happy. Also added featureenhancement label. Feel free to adjust the labels as you see fit.

added securitybotignore label

removed priority4 label

removed severity4 label

added featureenhancement label

added typefeature label

removed FY22Q3 label

cc @mloveless this issue is related to the crypto work you are doing

@mloveless This came through triage and is 3 years old now. Is this still relevant?

Absolutely, feel free to increase the work factor!

@mloveless Is that as simple as changing a value in a file, or is there more to it? Gotta rely on your knowledge here!

PS: Please @ mention me so I see your response

@hsutor In theory yes, as I recall it is as simple as changing a value in a file. However my only experience in doing this involves small and isolated systems, for something of scale this should be tested first. For example I don't know if this would require a password reset on everyone's password or not as I've certainly slept since the last time I did this type of thing. It is typically a function of the underlying OS in all scenarios I've personally dealt with before, this might be totally different in our environment in how it would need to be carried out.

Those are some good callouts, seems like we need to do some testing and possibly announce deprecation a breaking change if the outcome would be something that would change all passwords. I will leave this in our security backlog now, and don't have any plans to address it. Please let me know if you feel strongly about this one @mloveless

@hsutor I think we should do it but I am not feeling strongly about it. As long as it gets done within a few releases I'm more than happy with it.

As far as breaking change goes, I do remember we were discussing moving to PBKDF2 instead of bcrypt because of FedRAMP but decided the Okta route covered us. During those discussions someone (can't remember who, pretty sure someone clever on your team) said code could be added that would 1) determine if the hash used for the password at login was bcrypt or PBKDF2, 2) if bcrypt, verify the password as correct and rehash it to PBKDF2, 3) if PBKDF2 proceed as normal. We could potentially do the same for "old" bcrypt and "new" bcrypt and then it's not a breaking change.

One other thing we need to consider - if we are able to come up with the method above to allow for a change to bcrypt without disruption, then we could find ourselves being able to adapt to additional changes. I am not sure whether PBKDF2 is quantum safe, for example. There is movement to at least keep an eye on quantum cryptography partially because of FedRAMP, but also because others are doing the exact same thing.

@mloveless

I think we should do it but I am not feeling strongly about it. As long as it gets done within a few releases I'm more than happy with it.

We won't have the ability to take this on within the next few releases. @ankelly Would this be something I could tag for ProdSecEng Candidate? Understanding that it's only just for your team to take a look. We don't yet have a timeframe for this one.

@hsutor Yep no harm in adding the ProdSecEng Candidate label. That just lets us know that its something we could consider taking on. With this one I don't know that it will be a top priority for our team at this time, but its good to have it as something to potentially explore.

I've gone ahead and applied the label.

@ankelly BTW let me know how this progresses, rather curious.

added Category:System Access label

added to epic &10111

added security-backlognot-reviewed label

added security-backlogneeds-input label and removed security-backlognot-reviewed label

marked this issue as related to #370445 (closed)

added devopsgovern sectionsec labels and removed devopsmanage sectiondev labels

added groupauthentication label and removed groupauthentication and authorization [DEPRECATED] label

removed Epic 6340 label

changed milestone to %Backlog

marked this issue as related to #220580

added ProdSecEng Candidate label

mentioned in issue #370445 (closed)

FYI I closed Update BCrypt cost for old passwords, and defau... (#370445 - closed) as a duplicate, but it has a couple good threads on how to investigate a good work factor.

A bit of copy-pasta plus [clarifications]:

our server requirements are the primary constraint. An attacker is going to have access to any number of whatever AWS' latest multi-GPU-powered beasts are

Changing from 10 to 12 cost for BCrypt, if I understand correctly, would quadruple the total rounds (2^10 vs. 2^12). That's a significant improvement. And from what I see 10-12 is still a very acceptable cost factor.

I compared on AWS-based a1.large (2 core 4GB) and a1.xlarge (4 core 8GB) [to emulate GitLab on a Raspberry Pi]. Both gave [a target cost factor of] 11 [if you want the application to respond within] 250ms and 12 for 350ms

Changing the cost factor should then seamlessly apply to new passwords while existing will continue to compare against the old cost.

If we want opportunistic re-hashing of existing passwords on sign-in we can build that in like we did with PBKDF2+SHA512.

removed security-backlogneeds-input label

changed the description

I think it is relatively safe to update this to config.stretches = 12 to match upstream.

If we are worried, we can try introducing this with a feature flag. Though we will probably have to monkey patch Devise::Encryptor a little to be able to pass a different work factor into :BCrypt::Password.create depending on a feature flag.

added DataSecConsult label

assigned to @boconnor

I am going to take on writing this code, because it looks like we have broad agreement that we should improve this storage to current standards. In order to prevent hammering the inboxes of everyone who has commented on this ticket as I do the development and related work, I've created this new ticket in the DataSec board to track the work.

@boconnor Thanks for your interest in taking on this issue! Can you help me understand what the end-user facing impacts of this change will be? We have a formal process for deprecations and breaking changes, so we will need to follow those if the change is significant enough. breaking change are also very disruptive and we try to avoid them, so I'll be interested to hear if this would be classified as such. If there is a way we can still make this security enhancement without breaking change, it is much more likely that it can happen. Looks like Thong made a comment that it is "relatively safe" here but since being able to authenticate is critical to GitLab, I'd like to be sure.

I'd also like to hear from the AppSec/ProdSecEng team on how much of a security benefit this will be, so that we can decide if the benefit is worth the effort.

cc @adil.farrukh @nmalcolm @mloveless

@hsutor The main benefit is that we can claim we did it and we're "more secure". If we've lost a lot of customer deals due to the fact that a competitor has implemented it and we haven't, that addresses the concern.

Quickly putting on my threat modeling hat, I'd say from a security risk perspective the protection it provides involves password cracking which involves malicious access to the password hashes. The attacker would have to crack the password then attempt to authenticate and deal with whatever 2FA is in place. Granted if an attacker gains access to the password hashes, this implies a level of intrusion where the last thing we should be worried about is the hashes.

If it can be easily implemented without too much breaking change (note @nmalcolm's reference above to "opportunistic rehashing") I'd say we go ahead and do it, especially from a "de facto standard" perspective. But at this point I'd say we'd be in much better shape security-wise if we made 2FA standard for any and all accounts including the free ones on .com.

@hsutor This isn't a breaking change, and there will be no user-facing impacts. The workflow will be:

User enters their password.
Devise compares the submitted password to the stored, hashed password.
If the comparison succeeds and the stored, hashed password is not hashed at the new target work factor, re-store the submitted password using the new work factor.

So it's a one-way ratchet in concept; note that the existing (Devise) code will automatically pick up the work factor from a given stored password hash, so there is no problem with "old" and "new" passwords intermingling in the database.

The question of a pseudo-deprecation is a follow-on action. We could, for instance, decide that after a certain amount of time after the proposed code lands in production, we will mark all not-upgraded passwords as expired, and require reset-on-use, or reset-before-use (on an attempted login, it redirects the user to the forgot password flow). That, however, is a separate change from this one, and (as you note) requires a much more extensive discussion with comms, product, and other stakeholders.

But at this point I'd say we'd be in much better shape security-wise if we made 2FA standard for any and all accounts including the free ones on .com.

I would welcome this! However, it's a separate change that I'm not proposing making myself at this time.

por que no los dos?

we shouldn't scope creep @boconnor here or stop this work in progress, but making 2FA standard for all accounts is a high impact idea to also pursue. @mloveless if there isn't already an issue logged for MFA All The Things , would you do the honors of creating one so the idea isn't lost in this thread?

@hsutor

I was delighted to find that the opportunistic upgrade functionality already existed in the code, thanks to the hard work done by the PBKDF2 implementation team. Accordingly, then, I was able to make a quick extension to that functionality to check if the bcrypt work factor (stored as stretches in the Devise system) in a stored password was different than the new target. I tested it in my local GDK and it upgraded the user's stored password hash transparently on login. It is quite short:

(Please forgive the screenshot in lieu of a standard diff; I'm fighting a bit with the GDK in a box, and so I deleted the branch I pushed to the wrong place rather than leaving it up over a holiday.)

Please note that this code is not complete, because it doesn't handle the PBKDF2 use case, and we will of course want to add additional checks, tests, etc. (This is why I didn't generate an MR.) That said, I hope that this gives a sense of the flow I'm hoping to achieve.

In addition, I intend to do some testing on equivalents of our production servers to check if 14 is the correct work factor for our needs. This is a benchmarked question; my understanding is that one should target appx. 1s as the time to generate a hash (as an output of UX studies). I intend to bring both the citations for that assertion and the data to the eventual MR, because I don't want to make a change like this based just on my faulty memory. :-)

@eelrebmyk You can read the latest on our 2FA by default efforts on this thread. It will give you an idea of who is involved and the work done so far.

@boconnor Thanks for the updates! Since this is in active work, I'll put it in the current milestone with a workflowin dev label. With no breaking change , we don't need to go through the processes for that. This should probably be put behind a feature flag, @adil.farrukh can you confirm?

FYI re company wide 2FA, a placeholder/capture issue was created for the curious: https://gitlab.com/gitlab-com/gl-security/security-department-meta/-/issues/1917+

@hsutor

This should probably be put behind a feature flag, @adil.farrukh can you confirm?

Yes, as this touches access, we should add a FF, and enable it internally (for a few GitLab team member groups) before rolling it out to production.

Will do.

changed milestone to %17.8

added workflowin dev label

mentioned in issue #511397

This ticket has the results of a quick bcrypt benchmark on our two types of production systems for .com, along with the code I used. The way to pick a work factor time I have used in the past is that it should be the highest work factor that's less than one second per login, in which case the answer is a work factor of 14 (906ms/hash); however, I could also see an argument for 13 (453ms/hash). (As I understand it, the UX reason for "one second" is that's roughly the line below which the hash time won't dominate the login process; if it takes longer than that, the login process "feels slow.")

As a point of reference, I understand that a standard human eye blink takes about 200ms.

@hsutor do you have a preference, opinion, or alternately, a pointer to any existing UX research on login speed we might already have? (We can also adjust this during the feature flagged time, of course, so the decision is set more in Play-Doh than stone.)

I've opened this MR with this change. @hsutor I know you have reviewers in mind, and I welcome your and their feedback!

mentioned in merge request !177154 (merged)

marked this issue as related to #511397

added #520160 as child task

changed milestone to %17.10

mentioned in merge request !182474

Increase bcrypt work factor to match current standards

Action items

Designs

Child items ...

Activity

Increase bcrypt work factor to match current standards

Action items

Relates to

Activity