SAST's secrets analyzer uses two upstream dependencies to do its work.
gitleaks
Trufflehog
We're currently shipping gitleaks v1.24.0. However, v3.3.0 shipped Feb 1, 2020. We need to update to the latest and greatest.
We're also shipping trufflehog v2.0.98. However, v2.0.99 shipped 6 May, 2019. We need to also update this dependency.
Addressing breaking changes
New versions of gitleaks remove one of the CLI options we rely on (see #12948 (comment 287252726)), so we will need additional logic that converts the passed ENV variable into a configuration option within gitleaks.toml.
For Gitleaks, this involves a couple of breaking changes we should address, including the deprecation of our entropy ENV flag (SAST_GITLEAKS_ENTROPY_LEVEL). It also means we should now support either a user-provided TOML config or fall back to one we package ourselves.
For Trufflehog, there's apparently no clear changelog between versions, which is a bit concerning, but the version bump appears minor enough that it shouldn't be significant.
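The Gitleaks config fallback described above (a user-provided TOML config, else the one we package) could look roughly like the sketch below. It is only an illustration: the file name `.gitleaks.toml`, the packaged path `/gitleaks.toml`, and the function `resolve_config` are hypothetical, not the analyzer's actual names.

```python
from pathlib import Path

# Hypothetical locations; the real analyzer may use different paths.
PACKAGED_CONFIG = Path("/gitleaks.toml")
USER_CONFIG_NAME = ".gitleaks.toml"


def resolve_config(project_dir: Path, packaged: Path = PACKAGED_CONFIG) -> Path:
    """Return the user's TOML config if the project has one, else the packaged default."""
    user_config = project_dir / USER_CONFIG_NAME
    if user_config.is_file():
        return user_config
    return packaged
```

Keeping the lookup in one function means the rest of the analyzer only ever sees a single config path, whichever source it came from.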
@zrice just curious, given this breaking change: what do you think would be the best path forward for us to continue supporting our ENTROPY setting? I suggested something like generating a TOML config, but perhaps you have insight into a better approach.
Good question. I'm thinking we could preserve the use of SAST_GITLEAKS_ENTROPY_LEVEL (https://docs.gitlab.com/ee/user/application_security/sast/#vulnerability-filters) by having the analyzer update the default TOML currently used in the secrets analyzer. The secrets analyzer would append a single rule containing the entropy level to the gitleaks TOML (v3.0.0+ of gitleaks removes the entropy CLI option).
Prior to 2.x of gitleaks, entropy values were a single value and not a range, so SAST_GITLEAKS_ENTROPY_LEVEL will have to be a range.
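As a rough sketch of that idea, the analyzer could turn the ENV value into a range and render a rule to append to the TOML. Everything here is an assumption for illustration: the helper names, the choice to treat a legacy single value as a lower bound with 8.0 as the upper bound, and in particular the `entropies` key and `"min-max"` format, which would need to be checked against the config schema of the gitleaks version we actually ship.

```python
from typing import Tuple


def parse_entropy_range(value: str, upper_bound: float = 8.0) -> Tuple[float, float]:
    """Parse "min-max" or a legacy single value (assumed to mean "min and above")."""
    value = value.strip()
    if "-" in value:
        low_text, high_text = value.split("-", 1)
        low, high = float(low_text), float(high_text)
    else:
        low, high = float(value), upper_bound
    if not 0.0 <= low <= high <= upper_bound:
        raise ValueError(f"invalid entropy range: {value!r}")
    return low, high


def render_entropy_rule(low: float, high: float) -> str:
    """Render a TOML rule fragment to append to the default config."""
    # ASSUMPTION: key names and range format are illustrative, not verified
    # against any specific gitleaks release.
    return (
        "\n[[rules]]\n"
        '  description = "Entropy from SAST_GITLEAKS_ENTROPY_LEVEL"\n'
        f'  entropies = ["{low}-{high}"]\n'
    )
```

For example, `render_entropy_rule(*parse_entropy_range("4.5-6.0"))` produces a fragment that could be appended to whichever TOML file the analyzer ends up using.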
However, I'm not convinced pure entropy is a good indicator of a secret. Combining entropy + regex could work but even with the current implementation in gitleaks it leaves a lot to be desired. I had an interesting PR opened recently in gitleaks which I think is a great way to really target generic credentials: https://github.com/zricethezav/gitleaks/pull/333. This PR introduces a change that would allow users to target entropy ranges based on regex groups.
Since the default value of SAST_GITLEAKS_ENTROPY_LEVEL was 8, that means secrets detection should never detect a secret, since 8 is the upper bound of the Shannon entropy.
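To make the upper bound concrete, here is a minimal sketch of Shannon entropy over bytes, measured in bits per byte. It is not gitleaks' implementation, just the textbook formula, and the function name is made up for the example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = -sum(p * log2(p)) over the observed byte frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

The value only reaches 8 when all 256 byte values occur equally often. A hex token uses at most 16 distinct characters, so it cannot exceed 4 bits per character, and a base64 string cannot exceed 6, which is why a threshold of 8 would effectively match nothing.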
If we're not convinced there's value in preserving SAST_GITLEAKS_ENTROPY_LEVEL, we could consider it a candidate for removal in %13.0, #207066 (closed). If so, we should announce the deprecation in the current release in order to prep for that.
@twoodham @sethgitlab We have many tools that need periodic refreshes to the latest version. Do we have a single source of truth today for which tool versions we have and what the latest is?
@stkerr We do not, unfortunately. The problem is that it would add yet another place to keep constantly updated between each subproject and the main docs, so there will never be an SSOT unless we explore an approach such as https://gitlab.com/gitlab-org/gitlab-ee/issues/11251. That issue was closed because no feasible way of doing so automatically was found.
To achieve this, we could take the same manual steps of keeping the versions updated in the SAST page, which seems the most logical place IMO, but there will always be the added risk of things becoming out of sync.
Got it. How do we know which versions of the tools we currently use, then? Is that determined by looking in the code base, then someone noticing an update on the scanning project's home page?
If so, I'm concerned that we'll struggle to stay up-to-date efficiently, as efforts will be ad-hoc, rather than streamlined.
> Got it. How do we know which versions of the tools we currently use, then? Is that determined by looking in the code base, then someone noticing an update on the scanning project's home page?
Yes, that's currently determined by looking at the codebase.
> If so, I'm concerned that we'll struggle to stay up-to-date efficiently, as efforts will be ad-hoc, rather than streamlined.
Yes, that's a concern, and it would be great to improve that process if we can find a time/cost-effective way of doing so.
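One cheap way to make this less ad-hoc might be a small check that compares the versions we pin against the latest known upstream versions. The sketch below uses the versions mentioned in this issue; the two dictionaries and the helper names are hypothetical, and how the "latest" values would be collected (manually or from upstream release pages) is left open.

```python
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "v2.0.98" into (2, 0, 98) so versions compare numerically."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Hypothetical manifests, filled with the versions named in this issue.
SHIPPED = {"gitleaks": "v1.24.0", "trufflehog": "v2.0.98"}
LATEST = {"gitleaks": "v3.3.0", "trufflehog": "v2.0.99"}


def outdated(shipped: Dict[str, str], latest: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Map each outdated tool to its (shipped, latest) version pair."""
    return {
        name: (version, latest[name])
        for name, version in shipped.items()
        if name in latest and parse_version(version) < parse_version(latest[name])
    }
```

This would not remove the risk of the manifest itself going stale, but it would at least give one place to look instead of each subproject's codebase.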
I have not had a chance to put any effort towards this issue, and it is very unlikely this will ship in %12.4.
The change to gitleaks involves breaking changes, but trufflehog does not. If there's any interest in doing so, we may still be able to update just trufflehog in %12.4 and then update gitleaks in %12.5.
@NicoleSchwartz - I'd like to push this item out to either %12.7 or %12.8, with %12.8 being preferred. I have @rossfuhrman chasing too many items and this one isn't getting any traction in %12.6. Any objections?
@zrice - handing this one over to you. I know this is wrapped up in the implementation plan for other issues, so I'm assuming you'll pick this one off naturally as part of your work there.
I'm also upping the weight assigned to this issue since the original weight of 2 was for a scale we are not using any more.