Engineering Discovery - Bring SAST and Secret Detection scanners to Core

The purpose of this issue is to perform engineering discovery in support of Bring SAST and Secret Detection scanners to Core and to document those findings.

Recommendation

Final recommendation is to bring all of the SAST analyzers (including Secrets) to Core at once.

Pro: We have approval to bring them all at once.
Pro: It will be less code/work to bring them all at once.
Pro: It will be less confusing to our customers in regards to which analyzers are in Core and which aren't.
Pro: We can turn off all the analyzers for Core within a day because we control the release cycle for the analyzers. [1]
Pro: We have a good idea of what we need to keep an eye on, performance wise.
Con: As opposed to the approach of just bringing a few analyzers initially, it increases the impact of the artifact download issue.

[1] In order to roll back the change, for the SAST analyzer, as well as each of our 14 analyzers, we would need to add code to main.go of each project similar to:

	if !strings.Contains(os.Getenv("SAST_DEFAULT_ANALYZERS"), "sast") {
	  os.Exit(3)
	}

Plan

Implementation Plan

Update SAST vendored template

[-] Add code to create $FEATURES environment variable from $GITLAB_FEATURES before it is unset so that we can use it as a feature flag of sorts in the case we need to rollback
[-] Remove check for /sast/ in features for DIND version
[-] Remove check for /sast/ in features for each of the analyzers for the DIND-free version. Example of line to change for bandit
[-] Add support for downloading artifact

Update documentation
Ensure rollback plan for SAST and at least one individual analyzer is viable and ready in case it is needed

Rollback Plan

Add check for /sast/ in $FEATURES to SAST analyzer and release new version
Add check for /sast/ in $FEATURES to each of the individual analyzers and release new versions of each as they are ready

Test Plan

Test analyzers as a core user

[-] Verify Capability table from Epic
[-] Test $SAST_DISABLE_DIND == 'true'
[-] Test $SAST_DISABLE_DIND == 'false'

Test analyzers as an ultimate user

[-] Verify Capability table from Epic
[-] Test $SAST_DISABLE_DIND == 'true'
[-] Test $SAST_DISABLE_DIND == 'false'

Findings

A test plan

A) I don't think testing will have to be too involved. But, if we only enable a few scanners, it may get more involved.
B) Make sure that the things that shouldn't show, don't show.
C) Make sure we aren't making calls for Secure report resources when we shouldn't - I'm not sure how/why/where we would, but good to verify.
D) Make sure artifact workaround doesn't break anything for Ultimate customers.

Backend code changes necessary to make this happen

A) I think we will just have to change the SAST vendored template and maybe the SAST analyzer. No need to change each individual scanner as far as I can tell.
B) We can easily turn on one scanner at a time for the DIND-less version. What would it take to do the same thing for the DIND-ful version? Maybe we could move the check of $GITLAB_FEATURES in to SAST itself? Note: See the Implementation plan for more details, but we can't rely on $GITLAB_FEATURES, but the general concept still holds.
i) We could maintain the list of analyzers we're bringing to Core in the SAST analyzer and if we do not find "sast" in $GITLAB_FEATURES, in the main function of main.go, we could do something like os.Setenv("SAST_DEFAULT_ANALYZERS", "bandit, brakeman") (assuming we're bringing bandit and brakeman to Core) to limit it to the Core analyzers.
C) Changes to the SAST analyzer can be released within a day, but changes to the SAST vendored template will take up to a week for gitlab.com and happen on the monthly release cadence for self-managed customers.
D) There is not a feature flag like capability for this. (I'd be happy to be wrong about this.)

Frontend code changes necessary to make this happen

A) I don't think there will be any frontend changes in the MVC.

Potential concerns or blockers

A) We can easily turn on one scanner at a time for the DIND-less version. What would it take to do the same thing for the DIND-ful version? Maybe we could move the check of $GITLAB_FEATURES in to SAST itself? Note: See the Implementation plan for more details, but we can't rely on $GITLAB_FEATURES, but the general concept still holds.
B) Will there be performance/load/stability concerns with the increase in usage of our scanners? We can take some steps to keep an eye on this:
i) We shouldn't have to store the data to the database because Core through Premium customers won't be able to access the data that way, but we need to verify that.
ii) We should also keep an eye on our S3 consumption, as all scans will at least generate artifacts that will have to be stored somewhere.
iii) We should add an issue to update Periscope to have metrics dedicated to Paying vs Free users. We should also take this opportunity to fix the current dashboard, which seems broken.
C) This will turn on for any customers that have SAST added to their .gitlab-ci.yml. AutoDevOps adds an entry for SAST, and up until now, leaving it there does no harm (or good) for Core through Premium customers. I don't know how to tell how many customers may have left an entry for SAST that will start running when we roll this out.
D) Will there be an easy way to turn this off? Not really. See 2. D) for more info.
E) The download "bug" with the workaround that Olivier has detailed
i) - Will that duplicate storage of artifacts and if it does, will that only be temporary and be something we can revert once the underlying "bug" is fixed?
ii) - If that is all correct, is that acceptable?
iii) - Maybe it will be necessary to only do a few scanners to start with along with this workaround? and by the time the workaround is no longer necessary, we'll be ready to turn on more scanners?
iv) - Related issues scheduled for %12.6 to keep an eye on: #12274 (closed) and #4189 (closed).
F) If we only turn on a subset of scanners and a repo on Core contains code for multiple languages that we support where some are covered for Core users and some are not, will that be confusing/misleading to the customer? They may think all their code is covered when only some code is.

Any necessary deprecation notices

A) Slightly unrelated, but can we update the SAST template now to better support moving Secrets scanning out of SAST? This change for DIND-less version should be very achievable, not so sure about DIND-ful version.

Detail documentation that will need to be updated

A) I think we should also do a blog post about working with our scanners in general, it wouldn't be specific to bringing SAST to Core, but as a result of bringing SAST to Core, a lot more folks will be using SAST, so it would be great to go over some of the things that aren't clear/obvious about using the scanners & templates. All the pieces are documented, but it isn't obvious what ways those pieces can be combined.
B) Indicate which scanners are covered in Core if we don't do all of them - should help with concern about us only scanning a subset of code in a Core repo.

PM will also need to produce the following deprecation notices for Secret Detection.

Secret detection will be decoupled from SAST, meaning customers who only include the SAST vendored template will lose secret detection capabilities (breaking change).
One of the underlying analyzer for Secret Detection, gitleaks, needs to be updated to the most recent version. This is also a breaking change.

Edited Dec 07, 2019 by rossfuhrman