Research Spike: Security Orchestration Policy Architecture

Problem to Solve

We need an architecture design for Policies that supports the following:

1. Policies will need to be supported at the Instance, Group, and Project levels. Technically, the Instance is a parent group. We do not need to support the actual Instance (i.e., server installation) level.
2. Policies will need to support all scan types (SAST, DAST, SCA, Fuzzing, plus anything else that we might add in the future).
3. Policies will need to support a two-step approval process for any changes.
4. Policies will need to support full audit logging of any changes made to the policies.
5. Both Scan Schedule and Scan Result type policies will need to be supported.
   1. Scan schedule policies will need to support scheduling a policy as part of a pipeline, as part of a code commit, or as part of a regular schedule (daily, weekly, etc.). Scan schedule policies will need to produce a scan results artifact that can be viewed. Ideally, users will be able to optionally take other actions when a scan is started (for example, sending a Slack message).
   2. Scan result policies will need to evaluate the results of scans and take actions based on the findings. Examples of actions include creating an Alert in our Alerts Dashboard, sending a Slack message, or creating an issue in GitLab or Jira. This should be extensible to allow for more actions in the future. Scan result policies should be capable of looking at results from more than one scan job. For example, if SAST, DAST, and SCA are all run as part of the pipeline, ideally the results can be evaluated either independently or in aggregate.
6. Edits to policies need to result in EITHER commits to the database OR an MR to edit a file in a repo, not both. The current proposal is to make the database the single source of truth. More details on that below.
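
To make the scan result policy requirement more concrete, here is a minimal Python sketch of how findings from several scan jobs might be evaluated independently or in aggregate, with actions as pluggable callables. Every name in it (`Finding`, `ScanResultPolicy`, `record_alert`, the severity strings, the threshold rule) is hypothetical and chosen for illustration only. None of it reflects existing GitLab code or a decided design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List

# All names here are hypothetical illustrations, not GitLab APIs.


@dataclass(frozen=True)
class Finding:
    scan_type: str  # e.g. "sast", "dast", "sca"
    severity: str   # e.g. "low", "medium", "high", "critical"


# An action is any callable taking the policy name and the findings that
# triggered it. New actions (Slack, Jira, ...) would be added as callables.
Action = Callable[[str, List[Finding]], None]


@dataclass
class ScanResultPolicy:
    name: str
    severities: FrozenSet[str]  # severities that count toward the threshold
    threshold: int              # matching findings needed to trigger actions
    aggregate: bool             # True: count across all scans; False: per scan type
    actions: List[Action] = field(default_factory=list)

    def evaluate(self, findings: List[Finding]) -> bool:
        """Run the actions for each group of findings that meets the threshold."""
        matching = [f for f in findings if f.severity in self.severities]
        groups: Dict[str, List[Finding]] = {}
        for f in matching:
            key = "all" if self.aggregate else f.scan_type
            groups.setdefault(key, []).append(f)

        triggered = False
        for group in groups.values():
            if len(group) >= self.threshold:
                triggered = True
                for action in self.actions:
                    action(self.name, group)
        return triggered


# Example usage with a stand-in action that only records what it was given.
alerts = []


def record_alert(policy_name: str, findings: List[Finding]) -> None:
    alerts.append((policy_name, len(findings)))


pipeline_findings = [
    Finding("sast", "high"),
    Finding("dast", "high"),
    Finding("sca", "low"),
]

aggregate_policy = ScanResultPolicy(
    name="two-high-findings",
    severities=frozenset({"high", "critical"}),
    threshold=2,
    aggregate=True,
    actions=[record_alert],
)
per_scan_policy = ScanResultPolicy(
    name="two-high-findings-per-scan",
    severities=frozenset({"high", "critical"}),
    threshold=2,
    aggregate=False,
    actions=[record_alert],
)

aggregate_result = aggregate_policy.evaluate(pipeline_findings)
per_scan_result = per_scan_policy.evaluate(pipeline_findings)
```

In this sketch the aggregate policy triggers because two high-severity findings exist across SAST and DAST combined, while the per-scan policy does not, since no single scan type reaches the threshold on its own.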

Potential partial solution

One potential way to address requirement #6 above is to create a new function, owned by the Security Orchestration category, that determines whether or not a scan needs to be run. That function should then be called every time a pipeline job is run, regardless of the contents of the .gitlab-ci.yml file. To avoid running a job twice, the existing .gitlab-ci.yml scan parameters will no longer trigger a scan job. Instead, when a .gitlab-ci.yml file is edited, it will be parsed and updates pushed into the database as needed, so that the database remains the single source of truth for whether a scan should or should not run as part of a pipeline job.
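
The decision function described above could look roughly like the following Python sketch. It uses an in-memory stand-in for the policy database and walks from a project up through its parent groups to collect applicable policies. All names (`PolicyStore`, `ScanSchedulePolicy`, `scans_required`, the trigger strings) are hypothetical. The rule that inherited policies are simply unioned is also an assumption, since this issue does not specify how policies at different levels combine.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical names throughout; this is not existing GitLab code.


@dataclass(frozen=True)
class ScanSchedulePolicy:
    scan_type: str        # e.g. "sast", "dast", "sca", "fuzzing"
    trigger: str          # "pipeline", "commit", or "schedule"
    enabled: bool = True


class PolicyStore:
    """In-memory stand-in for the database acting as single source of truth."""

    def __init__(self) -> None:
        self._parents: Dict[str, Optional[str]] = {}
        self._policies: Dict[str, List[ScanSchedulePolicy]] = {}

    def add_namespace(self, path: str, parent: Optional[str] = None) -> None:
        self._parents[path] = parent
        self._policies.setdefault(path, [])

    def add_policy(self, path: str, policy: ScanSchedulePolicy) -> None:
        self._policies[path].append(policy)

    def policies_for(self, path: str) -> List[ScanSchedulePolicy]:
        """Collect policies from the project up through every ancestor group."""
        collected: List[ScanSchedulePolicy] = []
        current: Optional[str] = path
        while current is not None:
            collected.extend(self._policies.get(current, []))
            current = self._parents.get(current)
        return collected


def scans_required(store: PolicyStore, project_path: str, trigger: str) -> Set[str]:
    """Return the scan types that should run for this project and trigger.

    Assumption: policies from all levels are unioned. A real design would
    need an explicit rule for conflicts between levels.
    """
    return {
        policy.scan_type
        for policy in store.policies_for(project_path)
        if policy.enabled and policy.trigger == trigger
    }


# Example usage: one group-level policy and one project-level policy.
store = PolicyStore()
store.add_namespace("top-group")
store.add_namespace("top-group/app", parent="top-group")
store.add_policy("top-group", ScanSchedulePolicy("sast", "pipeline"))
store.add_policy("top-group/app", ScanSchedulePolicy("dast", "schedule"))

pipeline_scans = scans_required(store, "top-group/app", "pipeline")
scheduled_scans = scans_required(store, "top-group/app", "schedule")
commit_scans = scans_required(store, "top-group/app", "commit")
```

Because the function consults only the store, the pipeline would call it on every run and never read scan settings from the CI file directly, which is the point of keeping the database as the single source of truth.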

Note: Not all of these capabilities will be available for our initial MVC; however, we need to have an engineering architecture plan that supports the long-term direction.

Tasks to Evaluate

Determine feasibility of the feature
Evaluate whether to take advantage of the existing scheduled pipelines feature and, if so, what modifications it needs to meet the "two-step approval process" requirement.
Create an issue for implementation, or update the existing implementation issue description with the implementation proposal
Set weight on implementation issue

Risks and Implementation Considerations

Edited Nov 01, 2020 by Thiago Figueiró