Support exit codes in Security Policies

added Category:Security Policy Management GitLab Ultimate devopsgovern direction groupsecurity policies sectionsec typefeature labels

changed the description

mentioned in issue #464149

added to epic &13617

mentioned in epic &13371

@thiagocsf Would you have input on Policies establishing this more generally as a contract with analyzers? I discussed this a bit with @sam.white.

With Scan result policy is blocking MR when all DS a... (#464149) as an example of a single case, there are likely many others known and unknown. A few we came up with:

DAST requires application to be deployed to staging before it scans. Lots can go wrong. Has to be deployed, up and running, has to start/connect. Lots of interdependencies, lots of nuance.
SAST for languages that don’t have to compile is great. But some require compiled binary, how do we give that to analyzer, what if it can’t find it, what if code didn’t compile and not a security scan problem but code didn’t compile.
Repo without any code in a supported language, do you really want to run SAST if nothing to scan? Marketing/docs sites?

Perhaps for analyzers, success codes can also be produced to state essentially that the project has the proper/expected configuration or requirements to enforce scanning (e.g. all rules:exists are achieved) but there isn't any change that requires scanning.

This could allow policies to ensure enforcement only where necessary, dynamically based on the feedback from the analyzers. Today, however, the policy behavior can be pretty inconsistent as there are so many different behaviors across the analyzers.

I may be well out of my depth here, but raise it to see if we can establish a more reliable pattern to improve compatibility/integration with Secure. This may help us avoid squashing each bug one by one and to more dynamically support changes in analyzers. WDYT?

We could start iteratively with the Dependency scanning use case, but this could also give us a path to address other cases as the come up - the expectation being that we may want to introduce more error messaging in the analyzers if we encounter cases that aren't well handled.

cc @pcalder as we discussed compatibility briefly, @alan, @johncrowley, @connorgilbert, @smeadzinger and @sarahwaldner

Repo without any code in a supported language, do you really want to run SAST if nothing to scan? Marketing/docs sites?

@g.hickman For SAST or Dependency Scanning, where there is nothing to scan, it would be great to have a dummy, empty job that does not run any script but indicates that a given scan will be executed when specific files are available in the repository. We can then evaluate proper security policies with the information that a scan was configured in the CI configuration. For now, we assume that when there are no security artifacts, the scan was not performed, which is not valid in these cases. Perhaps we need to collaborate on this with devopsverify teams, as it involves extending the current CI configuration.

For other cases that you have specified, simple error codes would probably help, although we need to be consistent across analyzers, so we could handle some basic cases by default in security policies.

For SAST or Dependency Scanning, where there is nothing to scan, it would be great to have a dummy, empty job that does not run any script but indicates that a given scan will be executed when specific files are available in the repository. We can then evaluate proper security policies with the information that a scan was configured in the CI configuration.

@alan Is there a reason some status code couldn't be generated so we can consistently use this pattern? Is there an added benefit to a dummy job?

For now, we assume that when there are no security artifacts, the scan was not performed, which is not valid in these cases. Perhaps we need to collaborate on this with devopsverify teams, as it involves extending the current CI configuration.

Can you elaborate? I guess because we'd need CI to emit the message instead of the analyzers?

For other cases that you have specified, simple error codes would probably help, although we need to be consistent across analyzers, so we could handle some basic cases by default in security policies

Great points and feedback!

For SAST or Dependency Scanning, where there is nothing to scan, it would be great to have a dummy, empty job that does not run any script but indicates that a given scan will be executed when specific files are available in the repository. We can then evaluate proper security policies with the information that a scan was configured in the CI configuration.

@alan Is there a reason some status code couldn't be generated so we can consistently use this pattern? Is there an added benefit to a dummy job?

@g.hickman Status code can be generated only when the job is executed, although because of the exists rules in the template (https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/lib/gitlab/ci/templates/Jobs/SAST.gitlab-ci.yml#L197, https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/lib/gitlab/ci/templates/Jobs/Dependency-Scanning.gitlab-ci.yml#L61), when there is nothing to scan, a job is not executed, so we will not receive any status code.

For now, we assume that when there are no security artifacts, the scan was not performed, which is not valid in these cases. Perhaps we need to collaborate on this with devopsverify teams, as it involves extending the current CI configuration.

Can you elaborate? I guess because we'd need CI to emit the message instead of the analyzers?

We need to have a way to get a list of jobs (with information about their artifact types) that were not executed because they were filtered out by rules:exists. Essentially we want to know if the job is properly configured in the project's .gitlab-ci.yml or not. This is needed by both the security policies and Security configuration tab. To achieve it, we could emit some kind of message instead of analyzers or do something else to store a list of filtered-out jobs. That's why it requires collaboration with Verify because it involves modification of .gitlab-ci.yml or at least its parsing logic.

@alan Makes perfect sense, thanks!

Curious to hear any other feedback / opinions as well!

@alan @g.hickman

it would be great to have a dummy, empty job that does not run any script but indicates that a given scan will be executed when specific files are available in the repository.

Just on this specific point, I have these reactions:

Candidly, it feels like a hack. I can't think of why we would do this except as a side-channel signal about whether scans would run.
Users would (rightly) ask us questions about useless jobs showing up in their pipelines.
I'm not sure it solves the problem of an adversary tampering with a given SAST job or something; do we know that the dummy job succeeding means no other jobs were tampered with?

I might be misunderstanding the idea though.

Candidly, it feels like a hack. I can't think of why we would do this except as a side-channel signal about whether scans would run.

Users would (rightly) ask us questions about useless jobs showing up in their pipelines.

I'm not sure it solves the problem of an adversary tampering with a given SAST job or something; do we know that the dummy job succeeding means no other jobs were tampered with?

I agree and have same feelings about it, also it is using some CI minutes which is not desirable.

Just a thought here, it might be help to separate this out into two separate problems with two separate solutions.

Problem 1: As a AppSec or Compliance engineer, I need to identify projects that do not need scans to run (typically because certain files are absent) so that I don't unnecessarily enforce security policies against those projects.

As noted above, it does feel like exit codes would not be the ideal solution for this problem. It would be really nice if GitLab could look for certain files on push and classify projects based on their file contexts for easy reference. For example, could we tag projects that contain .js files as JavaScript - implying that it needs JavaScript SAST scanning? Similarly we could look for Dockerfile and requirements.txt and other common files to determine whether or not Container/Dependency Scanning are necessary.

Problem 2: As an AppSec or Compliance engineer, I need to easily troubleshoot and fix common scanner errors in projects that legitimately do need the scans to run so that I can keep my projects in compliance.

This seems like a good use of exit codes. Since the job is already running anyway, it won't cost anything extra to output an informative exit code. Right now our exit codes are not well standardized across the various analyzers. Our product would benefit a lot if we reviewed our most common failure reasons, classified them based on the type of action that the end user would need to take to fix it, and then standardized around a set of consistent error codes across the analyzers.
As an example:
- Exit code 2 = Project failed to compile due to prerequisite packages not available in the scanning image. Remediation: Install the required packages in a before_script (common failure reason for Dependency Scanning)
- Exit code 3 = Project failed to find the scan target. Remediation: make sure that the container image or binary exists at the specified path. (may exist with Container Scanning or any SAST analyzer that scans a compiled binary file as an input)
- Exit code 4 = App is not deployed to the expected URL. Remediation: Check that your review app is running and that the scan is pointed at the correct review app URL. (Potential issue with DAST, or API Security)

cc: @khornergit

Exit codes are a boring, albeit limited, solution that applies only to CI jobs. I think it's worth implementing and documenting their meaning, but I agree with @sam.white that they're not a good fit for explaining why a "scanner didn't run" (Problem 1).

I would like to explore the possibility of introducing an API that exposes build metadata as an improvement to integration between policies and analyzer results.

This would use the MR, rather than a CI pipeline, as context which means it could work with both CI-based analyzers and Continuous Vulnerability Scanning.

The content for the first implementation of the build metadata API would fit our own purpose: security scanner configuration and execution result.

These could take the form of an attestation or an "Analyzed SBOM" (see types of SBOM).

Either way, the policy verification process would use this API to obtain security scanning information:

Did it or did it not run?
If it didn't run, why?
If it did run, what were the settings? e.g.:
- What was scanned?
- What was it scanned for?
- What tools/versions were used for this?
- Where can the results be found?

What's more exciting is that it opens the possibility for custom integrations by users (e.g. the build metadata serves as audit records).

A CI scanner would pass this information to GitLab, while CVS would generate it internally.

WDYT, @g.hickman, @connorgilbert?

/cc @johncrowley

Just a thought here, it might be help to separate this out into two separate problems with two separate solutions.

@sam.white I think that's a good point. I did create Improve compatibility between security policies... (&14119) to explore options broadly and was considering exit codes as one potential approach to improving compatibility. But perhaps this epic splits into two difference cases.

As noted above, it does feel like exit codes would not be the ideal solution for this problem. It would be really nice if GitLab could look for certain files on push and classify projects based on their file contexts for easy reference. For example, could we tag projects that contain .js files as JavaScript - implying that it needs JavaScript SAST scanning? Similarly we could look for Dockerfile and requirements.txt and other common files to determine whether or not Container/Dependency Scanning are necessary.

Interesting thought - isn't that what rules:exists essentially does - determine when to run based on files existing? But perhaps the suggestion is that we do this in another way to classify projects. I could see additional value based on this as well. Customers definitely want help classifying projects and understanding which are more critical, when they need scans, etc.

I think policy scoping technically gives users a solution to scope projects in/out of enforcement of a given policy, so it's possible for a compliance or appsec team to classify which projects run Javascript and therefore should have Javascript SAST running, and then create a policy based on this. It just requires too much effort to do this upfront today at scale.

rules:exists exists already and is how our analyzers today determine when they should run, so I'm struggling to understand how we can't or shouldn't use that to some extent. Are we saying this isn't sufficient and we don't have a better way today?

If it's not the best way to do it, that's fair, but I'm trying to understand what we have that logic for if not to determine when scans should run

Problem 2: As an AppSec or Compliance engineer, I need to easily troubleshoot and fix common scanner errors in projects that legitimately do need the scans to run so that I can keep my projects in compliance.

It seems like we have consensus so far on this use case and it's benefits. I think policies could also use the outputs to make better decisions, and while it may not solve for all cases, it will help if and when significant customer issues surface (such as the DS rules:exists limit being exceeded).

Perhaps we can use this issue to carry the scope for exit codes further and split out discussion around Problem 1 in another issue/epic. I'll add Problem 2 in the description here.

Either way, the policy verification process would use this API to obtain security scanning information:

Did it or did it not run?

If it didn't run, why?

If it did run, what were the settings? e.g.:

What was scanned?

What was it scanned for?

What tools/versions were used for this?

Where can the results be found?

@thiagocsf This sounds like exactly the questions policies would like the answer to, whichever technical approach makes sense for us. I'm curious if there's a boring solution while we work towards a longer-term, ideal implementation? Offhand, I think a metadata API sounds super useful. Perhaps it wouldn't be too much effort if we agreed to go this route?

@g.hickman, I noticed @fcatteau also made a proposal in the parent epic. That suggestion is to examine the reports that the CI job should have, and respond accordingly.

My proposal above glosses over the intricacies of trusting the build process, which Fabien's addresses. I don't think our suggestions are mutually exclusive, but spiking these ideas is probably the best way to get some answers.

He and I talked about this in our 1:1 yesterday, and concluded that any scan execution based on the build can ultimately be tampered by developers in what becomes an arms race - we'll continue to play whack-a-mole with circumvention methods.

We could acknowledge that we can't guarantee execution, and focus instead on auditability. We document the caveats and consider them in our designs.

Build SBOM can be tampered with by the developer. Likewise, scan execution in the build can be tampered with by the developer. We make reasonable efforts to prevent this, but don't guarantee it.
Source SBOM and Analyzed SBOM when created outside of the build by trusted tools cannot be tampered with. E.g. SAST pre-receive secret detection, CA Continuous Vulnerability Scanning.

What I like about the build metadata API approach is that it can expose any combination of the above, with the required caveats when present.

E.g. 1: "this MR was scanned with tools X, Y, which found 0 vulnerabilities (low assurance)". The "low assurance" part could be because the only results are from a build SBOM -- which could be tampered with.

E.g. 2: "this MR was scanned with Z, which found 0 vulnerabilities (high assurance)". The "high assurance" is because Z relied on the Source or Analyzed SBOMs.

This comment is getting long, but there are still lots to consider (e.g.: how can we run scans outside of the build?). I'll stop for now and, if Sec thinks this idea holds water, I'd be happy to collaborate on a spike to flesh it out some more.

My proposal above glosses over the intricacies of trusting the build process, which Fabien's addresses. I don't think our suggestions are mutually exclusive, but spiking these ideas is probably the best way to get some answer

@thiagocsf That makes sense, I appreciate the thoughtful responses from both of you! Should we create spike issues in this epic or is that something you'd like to create on your end somewhere? This seems like a good approach and happy to collaborate.

concluded that any scan execution based on the build can ultimately be tampered by developers We could acknowledge that we can't guarantee execution, and focus instead on auditability

I think we should be able to consistently enforce scans to run using scan execution or pipeline execution policies. If other circumventions are there, we could caveat these in docs (such as tampering with SBOMs) and address those through separately. Maybe we can highlight at least a few of the challenges there and make sure we're on the same page about any limitations?

From a compliance standpoint, it can sometimes be enough to say that the scans successfully ran in a project and to be able to show proof (e.g. in a recent discussion with a customer they wanted to enforce DAST at least every 6 months in each project). It doesn't always make perfect sense, but that's compliance.

We also want to be sure we can do it in a way that doesn't disrupt downstream projects, so if we are more confident about when to run scans based on things like the files in the project or some project tags we can rely on, that does provide value to customers. If there is tampering, we could provide more tools for those cases in the future or find ways to audit and detect this behavior potentially.

which found 0 vulnerabilities (low assurance)

In this case, I'm unsure why we'd depend on this data at all if we feel the assurance is low. I'm not sure I'd feel comfortable adding that from a product standpoint as it signals you can't really rely on our analyzers, though I fully understand the point.

...

From a security policies standpoint, I think there is something we keep coming back to with MR approvals being able to depend only on results from jobs enforced by scan execution policies or pipeline execution policies, so this could also be something to POC. Perhaps it would work in combination with some ideas suggested, such as @fcatteau's in this thread. cc @alan

I also see merit in having some data from a metadata API.

I think largely this is all going in a great direction and appreciate the different ideas and opinions. Let's try for a spike (or two?) and see what we can come up with?

Perhaps we can use this issue to carry the scope for exit codes further

Pulling this from my comment earlier. Can we also agree with this? I don't think anything in your response above rules out the benefit of using exit codes, does it?

I think we should be able to consistently enforce scans to run using scan execution or pipeline execution policies.

[..]

We also want to be sure we can do it in a way that doesn't disrupt downstream projects

@g.hickman That's the dilemma really. As explained in Types of SBOM documents, SBOMs of build type are highly dependent on the build environment in which the build is executed. Right now this causes compatibility issues and it's disruptive. Dependency Scanning provides CI variables to configure the "build" instructions, but this is not flexible enough. Also, these CI variables are set for the entire group where the SEP is set, but projects might need different settings.

To address these compatibility problems, we have plans for letting users run the "build" instructions, that is the commands that export dependencies using build tools. (Dependency Scanning would only be responsible for processing these exports.) But then this requires cooperation from project maintainers who have write access to the CI config.

See #467039 (comment 1972839195) for details.

Again, we're only talking about SBOMs of build type.

Either way, the policy verification process would use this API to obtain security scanning information:

Did it or did it not run?

If it didn't run, why?

If it did run, what were the settings? e.g.:

What was scanned?

What was it scanned for?

What tools/versions were used for this?

Where can the results be found?

@thiagocsf This sounds like exactly the questions policies would like the answer to, whichever technical approach makes sense for us. I'm curious if there's a boring solution while we work towards a longer-term, ideal implementation? Offhand, I think a metadata API sounds super useful. Perhaps it wouldn't be too much effort if we agreed to go this route?

@g.hickman I agree, we already have security_scans to store information about performed scans. Currently we are creating this after pipeline is completed, perhaps we could create it before we execute pipeline, based on pipeline configuration and then analyzers would be ability to return information that could be stored there and then we could have some Metadata API as suggested.

From engineering standpoint, for this approach, we could start with Spike, collaborate with "devops::verify" and see what's currently possible and what requires additional work.

From a security policies standpoint, I think there is something we keep coming back to with MR approvals being able to depend only on results from jobs enforced by scan execution policies or pipeline execution policies, so this could also be something to POC. Perhaps it would work in combination with some ideas suggested, such as @fcatteau's in this thread. cc @alan

This is great, boring solution, that finally could connect MR Approval Policies with Scan Execution Policies. The effort to develop this is quite low, so we can definitely start with this one.

perhaps we could create it before we execute pipeline, based on pipeline configuration

I think it's worth exploring this approach, @alan. This information can be based on other sources too, like the policy itself (i.e. the MR widgets know ahead of time what is expected of the pipeline) and status of background tasks (e.g. license scanning).

@alan please let me know when you begin on a spike. I think we can help here as we are after a very similar thing with our upcoming work on Add compliance adherence requirement for each s... (&12661 - closed)

Thanks @fcatteau, a lot to take in there

I agree, we already have security_scans to store information about performed scans. Currently we are creating this after pipeline is completed, perhaps we could create it before we execute pipeline, based on pipeline configuration and then analyzers would be ability to return information that could be stored there and then we could have some Metadata API as suggested.

@alan Let's go with your proposal here as a starting point. We seem to have general consensus to spike it.

Can you create the spike and we can plan it?

We can separately continue in this issue to focus on handling the error codes.

@g.hickman, created Spike: Store analyzers results metadata to allo... (#471978 - closed). For now, I moved it to %17.4 for our team to take care of.

/cc @nrosandich

Thank you @alan!

mentioned in epic &14119

changed epic to &14119

changed milestone to %Backlog

marked this issue as related to #241342

I know there was a long discussion about standardizing analyzer exit codes at some point, and I remember posting a comment on such an issue... but I can't find it. Closest is Always generate secure scanning report regardle... (#241342).

mentioned in epic &12661 (closed)

mentioned in issue #471978 (closed)

added devopssecurity risk management label and removed devopssoftware supply chain security label

added security policygeneric label

Support exit codes in Security Policies

Release notes

Problem to solve

Intended users

User experience goal

Proposal

Further details

Permissions and Security

Documentation

Availability & Testing

Available Tier

Feature Usage Metrics

What does success look like, and how can we measure that?

What is the type of buyer?

Is this a cross-stage feature?

What is the competitive advantage or differentiation for this feature?

Links / references

Designs

Child items ...

Activity

Support exit codes in Security Policies

Release notes

Problem to solve

Intended users

User experience goal

Proposal

Further details

Permissions and Security

Documentation

Availability & Testing

Available Tier

Feature Usage Metrics

What does success look like, and how can we measure that?

What is the type of buyer?

Is this a cross-stage feature?

What is the competitive advantage or differentiation for this feature?

Links / references

Relates to

Activity