Enable multi-core by default for Advanced SAST
See #514156 (comment 2310513496) for discussion that led to this issue.
The decision was to ship multi-core analysis in %17.9 as an opt-in feature, then enable it by default in %17.10. This timing is desirable because it allows us to improve customer outcomes sooner, while also landing the new default well in advance of 18.0, when Advanced SAST will be enabled by default in the CI/CD template.
Goal
The core requirement is to enable multi-core operation by default, but to do so in a way that is safe.
Ease of operation
Ideally we would not require that users specifically do anything to take advantage of this capability. Generally, we want things to work well by default and not require too much tweaking; see, among others, our configuration principles.
For example, it would be ideal to automatically assess how many cores are present, and use that level of parallelism. However, it is acceptable to document a suggestion that large projects should configure their pipelines to use a larger runner size. (We should not change the default runner size for SAST jobs at this time.)
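As a sketch of what that documented suggestion might look like, a large project on GitLab.com could opt into a larger hosted runner by tagging the job. The job name below matches the Advanced SAST job in the SAST CI/CD template, and the tag is a current GitLab.com hosted Linux runner tag; both are shown for illustration only:

```yaml
# Illustrative only: route the Advanced SAST job to a larger
# GitLab.com hosted runner by overriding its tags.
gitlab-advanced-sast:
  tags:
    - saas-linux-large-amd64
```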
Safety needs
We specifically need to ensure that we do not cause reliability problems on the default runner for GitLab.com, which has 2 cores and 8 GiB of RAM. See hosted runners on Linux.
We can (and likely should) make conservative approximations to stay safe without user action. For example, we might set a rule that says "use up to 4 cores unless the user specifically tells us to use a different number" or "max out at 4 cores by default if we detect we are on a Kubernetes runner, regardless of what the machine says is available".
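A conservative rule like the ones above can be sketched as follows. The 4-core cap and the `requested` override are illustrative placeholders, not the analyzer's actual interface:

```python
import os


def default_parallelism(requested=None, cap=4):
    """Pick a safe worker count.

    Honor an explicit user request if one was given; otherwise use the
    detected core count, capped at a conservative default (4 here, as an
    illustrative value) so we stay safe on small runners without user action.
    """
    if requested is not None:
        return max(1, int(requested))
    detected = os.cpu_count() or 1
    return max(1, min(detected, cap))
```

The key design point is that the cap only applies to the automatic path: an explicit user setting is always respected, even above the cap.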
Considerations
- There are cases where the number of cores reported exceeds the number actually available. For example, Kubernetes runners may report all of the host machine's CPUs, even when the job is subject to lower resource limits. See comment.
- Using more parallelism imposes a memory cost, presenting a risk of OOM or memory stress if we schedule too many parallel instances. See comment.
- Users may expect that running a large number of parallel instances will divide the runtime by that factor (e.g. running 1 instance takes time *t*, so 8 instances will take roughly *t*/8). But this is not true, because of the way the parallelism is implemented. We will likely find that diminishing returns set in at a fairly low number of cores.
- We should document this general fact (that adding *n* cores does not reduce runtime by a factor of *n*) in the documentation for this configuration option.
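The considerations above can be sketched together: cap the worker count by the cgroup CPU quota (so Kubernetes limits win over the reported CPU count) and by a per-worker memory budget. The `cpu.max` parsing follows the cgroup v2 format; the per-worker memory figure and the Amdahl-style speedup model are illustrative assumptions, not measured Advanced SAST behavior:

```python
def cgroup_cpu_limit(cpu_max):
    """Parse a cgroup v2 cpu.max value: "<quota> <period>" or "max <period>".

    Returns the effective CPU limit as a float, or None when unthrottled.
    On a Kubernetes runner this limit can be far below the reported CPU count.
    """
    quota, period = cpu_max.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def safe_worker_count(reported_cpus, cpu_max, mem_available_mib,
                      mem_per_worker_mib=1024):
    """Combine the CPU quota with a per-worker memory estimate.

    mem_per_worker_mib is an illustrative placeholder, not a measured
    figure for the analyzer. The result is the most conservative of the
    reported CPUs, the cgroup quota, and what fits in memory.
    """
    limit = cgroup_cpu_limit(cpu_max)
    cpus = reported_cpus if limit is None else min(reported_cpus, max(1, int(limit)))
    by_memory = max(1, mem_available_mib // mem_per_worker_mib)
    return max(1, min(cpus, by_memory))


def amdahl_speedup(n, parallel_fraction=0.8):
    """Amdahl's-law speedup for n workers, as a generic illustration of why
    n cores never yield an n-fold speedup (the fraction is hypothetical)."""
    return 1 / ((1 - parallel_fraction) + parallel_fraction / n)
```

For example, with a quota of 2 CPUs on an 8-CPU node, `safe_worker_count(8, "200000 100000", 8192)` returns 2; and under an 80%-parallel workload, `amdahl_speedup(8)` is only about 3.3, not 8, which is the sub-linear scaling we should document.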