Spike: real-time IDE SAST technical discovery
What
The purpose of this spike is to explore the technical implementation details for each of the following decisions, based on the summary and other decisions already made with regard to this feature.
- Run scans locally or via an API | Outcome
- Run scans on demand (i.e. via a button) or trigger them via a key binding | Outcome
- Run adapters/analyzers as a single executable or in a Docker container
- Performance characteristics for Cloud Run-based Scanners | Thread
- Scanner Gateway architecture | Thread
- Rollout Phases
- IDE Integration | Thread
Background
As part of the technical discovery required to build a minimum viable change (MVC) for the real-time SAST scanning feature, a few decisions remain to be settled. Below is a list of them, with some background on each.
We start with the local-vs.-API decision, because it's the most difficult one given the technical aspects involved in each approach and their pros and cons. The aim here is to research:
For API-based scans
Building a new API endpoint that accepts a blob of code (ideally, the contents of a single file), scans it using an existing analyzer (i.e. semgrep), and returns a scan result, and how to integrate this with the language server.
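One way to picture the API contract from the language server's side is a simple request/response pair: serialize the file contents into a scan request, and pull findings out of the JSON response. The endpoint path, payload shape, and `vulnerabilities` key below are all assumptions for illustration; the real contract is part of what this spike should define.

```python
import json

# Hypothetical request/response shapes -- the actual API contract is one of
# the things this spike is meant to settle.

def build_scan_payload(file_path: str, contents: str) -> str:
    """Serialize a single file into the (assumed) scan-request body."""
    return json.dumps({"path": file_path, "content": contents})

def extract_findings(response_body: str) -> list:
    """Pull the findings list out of an (assumed) scan-response body.

    Returns an empty list when the response carries no findings, so the
    caller can always iterate over the result.
    """
    return json.loads(response_body).get("vulnerabilities", [])

# Example round trip with a canned response:
payload = build_scan_payload("app.py", "eval(user_input)")
findings = extract_findings('{"vulnerabilities": [{"severity": "High"}]}')
```

The transport (HTTP client, auth headers, retries) and the mapping from findings to language-server diagnostics are deliberately left out here, since both depend on decisions this spike has yet to make.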
For local-based scans
Integrating an existing analyzer (i.e. semgrep) with the language server so that file contents can be passed to the analyzer and scan results returned for display to the user.
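As a sketch of the local path, the language server could shell out to the semgrep CLI with `--json` and reduce its report to the fields a diagnostic needs. The flags and JSON fields used here (`results`, `check_id`, `start.line`, `extra.message`, `extra.severity`) follow semgrep's documented CLI output; note that semgrep scans files on disk, so unsaved editor buffers would need to be written to a temporary file first. Wiring the findings into LSP diagnostics is left out.

```python
import json
import subprocess

def semgrep_command(file_path: str) -> list:
    """Build the semgrep CLI invocation for a single file.

    --config auto pulls rules from the semgrep registry; a real
    integration would likely pin a bundled ruleset instead.
    """
    return ["semgrep", "--config", "auto", "--json", "--quiet", file_path]

def parse_results(raw_json: str) -> list:
    """Reduce semgrep's JSON report to the fields a diagnostic needs."""
    findings = []
    for result in json.loads(raw_json).get("results", []):
        findings.append({
            "check_id": result["check_id"],
            "message": result["extra"]["message"],
            "severity": result["extra"]["severity"],
            "line": result["start"]["line"],
        })
    return findings

def scan_file(file_path: str) -> list:
    """Run semgrep on one file and return simplified findings."""
    proc = subprocess.run(
        semgrep_command(file_path), capture_output=True, text=True
    )
    return parse_results(proc.stdout)
```

Running the analyzer per keystroke would be too slow, so a real integration would debounce scan requests and cache results per file; those concerns are out of scope for this sketch.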
The executable-vs.-Docker question is perhaps the easiest one, but during the discussion on having the language server use the adapter pattern, several concerns were raised about whether to package the analyzer of choice (semgrep) as a single executable or to run it via Docker.
The trigger-mode decision (on demand vs. key binding) is likely to be settled as a side effect of the first one, since each trigger mode will likely work better with one of the scan modes (local vs. API).