Proposal - Run internal review-apps manually

Context

We recently discussed the possibility of running review-apps manually (see the internal Slack discussion), and I want to extend this discussion to a wider audience so that we can make an informed decision.

Why this proposal? (as of 2023-09-14)

The Engineering Productivity team is spending a lot of time and energy keeping the GKE review-apps cluster stable and up to date (see the known incidents in our review-apps RUNBOOK), potentially for many review-apps that are left unused. We do not yet have metrics on review-apps usage (see open question 2 below).

When are review-apps automatically deployed?

Review Apps are automatically deployed:
- on review-related CI changes
- on frontend config changes
- on qa/ changes
- when the pipeline:run-review-app label is set

Review-apps are currently blocking: whenever a review-app fails to deploy properly, the MR is blocked from being merged.
A review-app child pipeline takes around 40 minutes to finish.
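
As a rough illustration of the rules listed above, the decision could be sketched as follows. This is only a sketch, not the actual CI configuration: the `MergeRequest` type, the function name, and the two `example/...` path prefixes are hypothetical placeholders. Only the `qa/` prefix and the `pipeline:run-review-app` label come from this issue.

```python
from dataclasses import dataclass, field

# Label name taken from the list above.
RUN_LABEL = "pipeline:run-review-app"

# Only "qa/" is named in this proposal. The two "example/..." prefixes are
# hypothetical placeholders for "review-related CI changes" and
# "frontend config changes"; they are NOT the real rule patterns.
AUTO_DEPLOY_PREFIXES = (
    "qa/",
    "example/review-ci/",
    "example/frontend-config/",
)


@dataclass
class MergeRequest:
    """Hypothetical, minimal view of an MR: changed files and labels."""
    changed_paths: list = field(default_factory=list)
    labels: set = field(default_factory=set)


def auto_deploys_review_app(mr: MergeRequest) -> bool:
    """Return True if a review-app would be deployed automatically."""
    # The label forces a deployment regardless of which files changed.
    if RUN_LABEL in mr.labels:
        return True
    # Otherwise, any changed file under a watched prefix triggers it.
    return any(path.startswith(AUTO_DEPLOY_PREFIXES) for path in mr.changed_paths)
```

Because the deployment is blocking, every MR for which this returns True also waits on the roughly 40-minute child pipeline before it can be merged.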

When are E2E tests run?

We are currently running E2E tests on review-apps, GDK and Omnibus.

On review-apps

We are automatically running E2E tests on review-apps in these jobs:

- review-qa-smoke: always run in MRs
- review-qa-blocking-parallel: always run in MRs

On other platforms

The other platforms are GDK and Omnibus. For instance, see the GDK QA jobs:

- gdk-qa-smoke is always run
- gdk-qa-reliable is always run, but allowed to fail for the time being (this will probably change soon)
- gdk-qa-non-blocking is run manually, and allowed to fail

The rules for the Omnibus package-and-test child pipelines are fairly complex to parse, so I won't detail them here.

The main takeaway is that we are testing the application via E2E tests in various places already.

Goal

Discuss with EP/Quality how review-apps are currently used, and whether it would be a good idea to switch them from running automatically in MRs to running manually (i.e. by clicking the start-review-app-pipelines trigger job in the UI, or by setting the pipeline:run-review-app label on the MR).
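
To make the proposed behaviour concrete, here is a minimal sketch of what "manual" would mean in practice. The function name and its parameters are hypothetical and only illustrate the idea. The one identifier taken from this issue is the `pipeline:run-review-app` label, and `manual_job_played` stands for someone clicking the start-review-app-pipelines job in the UI.

```python
# Label name taken from this issue.
RUN_LABEL = "pipeline:run-review-app"


def starts_review_app(labels, manual_job_played, changed_paths=()):
    """Hypothetical rule under the proposal: only explicit opt-in starts a review-app.

    changed_paths is accepted only to show that, under the proposal,
    file changes alone would no longer trigger a deployment.
    """
    return bool(manual_job_played) or RUN_LABEL in labels
```

Under this assumption, an MR touching qa/ would no longer deploy a review-app by itself: `starts_review_app(set(), False, ["qa/foo_spec.rb"])` evaluates to `False` unless the label is set or the job is played.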

Open questions

1. How relevant are the E2E tests run against review-apps? Do we use this data a lot?
2. How many people rely on review-apps? (I don't think we have a lot of data on this, although it's probably something we could extract from the GCP logs?)

For 2., I also think having a manual job and the pipeline:run-review-app label would be sufficient in practice.

Edited Sep 14, 2023 by David Dieulivol