Proposal - Run internal review-apps manually
Context
We recently discussed the possibility of running review-apps manually (see Slack discussion - internal), and I want to extend this discussion to a wider audience so that we can make an informed decision.
Why this proposal? (as of 2023-09-14)
The Engineering Productivity team spends a lot of time and energy keeping the GKE review-apps cluster stable and up to date (see the known incidents in our review-apps RUNBOOK), potentially for many review-apps that are never used. We do not yet have metrics on review-apps usage (see question 2 under Open questions below).
When are review-apps automatically deployed?
- Review-apps are automatically deployed:
  - on review-related CI changes
  - on frontend config changes
  - on qa/ changes
  - when the pipeline:run-review-app label is set
- Review-apps are currently blocking. This means that every time a review-app fails to deploy properly, the MR is blocked from being merged.
- A review-app child pipeline takes around 40 minutes to finish.
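
The deployment triggers listed above can be sketched as a small predicate. This is only an illustration of the decision logic, not the actual CI rules: the category names (`review-ci`, `frontend-config`, `qa`), the `MergeRequest` type and the function name are hypothetical, and the real pipeline derives the equivalent information from file-path rules that are not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical category names standing in for the real file-path rules.
AUTO_DEPLOY_CATEGORIES = frozenset({"review-ci", "frontend-config", "qa"})

# Label named in this proposal.
RUN_LABEL = "pipeline:run-review-app"


@dataclass(frozen=True)
class MergeRequest:
    """Minimal stand-in for an MR: what it changes and which labels it has."""
    change_categories: frozenset = frozenset()
    labels: frozenset = frozenset()


def deploys_review_app(mr: MergeRequest) -> bool:
    """Return True when a review-app would be deployed automatically today."""
    # The label forces a deployment regardless of what the MR changes.
    if RUN_LABEL in mr.labels:
        return True
    # Otherwise, any change in one of the listed categories triggers it.
    return bool(mr.change_categories & AUTO_DEPLOY_CATEGORIES)
```

Under these assumptions, an MR that only touches documentation would not deploy a review-app unless the label is set, while any MR touching qa/ would.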
When are E2E tests run?
We are currently running E2E tests on review-apps, GDK and Omnibus.
On review-apps
We automatically run E2E tests on review-apps through these jobs:
- review-qa-smoke: always run in MRs
- review-qa-blocking-parallel: always run in MRs
On other platforms
The other platforms are GDK and Omnibus. For instance, see the GDK QA jobs:
- gdk-qa-smoke: always run
- gdk-qa-reliable: always run, but allowed to fail for the time being (this will probably change soon)
- gdk-qa-non-blocking: run manually, and allowed to fail
The rules for the Omnibus package-and-test child pipelines are fairly complex to parse, so I won't detail them here.
The main takeaway is that we are testing the application via E2E tests in various places already.
Goal
Discuss with EP/Quality how the review-apps are currently used, and whether it would be a good idea to switch them from running automatically in MRs to running manually (i.e. by clicking on the start-review-app-pipelines trigger job in the UI, or by setting the pipeline:run-review-app label on the MR).
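
As a rough sketch of the proposed behavior, assuming the start-review-app-pipelines job would be offered as a manual job on every MR unless the label is set (that scope is an assumption and is open for discussion), the decision could look like this. The function name and return values are hypothetical.

```python
# Label named in this proposal.
RUN_LABEL = "pipeline:run-review-app"


def proposed_trigger(labels: set) -> str:
    """Return how the review-app trigger job would start under the proposal.

    "automatic": the label is set, so the review-app pipeline starts by itself.
    "manual": someone has to click the trigger job in the UI.
    """
    return "automatic" if RUN_LABEL in labels else "manual"
```

For example, `proposed_trigger({"pipeline:run-review-app"})` returns `"automatic"`, and an MR without that label gets `"manual"`.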
Open questions
1. How relevant are the E2E tests run against review-apps? Do we use this data a lot?
2. How many people rely on review-apps? (I don't think we have much data on this, although we could probably extract it from the GCP logs.)
For question 2, I also think having a manual job and the pipeline:run-review-app label would be sufficient in practice.